Chen, Hung
Research Topic 1: Locating the Maximum of a Nonlinear Regression Surface
Chu, Shu-Jane; Huang, Wen-Jang; Chen, Hung
A study of asymptotic distributions of concomitants of certain order statistics.
Statist. Sinica 9 (1999), no. 3, 811--830.
Chen, Hung; Huang, Mong-Na Lo; Huang, Wen-Jang
Estimation of the location of the maximum of a regression function using extreme order statistics.
J. Multivariate Anal. 57 (1996), no. 2, 191--214.
Summary: "We consider the problem of approximating the location, x0 Î C, of a maximum
of a regression function q(x) under certain weak assumptions on q. Here C is a bounded
interval in R. A specific algorithm considered in this paper is as follows. Taking a random
sample X1 ,..., Xn from a distribution over C, we have (Xi,Yi), where Yi is the outcome of
a noisy measurement of q(Xi). Arrange the Yi's in nondecreasing order and take the average
of the r Xi's which are associated with the r largest order statistics of Yi. This average, \hat{x0},
is then used as an estimate of x0. The utility of such an algorithm with fixed r is evaluated
in this paper. To be specific, the convergence rates of \hat{x0} to x0 are derived. Those rates
will depend on the right tail of the noise distribution and the shape of q(·) near x0."
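The algorithm in this summary is short enough to state in code. A minimal Python sketch (the regression function, the noise level, and the choice r = 5 below are hypothetical illustrations, not from the paper):

    import numpy as np

    def argmax_via_concomitants(x, y, r=5):
        # Average the r X's paired with the r largest Y's
        # (the concomitants of the top r order statistics of Y).
        idx = np.argsort(y)[-r:]
        return x[idx].mean()

    # Hypothetical example: theta(x) = -(x - 0.3)^2 on C = [0, 1], noisy readings.
    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 1.0, size=200)
    y = -(x - 0.3) ** 2 + 0.05 * rng.normal(size=200)
    x0_hat = argmax_via_concomitants(x, y, r=5)  # estimate of x_0 = 0.3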
Chen, Hung
Lower rate of convergence for locating a maximum of a function.
Ann. Statist. 16 (1988), no. 3, 1330--1334.
p>1 is an odd number, F is a class of functions on [-1,1], containing a sufficiently rich
subclass of functions f with |f(p)|
≦ 1; for f
Î
F, let c(f ) be a point of
global maximum
of f . {g(·,x,t): x Î
[-1,1], t Î
R}
is a family of probability densities whose second-order
derivatives with respect to t satisfy some boundedness conditions.
A design D associates
with each f in F two sequences {Xn} and {Yn}
of random variables such that the conditional
distribution of Xn given the past does not depend
on f and that of Yn is given by g(·,Xn,f
(Xn)).
As estimates of c(f ), sequences
{Tn} are considered with each Tn a function
of Xn and Yn.
The following result is proved: For every h Î
(0,1) there is a c>0 such that, for all n and
every design D and estimate {Tn},
inff P(| Tn - c(f)| ≦ cn-(p-1)/(2p)) >=
h.
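In display form, the bound and its equivalent complement form read:

    \inf_{f \in F} P\left( |T_n - c(f)| \le c\, n^{-(p-1)/(2p)} \right) < h
    \quad\Longleftrightarrow\quad
    \sup_{f \in F} P\left( |T_n - c(f)| > c\, n^{-(p-1)/(2p)} \right) > 1 - h.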
Research Topic 2 (current research interest): Incomplete Covariate Regression
Chen, Yi-Hau; Chen, Hung
Incomplete covariates data in generalized linear models.
J. Statist. Plann. Inference 79 (1999), no. 2, 247--258.
Chen, Yi-Hau; Chen, Hung
A unified approach to regression analysis under double sampling design.
J. Royal Statist. Society Ser. B 62 (2000), 449--460.
Chen, Hung; Tseng, Chien-Cheng
A study on conditional mean imputation method for missing covariate in linear regression models.
Manuscript (2000).
Research Topic 3: Semiparametric Regression Models
Chen, Hung, and co-authors
Term Structure of Continuous-Time Interest Rates.
A very preliminary manuscript.
Chen, Hung
Asymptotically efficient estimation in semiparametric generalized linear models.
Ann. Statist. 23 (1995), no. 4, 1102--1129.
Summary: "We use the method of maximum likelihood and regression splines
to derive
estimates of the parametric and nonparametric components of semiparametric
generalized
linear models. The resulting estimators of both components are
shown to be consistent.
Also, the asymptotic theory For the estimator of the parametric component
is derived,
indicating that the parametric component can be estimated efficiently
without
undersmoothing the nonparametric component."
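As a concrete, purely illustrative rendering of the spline-plus-likelihood idea, one can expand the nonparametric component in a regression-spline basis and fit the whole model by ordinary GLM maximum likelihood. Everything below (the logistic link, the truncated-power basis, the simulated data) is a hypothetical sketch, not the paper's construction:

    import numpy as np
    import statsmodels.api as sm

    def truncated_power_basis(z, knots, degree=3):
        # Polynomial terms plus one truncated power term per knot.
        cols = [z ** j for j in range(1, degree + 1)]
        cols += [np.maximum(z - k, 0.0) ** degree for k in knots]
        return np.column_stack(cols)

    rng = np.random.default_rng(0)
    n = 500
    W = rng.normal(size=(n, 2))            # parametric covariates
    Z = rng.uniform(size=n)                # covariate entering nonparametrically
    eta = W @ np.array([1.0, -0.5]) + np.sin(2 * np.pi * Z)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

    knots = np.quantile(Z, np.linspace(0.1, 0.9, 5))
    X = sm.add_constant(np.column_stack([W, truncated_power_basis(Z, knots)]))
    fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
    beta_hat = fit.params[1:3]             # estimate of the parametric component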
Chen, Hung; Shiau, Jyh-Jen Horng
Data-driven efficient estimators for a partially linear model.
Ann. Statist. 22 (1994), no. 1, 211--237.
The authors showed [J. Statist. Plann. Inference 27 (1991), no. 2, 187--201] that a two-stage spline smoothing method and the partial regression method lead to efficient estimators for the parametric component of a partially linear model when the smoothing parameter tends to zero at an appropriate rate. In this paper, they study the asymptotic behavior of these estimators when the smoothing parameter is chosen either by the generalized cross validation (GCV) method or by the Mallows C_L criterion. Under some regularity conditions, the estimated parametric component is asymptotically normal with the usual parametric rate of convergence for both spline estimation methods.
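For a penalized least-squares smoother with hat matrix A(λ) = B(BᵀB + λΩ)⁻¹Bᵀ (B a spline basis matrix, Ω a penalty matrix), the GCV choice of the smoothing parameter can be sketched as follows; B, Ω, and the grid are placeholders, and the paper's data-driven estimators are considerably more delicate:

    import numpy as np

    def gcv(y, B, Omega, lam):
        # GCV(lam) = n * RSS(lam) / (n - tr A(lam))^2 for the linear smoother
        # with hat matrix A(lam) = B (B'B + lam * Omega)^{-1} B'.
        n = len(y)
        A = B @ np.linalg.solve(B.T @ B + lam * Omega, B.T)
        rss = np.sum((y - A @ y) ** 2)
        return n * rss / (n - np.trace(A)) ** 2

    def choose_lambda(y, B, Omega, grid):
        # Pick the smoothing parameter minimizing the GCV score over a grid.
        return min(grid, key=lambda lam: gcv(y, B, Omega, lam))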
Chen, Hung; Chen, Keh-Wei
Selection of the splined variables and convergence rates in a partial spline model.
Canad. J. Statist. 19 (1991), no. 3, 323--339.
This paper belongs to a relatively new stream of work concerning inference in semiparametric models; that is, observations are made according to the scheme
(i) Y = m(X) + ε,
where Y ∈ R is the dependent variable, X ∈ R^d is the independent (vector) variable, and ε is an unobservable noise with mean 0 and finite variance. A data set {(y_i, x_{1i}, ..., x_{di}), 1 ≤ i ≤ n} is then used to determine the unknown regression function m(·). The authors assume the following semiparametric model for m(·):
(ii) m(X) = W^T β + θ(Z),
where X = (X_1, ..., X_d)^T is partitioned as X = (W^T, Z^T)^T. In (ii), θ(·) denotes an unknown smooth function, while β is the vector of unknown constant parameters. In the general framework of (ii), the authors consider the problem of estimating not only θ and β but also of deciding how to partition the vector X into the subvectors W and Z. More precisely, let A be a subset of {1, 2, ..., d} and let Z_A and W_A denote the column vectors consisting of those X_i, i ∈ A (the splined variables), and X_i, i ∉ A, respectively. The problem is then to "recover" the proper subset A, the vector β_A and the function θ(Z_A) in the model
(iii) Y = W_A^T β_A + θ(Z_A) + ε.
The estimation procedure proposed in the paper can be summarized as follows (a code sketch is given after the results below):
(1) For a given index set A, a tensor-product polynomial spline of degree ν, with K_n knots per coordinate and hence K_n^{|A|} knots in all, is used to approximate θ(Z_A), where |A| denotes the cardinality of A.
(2) The method of least squares (LSM) is used to fit the function W_A^T β_A + θ(Z_A) to the data. Note that, when θ(Z_A) is approximated as in step (1), LSM involves d − |A| + (K_n + ν)^{|A|} parameters.
(3) The structural parameters (ν, K_n) are determined through an adjusted sum of squares, which is derived from the principle of unbiased risk estimation (the FPE criterion proposed by Akaike is used at this step).
(4) The index set is found as the set A for which the adjusted residual sum of squares is a minimum.
Under suitable regularity conditions, which are too technical to be described here in detail, the following results are proved in the paper:
(a) The estimator obtained by applying the above steps attains the optimal convergence rate in the sense of Stone.
(b) If A_0 denotes the correct index set in (iii), while \hat{A} is the corresponding set obtained by the proposed method, then P{\hat{A} = A_0} → 1 as n tends to infinity.
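A minimal sketch of steps (2)-(4), with A running over candidate sets of splined variables; the basis constructor and the exhaustive search are illustrative placeholders, not the paper's implementation:

    import numpy as np
    from itertools import combinations

    def fpe(y, X):
        # Akaike's FPE: (RSS / n) * (n + p) / (n - p), p = number of parameters.
        n, p = X.shape
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ beta) ** 2)
        return (rss / n) * (n + p) / (n - p)

    def select_index_set(y, X_all, spline_basis):
        # For each candidate A, fit the complement linearly and Z_A by a spline,
        # then keep the A minimizing the adjusted residual sum of squares.
        d = X_all.shape[1]
        best_score, best_A = np.inf, None
        for k in range(1, d + 1):
            for A in combinations(range(d), k):
                lin = [j for j in range(d) if j not in A]  # linear part W_A
                design = np.column_stack([X_all[:, lin],
                                          spline_basis(X_all[:, list(A)])])
                score = fpe(y, design)
                if score < best_score:
                    best_score, best_A = score, A
        return best_A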
Chen, Hung; Shiau, Jyh-Jen Horng
A two-stage spline smoothing method for partially linear models.
J. Statist. Plann. Inference 27 (1991), no. 2, 187--201.
Summary: "Rice (1986) showed that the partial spline estimate of the
parametric component
in a semiparametric regression model is generally biased and it is
necessary to undersmooth
the nonparametric component to force the bias to be negligible with
respect to the standard
error. We propose a two-stage spline smoothing method for estimating the parametric and nonparametric components in a semiparametric model. By appropriately choosing rates for
the smoothing parameters, we show that the parametric component can be estimated at the parametric rate with the new estimate without undersmoothing the nonparametric component.
We also show that
the same result holds for the partial regression estimate proposed independently by Denby (1986)
and Speckman (1988).
Asymptotic normality results for the parametric component are also
shown for both estimates.
Furthermore, we associate these estimates with Wellner's (1986) efficient
scores methods."
Chen, Hung
Convergence rates for parametric components in a partly linear model.
Ann. Statist. 16 (1988), no. 1, 136--146.
A regression model with random explanatory variables which is partly nonlinear is considered. Let y be the real response variable, x a k-dimensional random variable, and t a one-dimensional random variable. Then y = x^T β + g(t) + ε, where β is an unknown k-dimensional parameter vector, g is an unknown function from a given class of real smooth functions, and ε is an unobservable random error having mean zero and variance σ^2. The aim is to estimate β and g on the basis of data (y_i, x_i, t_i), i = 1, ..., n. Least squares estimation is considered, where piecewise polynomials \hat{g} are used to estimate g. Under some assumptions on the degree of smoothness of g, on the distribution of t, and on the conditional distribution of x given t, the author studies the asymptotic behaviour of the least squares estimators. One of the results establishes convergence in distribution of n^{1/2}(\hat{β} − β) to the normal distribution with mean zero and covariance matrix σ^2 Σ^{-1}, where Σ is the difference of the covariance matrix of x and the covariance matrix of E(x | t).
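In display form, the limit result reads:

    n^{1/2} (\hat{\beta} - \beta) \overset{d}{\longrightarrow} N\left( 0, \sigma^2 \Sigma^{-1} \right),
    \qquad \Sigma = \operatorname{Cov}(x) - \operatorname{Cov}\left( E(x \mid t) \right).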
Research Topic 4: Curve Fitting
Chen, Hung
Summary: "Let (X,Y)
Î
[0,1]d R be a random
vector and let the conditional
distribution of Y given X= x have mean q(
x) and satisfy a suitable moment condition.
It is assumed that the density function of X is
bounded away from zero and infinity on
[0,1]d.
Suppose that q(x) is known to be
a general d-dimensional smooth function of x only.
Consider an estimator of q having the form
of a polynomial spline with simple knots at
equally spaced grids over [0,1]d, where the coefficients
are determined by the method of
least squares based on a random sample of size n from the distribution of (X,Y). It is shown
that this estimator achieves the optimal rates of convergence for nonparametric regression estimation as defined by C. J. Stone [Ann. Statist. 10 (1982), no. 4, 1040--1053] under L2 norm
and sup norm, respectively."
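For d = 1 this estimator is a few lines of code. A sketch with a truncated-power spline basis and equally spaced interior knots (the degree and knot count are arbitrary illustrative choices):

    import numpy as np

    def spline_basis(x, knots, degree=3):
        # Truncated power basis for a polynomial spline with simple knots.
        cols = [x ** j for j in range(degree + 1)]
        cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
        return np.column_stack(cols)

    def ls_spline_fit(x, y, num_knots=10, degree=3):
        # Least-squares spline coefficients over equally spaced knots on [0, 1];
        # returns the fitted function theta_hat.
        knots = np.linspace(0.0, 1.0, num_knots + 2)[1:-1]
        B = spline_basis(x, knots, degree)
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        return lambda xs: spline_basis(xs, knots, degree) @ coef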
Chen, Hung
Estimation of a projection-pursuit type regression model.
Ann. Statist. 19 (1991), no. 1, 142--157.
Summary: "Since the pioneering work of Friedman and Stuetzle in 1981,
projection-pursuit
algorithms have attracted increasing attention. This is mainly
due to their potential for
overcoming or reducing difficulties arising in nonparametric regression models associated with
the so-called curse of dimensionality---that is, the amount of data required to avoid an
unacceptably large variance increasing rapidly with dimensionality. Subsequent work has, however, uncovered a dependence on dimensionality for projection-pursuit regression models. Here we propose a projection-pursuit-type estimation scheme, with two additional constraints imposed, for which the rate of convergence of the estimator is shown to be independent of the dimensionality. Let ( X,Y) be a random vector such that X = (X1,...,Xd)T ranges over Rd. The conditional mean of Y given X= x is assumed to be the sum of no more than d general smooth functions of biT x, where bi Î Sd-1, the unit sphere in Rd centered at the origin. A least-squares polynomial spline and the final prediction error criterion are used to fit the model to a random sample of size n from the distribution of (X,Y). Under appropriate conditions, the rate of convergence of the proposed estimator is independent of d."
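One stage of a projection-pursuit fit can be sketched by searching the unit sphere S^{d-1} for the direction whose fitted ridge function leaves the smallest residual sum of squares. The random search and the spline details below are illustrative stand-ins for the paper's least-squares spline plus FPE machinery:

    import numpy as np

    def ridge_rss(proj, y, num_knots=5, degree=3):
        # RSS of a least-squares spline of y on the 1-D projection proj.
        knots = np.quantile(proj, np.linspace(0.1, 0.9, num_knots))
        cols = [proj ** j for j in range(degree + 1)]
        cols += [np.maximum(proj - k, 0.0) ** degree for k in knots]
        B = np.column_stack(cols)
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        return np.sum((y - B @ coef) ** 2)

    def best_direction(X, y, num_candidates=500, seed=0):
        # Crude random search over unit vectors b in S^{d-1}.
        rng = np.random.default_rng(seed)
        cands = rng.normal(size=(num_candidates, X.shape[1]))
        cands /= np.linalg.norm(cands, axis=1, keepdims=True)
        rss = [ridge_rss(X @ b, y) for b in cands]
        return cands[int(np.argmin(rss))]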