Chen, Hung

Research Topic 1: Locating Maximum of a Nonlinear Regression Surface

Chu, Shu-Jane; Huang, Wen-Jang; Chen, Hung
**A study of asymptotic distributions of concomitants of certain order statistics.**
*Statist. Sinica* **9** (1999), no. 3, 811--830.

Chen, Hung; Huang, Mong-Na Lo; Huang, Wen-Jang
**Estimation of the location of the maximum of a regression function using extreme order statistics.**
*J. Multivariate Anal.* **57** (1996), no. 2, 191--214.

Summary: "We consider the problem of approximating the location, x_{0} ∈ C, of a maximum of a regression function q(x) under certain weak assumptions on q. Here C is a bounded interval in **R**. A specific algorithm considered in this paper is as follows. Taking a random sample X_{1},..., X_{n} from a distribution over C, we have (X_{i}, Y_{i}), where Y_{i} is the outcome of a noisy measurement of q(X_{i}). Arrange the Y_{i}'s in nondecreasing order and take the average of the r X_{i}'s which are associated with the r largest order statistics of Y_{i}. This average, \hat{x_{0}}, is then used as an estimate of x_{0}. The utility of such an algorithm with fixed r is evaluated in this paper. To be specific, the convergence rates of \hat{x_{0}} to x_{0} are derived. Those rates will depend on the right tail of the noise distribution and the shape of q(·) near x_{0}."
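As a quick illustration, the averaging step above can be sketched in a few lines. This is a toy setup, not the paper's experiment: the quadratic q, Gaussian noise level, and the choices of n and r are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def q(x):
    # hypothetical regression function with maximum at x0 = 0.3
    return -(x - 0.3) ** 2

n, r = 5000, 10
x = rng.uniform(0.0, 1.0, n)          # random sample over C = [0, 1]
y = q(x) + rng.normal(0.0, 0.05, n)   # noisy measurements of q(X_i)

# average the r X_i's paired with the r largest order statistics of Y_i
top = np.argsort(y)[-r:]
x0_hat = x[top].mean()
```

With fixed r, the estimate concentrates near x_{0} as n grows, at a rate governed (as the summary notes) by the noise's right tail and the curvature of q near x_{0}.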

Chen, Hung
**Lower rate of convergence for locating a maximum of a function.**
*Ann. Statist.* **16** (1988), no. 3, 1330--1334.

Suppose p>1 is an odd number, **F** is a class of functions on [-1,1] containing a sufficiently rich subclass of functions *f* with |f^{(p)}| ≦ 1; for *f* ∈ **F**, let c(*f*) be a point of global maximum of *f*. {g(·,x,t): x ∈ [-1,1], t ∈ **R**} is a family of probability densities whose second-order derivatives with respect to t satisfy some boundedness conditions. A design *D* associates with each *f* in **F** two sequences {X_{n}} and {Y_{n}} of random variables such that the conditional distribution of X_{n} given the past does not depend on *f* and that of Y_{n} is given by g(·, X_{n}, *f*(X_{n})). As estimates of c(*f*), sequences {T_{n}} are considered with each T_{n} a function of X_{n} and Y_{n}.

The following result is proved: For every h ∈ (0,1) there is a c>0 such that, for all n, every design *D* and estimate {T_{n}},

inf_{f} P(|T_{n} - c(*f*)| ≦ cn^{-(p-1)/(2p)}) ≦ h.

Research Topic 2 (current research interest): Incomplete Covariate Regression

Chen, Yi-Hau; Chen, Hung
**Incomplete covariates data in generalized linear models.**
*J. Statist. Plann. Inference* **79** (1999), no. 2, 247--258.

Chen, Yi-Hau; Chen, Hung
**A unified approach to regression analysis under double sampling design.**
*J. Royal Statist. Society Ser. B* **62** (2000), 449--460.

Chen, Hung; Tseng, Chien-Cheng
**A study on conditional mean imputation method for missing covariate in linear regression models.**
Manuscript (2000).

Research Topic 3: Semiparametric Regression Models

Chen, Hung and Co-authors
**Term Structure of Continuous-Time Interest Rates.**
A Very Preliminary Manuscript.

Chen, Hung
**Asymptotically efficient estimation in semiparametric generalized linear models.**
*Ann. Statist.* **23** (1995), no. 4, 1102--1129.

Summary: "We use the method of maximum likelihood and regression splines to derive estimates of the parametric and nonparametric components of semiparametric generalized linear models. The resulting estimators of both components are shown to be consistent. Also, the asymptotic theory for the estimator of the parametric component is derived, indicating that the parametric component can be estimated efficiently without undersmoothing the nonparametric component."

Chen, Hung; Shiau, Jyh Jen Horng
**Data-driven efficient estimators for a partially linear model.**
*Ann. Statist.* **22** (1994), no. 1, 211--237.

The authors showed [J. Statist. Plann. Inference 27 (1991), no. 2, 187--201] that a two-stage spline smoothing method and the partial regression method lead to efficient estimators for the parametric component of a partially linear model when the smoothing parameter tends to zero at an appropriate rate. In this paper, they study the asymptotic behavior of these estimators when the smoothing parameter is chosen either by the generalized cross validation (GCV) method or by the Mallows C_{L} criterion. Under some regularity conditions, the estimated parametric component is asymptotically normal with the usual parametric rate of convergence for both spline estimation methods.
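To give a feel for data-driven smoothing-parameter selection of the kind studied here, the following is a minimal GCV sketch. A ridge-penalized polynomial basis serves as a crude stand-in for spline smoothing; the target function, penalty, and grid are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
t = np.sort(rng.uniform(0.0, 1.0, n))
truth = np.sin(2.0 * np.pi * t)
y = truth + rng.normal(0.0, 0.3, n)

# polynomial basis; ridge penalty as a crude stand-in for a smoothing spline
B = np.column_stack([t ** k for k in range(8)])
D = np.eye(B.shape[1])
D[0, 0] = 0.0                           # leave the intercept unpenalized

def fit(lam):
    return B @ np.linalg.solve(B.T @ B + lam * D, B.T @ y)

def gcv(lam):
    # GCV(lam) = n * RSS(lam) / (n - tr(S_lam))^2, S_lam the smoother matrix
    S = B @ np.linalg.solve(B.T @ B + lam * D, B.T)
    resid = y - S @ y
    return n * float(resid @ resid) / (n - np.trace(S)) ** 2

grid = 10.0 ** np.arange(-8.0, 3.0)
lam_hat = min(grid, key=gcv)
```

GCV trades off residual sum of squares against the effective degrees of freedom tr(S_lam), so the selected lam_hat avoids both gross under- and oversmoothing.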

Chen, Hung; Chen, Keh-Wei
**Selection of the splined variables and convergence rates in a partial spline model.**
*Canad. J. Statist.* **19** (1991), no. 3, 323--339.

This paper belongs to a relatively new stream of works concerning inference in semiparametric models, i.e., observations are made according to the scheme

(i) Y = m(X) + e, where Y ∈ **R** is the dependent variable, X ∈ **R**^{d} is the independent (vector) variable, and e is an unobservable noise with mean 0 and finite variance. A data set {(y_{i}, x_{1i},..., x_{di}), 1 ≦ i ≦ n} is then used to determine the unknown regression function m(·). The authors assume the following semiparametric model for m(·):

(ii) m(X) = W^{T}·b + q(Z), where X = (X_{1},...,X_{d})^{T} is partitioned as X = (W^{T}, Z^{T})^{T}.

In (ii), q(·) denotes an unknown smooth function, while b is the vector of unknown constant parameters. In the general framework of (ii), the authors consider the problem of estimating not only q and b; the statistician also aims to decide how to partition the vector X into subvectors W and Z. More precisely, let A be a subset of {1,2,...,d} and let W_{A} and Z_{A} denote column vectors with those X_{i}, i ∈ A, and X_{i}, i ∉ A, respectively. Now, the problem is to "recover" the proper index set A, the vector b_{A} and the function q(Z_{A}) in the model

(iii) Y = W_{A}^{T}b_{A} + q(Z_{A}) + e.

The estimation procedure proposed in the paper can be summarized as follows:

(1) For a given index set A, a tensor product polynomial spline with degree n and with K_{n}^{|A|} knots is used to approximate q(Z_{A}), where |A| denotes the cardinality of A.

(2) The method of least squares (LSM) is used to fit the function W_{A}^{T}·b_{A} + q(Z_{A}) to the data. Note that, when q(Z_{A}) is approximated as in Step 1, LSM involves d - |A| + (K_{n} + n)^{|A|} parameters.

(3) The structural parameters (n, K_{n}) are determined through an adjusted sum of squares, which is derived from the principle of unbiased risk estimation (the FPE criterion proposed by Akaike is used at this step).

(4) The index set is found as the set A for which the adjusted residual sum of squares is a minimum.
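The selection step can be illustrated with a stripped-down sketch: d = 2, a plain polynomial basis in place of a tensor-product spline, and the FPE-style adjustment RSS·(n+p)/(n-p). The data-generating model and all tuning choices below are illustrative assumptions, not the paper's.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n, d = 200, 2
X = rng.uniform(-1.0, 1.0, (n, d))
# true model: linear in X[:, 0], smooth nonlinear in X[:, 1], so A0 = (0,)
y = 2.0 * X[:, 0] + np.sin(np.pi * X[:, 1]) + rng.normal(0.0, 0.1, n)

def adjusted_rss(A):
    # linear columns for indices in A, a polynomial basis (spline stand-in)
    # for the remaining coordinates Z_A
    cols = [np.ones(n)]
    if A:
        cols.append(X[:, list(A)])
    for j in range(d):
        if j not in A:
            cols.append(np.column_stack([X[:, j] ** k for k in range(1, 6)]))
    M = np.column_stack(cols)
    resid = y - M @ np.linalg.lstsq(M, y, rcond=None)[0]
    rss, p = float(resid @ resid), M.shape[1]
    return rss * (n + p) / (n - p)      # Akaike's FPE-style adjustment

candidates = [A for k in range(1, d + 1) for A in combinations(range(d), k)]
A_hat = min(candidates, key=adjusted_rss)
```

Index sets that force the nonlinear coordinate into the linear part leave a large residual sum of squares, while the adjustment term penalizes needlessly rich fits, so the minimizer recovers the correct partition.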

Under suitable regularity conditions, which are too technical to be described here in detail, the following results are proved in the paper:

(a) The estimator obtained by applying the above steps attains the optimal convergence rate in the sense of Stone.

(b) If A_{0} denotes the correct index set, which selects independent variables in (iii), while \hat{A} is the corresponding set obtained by the proposed method, then P{\hat{A} = A_{0}} tends to 1 as n tends to infinity.

Chen, Hung; Shiau, Jyh-Jen Horng
**A two-stage spline smoothing method for partially linear models.**
*J. Statist. Plann. Inference* **27** (1991), no. 2, 187--201.

Summary: "Rice (1986) showed that the partial spline estimate of the parametric component in a semiparametric regression model is generally biased and it is necessary to undersmooth the nonparametric component to force the bias to be negligible with respect to the standard error. We propose a two-stage spline smoothing method for estimating the parametric and nonparametric components in a semiparametric model. By appropriately choosing rates for the smoothing parameters, we show that the parametric component can be estimated at the parametric rate with the new estimate without undersmoothing the nonparametric component. We also show that the same result holds for the partial regression estimate proposed independently by Denby (1986) and Speckman (1988). Asymptotic normality results for the parametric component are also shown for both estimates. Furthermore, we associate these estimates with Wellner's (1986) efficient scores methods."

Chen, Hung
**Convergence rates for parametric components in a partly linear model.**
*Ann. Statist.* **16** (1988), no. 1, 136--146.

A regression model with random explanatory variables which is partly nonlinear is considered. Let y be the real response variable, x be a k-dimensional random variable, and t be a one-dimensional random variable. Then y = x'b + g(t) + e, where b is an unknown k-dimensional parameter vector, g is an unknown function from a given class of real smooth functions, and e is an unobservable random error having mean zero and variance s^{2}. The aim is to estimate b and g on the basis of data (y_{i}, x_{i}, t_{i}), i = 1,...,n. Least squares estimation is considered, where piecewise polynomials ĝ are used to estimate g. Under some assumptions on the degree of smoothness of g, on the distribution of t, and on the conditional distribution of x given t, the author studies the asymptotic behaviour of the least squares estimators. One of the results establishes convergence in distribution of n^{1/2}(\hat{b} - b) to the normal distribution with mean zero and covariance matrix s^{2} S^{-1}, where S is the difference of the covariance of x and the covariance of E(x|t).
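A minimal simulation of this setup follows, with a global polynomial basis in t as a simple stand-in for the paper's piecewise polynomials; the model ingredients (k = 1, the cosine g, noise levels) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
t = rng.uniform(0.0, 1.0, n)
x = t + rng.normal(0.0, 0.5, n)         # x and t are dependent
beta = 1.5
g = np.cos(2.0 * np.pi * t)             # unknown smooth function g
y = x * beta + g + rng.normal(0.0, 0.2, n)

# joint least squares over (x, basis in t); the basis absorbs g(t)
M = np.column_stack([x] + [t ** k for k in range(8)])
beta_hat = np.linalg.lstsq(M, y, rcond=None)[0][0]
```

Because x and t are correlated, only the part of x not explained by t carries information about b, which is why the limiting covariance involves S = Cov(x) - Cov(E(x|t)).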

Research Topic 4: Curve Fitting

Chen, Hung

Summary: "Let (**X**, Y) ∈ [0,1]^{d} × **R** be a random vector and let the conditional distribution of Y given **X** = **x** have mean q(**x**) and satisfy a suitable moment condition. It is assumed that the density function of **X** is bounded away from zero and infinity on [0,1]^{d}. Suppose that q(**x**) is known to be a general d-dimensional smooth function of **x** only. Consider an estimator of q having the form of a polynomial spline with simple knots at equally spaced grids over [0,1]^{d}, where the coefficients are determined by the method of least squares based on a random sample of size n from the distribution of (**X**, Y). It is shown that this estimator achieves the optimal rates of convergence for nonparametric regression estimation as defined by C. J. Stone [Ann. Statist. 10 (1982), no. 4, 1040--1053] under the L_{2} norm and sup norm, respectively."
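A one-dimensional version of such a spline least-squares estimator can be sketched as follows, using a linear truncated-power basis with equally spaced simple knots; the target function, knot count, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(0.0, 1.0, n)

def theta(u):
    # smooth regression function to be recovered
    return np.sin(2.0 * np.pi * u)

y = theta(x) + rng.normal(0.0, 0.3, n)

# linear spline basis: 1, u, and (u - k)_+ for equally spaced interior knots k
knots = np.linspace(0.0, 1.0, 11)[1:-1]

def basis(u):
    return np.column_stack([np.ones_like(u), u] +
                           [np.maximum(u - k, 0.0) for k in knots])

coef = np.linalg.lstsq(basis(x), y, rcond=None)[0]

def theta_hat(u):
    return basis(u) @ coef
```

Letting the number of knots grow with n at the appropriate rate is what balances approximation bias against estimation variance and yields Stone's optimal rates.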

Chen, Hung
**Estimation of a projection-pursuit type regression model.**
*Ann. Statist.* **19** (1991), no. 1, 142--157.

Summary: "Since the pioneering work of Friedman and Stuetzle in 1981, projection-pursuit algorithms have attracted increasing attention. This is mainly due to their potential for overcoming or reducing difficulties arising in nonparametric regression models associated with the so-called curse of dimensionality---that is, the amount of data required to avoid an unacceptably large variance increases rapidly with dimensionality. Subsequent work has, however, uncovered a dependence on dimensionality for projection-pursuit regression models. Here we propose a projection-pursuit-type estimation scheme, with two additional constraints imposed, for which the rate of convergence of the estimator is shown to be independent of the dimensionality. Let (**X**, Y) be a random vector such that **X** = (X_{1},...,X_{d})^{T} ranges over **R**^{d}. The conditional mean of Y given **X** = **x** is assumed to be the sum of no more than d general smooth functions of b_{i}^{T}**x**, where b_{i} ∈ S^{d-1}, the unit sphere in **R**^{d} centered at the origin. A least-squares polynomial spline and the final prediction error criterion are used to fit the model to a random sample of size n from the distribution of (**X**, Y). Under appropriate conditions, the rate of convergence of the proposed estimator is independent of d."
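A toy version of fitting a single ridge function of b^{T}**x** combines a least-squares polynomial with a search over directions on the unit circle (d = 2). The direction grid, polynomial degree, and data model are illustrative assumptions; the paper's scheme uses splines and the FPE criterion rather than this brute-force search.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 1000, 2
X = rng.uniform(-1.0, 1.0, (n, d))
b_true = np.array([np.cos(0.7), np.sin(0.7)])   # direction on the unit circle
y = np.sin(2.0 * X @ b_true) + rng.normal(0.0, 0.1, n)

def ridge_rss(angle, deg=5):
    # least-squares polynomial fit of y on the one-dimensional projection X @ b
    b = np.array([np.cos(angle), np.sin(angle)])
    u = X @ b
    B = np.column_stack([u ** k for k in range(deg + 1)])
    resid = y - B @ np.linalg.lstsq(B, y, rcond=None)[0]
    return float(resid @ resid)

angles = np.linspace(0.0, np.pi, 181)
angle_hat = angles[np.argmin([ridge_rss(a) for a in angles])]
```

Because the fit along each candidate direction is always one-dimensional, the smoothing step never faces the full d-dimensional problem, which is the intuition behind the dimension-independent rate.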