
Learning the regularity of the curves in FDA

with applications

Steven Golovkine · Nicolas Klutchnikoff · Valentin Patilea

53es Journées de Statistique de la SFdS

June 16, 2022

1 / 14

Functional Data Analysis



[Figure: temperature and precipitation curves]
Canadian weather dataset (Ramsay and Silverman, 2005).
2 / 14

Functional Data Analysis

[Figure: household power consumption curves]
Household Active Power Consumption (UC Irvine ML Repository).
3 / 14

Sample path regularity — a key concept

  • Data: noisy measurements of the curves at (possibly random) discrete time points

  • Recover the curves

    A nonparametric regression problem where optimality depends on the curve regularity;

    see, e.g., (Tsybakov, 2009).

  • Estimate the mean and the covariance

    The optimal rates depend on the curve regularity;

    see, e.g., (Cai and Yuan, 2010), (Cai and Yuan, 2011), (Cai and Yuan, 2016).

  • Fit usual predictive models (linear...)

    The optimal convergence rates depend on the predictor curve regularity;

    see, e.g., (Hall and Horowitz, 2007).

4 / 14

Sample path regularity


  • Let $O$ be a neighborhood of $t$.

  • For $H_t \in (0,1)$, $L_t > 0$ and $u, v \in O$, assume that the stochastic process $X$ satisfies the condition
    $$\mathbb{E}\big[(X_u - X_v)^2\big] \approx L_t^2\, |u - v|^{2H_t}.$$

  • $H_t$ is called the local regularity of the process $X$ on $O$.

  • Our parameter $H_t$ is related to the Hurst exponent.

  • The definition extends to smoother sample paths by using the derivatives of $X_u$ and $X_v$ instead.

5 / 14
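As a sanity check on this definition, fractional Brownian motion with Hurst index $H$ satisfies $\mathbb{E}[(X_u - X_v)^2] = |u - v|^{2H}$ exactly, so its local regularity is $H$ at every point (with $L_t = 1$). A minimal simulation sketch, assuming NumPy; the function names are illustrative, not from the paper:

```python
import numpy as np

def fbm_covariance(times, H):
    """Covariance of fractional BM: 0.5 * (s^2H + t^2H - |s - t|^2H)."""
    s = times[:, None]
    t = times[None, :]
    return 0.5 * (s**(2 * H) + t**(2 * H) - np.abs(s - t)**(2 * H))

def simulate_fbm(times, H, n_paths, rng):
    """Exact simulation via Cholesky factorization of the covariance."""
    cov = fbm_covariance(times, H)
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(len(times)))
    return rng.standard_normal((n_paths, len(times))) @ L.T

rng = np.random.default_rng(0)
times = np.linspace(0.01, 1.0, 50)   # avoid t = 0, where the path is degenerate
H = 0.4
X = simulate_fbm(times, H, n_paths=5000, rng=rng)

# Monte-Carlo check of E[(X_u - X_v)^2] = |u - v|^(2H) at two fixed points.
u, v = 10, 30
emp = np.mean((X[:, u] - X[:, v])**2)
theo = np.abs(times[u] - times[v])**(2 * H)
print(emp, theo)  # the two values agree up to Monte-Carlo error
```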

Local regularity vs. eigenvalue decrease rate


  • The (local) regularity is related to the decrease rate of the eigenvalues of the covariance operator of the process.

  • Under some conditions, if
    $$\lambda_j \asymp j^{-\nu}, \qquad j \geq 1,$$
    for some $\nu > 1$, then
    $$2(H + \delta) = \nu - 1,$$
    when the sample paths admit derivatives up to order $\delta$ and $H$ is the regularity of the derivatives of order $\delta$.

  • The value of $\nu$ is usually assumed to be given!

6 / 14
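The relation $2(H + \delta) = \nu - 1$ can be checked numerically on standard Brownian motion, for which $H = 1/2$ and $\delta = 0$, so $\nu$ should equal $2$. A sketch, assuming NumPy; the discretization scheme is an illustrative choice, not the paper's:

```python
import numpy as np

n = 500
grid = np.linspace(0.0, 1.0, n + 1)[1:]               # drop t = 0
cov = np.minimum(grid[:, None], grid[None, :])        # BM covariance min(s, t)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1] / n  # quadrature scaling

# Estimate the decay exponent nu from the slope of log lambda_j vs. log j.
j = np.arange(5, 51)                                  # skip the first few modes
slope, _ = np.polyfit(np.log(j), np.log(eigvals[j - 1]), 1)
nu_hat = -slope
print(nu_hat)  # close to 2, consistent with 2(H + delta) = nu - 1 = 1
```

The known spectrum of the Brownian covariance operator is $\lambda_j = 1/((j - 1/2)\pi)^2$, which indeed decays like $j^{-2}$.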

Observed data

  • Let $X^{(1)}, \ldots, X^{(N)}$ be an independent sample of a random process $X = (X(t) : t \in [0,1])$ with continuous trajectories.

  • For each $1 \leq n \leq N$:

    $M_n$ is a random positive integer.

    $T_m^{(n)} \in [0,1]$, $1 \leq m \leq M_n$, are the (random) observation times (design points) for the curve $X^{(n)}$.

  • The observations are $(Y_m^{(n)}, T_m^{(n)})$, $1 \leq m \leq M_n$, $1 \leq n \leq N$, where
    $$Y_m^{(n)} = X^{(n)}(T_m^{(n)}) + \sigma\big(T_m^{(n)}, X^{(n)}(T_m^{(n)})\big)\, e_m^{(n)}.$$

  • The $e_m^{(n)}$ are independent copies of a standardized error term.

7 / 14

Estimation

  • For $s, t \in O$, let
    $$\theta(s, t) = \mathbb{E}\big[(X_t - X_s)^2\big] \approx L^2 |t - s|^{2H_{t_0}}.$$

  • Let $t_1$ and $t_3$ be such that $[t_1, t_3] \subset O$, and denote by $t_2$ the midpoint of $[t_1, t_3]$.

  • A natural proxy of $H_{t_0}$ is given by
    $$\frac{\log \theta(t_1, t_3) - \log \theta(t_1, t_2)}{2 \log 2}, \qquad \text{if } t_3 - t_1 \text{ is small}.$$

8 / 14

  • An estimator of $H_{t_0}$ is given by
    $$\widehat{H}_{t_0} = \frac{\log \widehat{\theta}(t_1, t_3) - \log \widehat{\theta}(t_1, t_2)}{2 \log 2}, \qquad \text{if } t_3 - t_1 \text{ is small},$$
    where, given a nonparametric estimator $\widetilde{X}_t$ of $X_t$,
    $$\widehat{\theta}(s, t) = \frac{1}{N} \sum_{n=1}^{N} \big(\widetilde{X}_t^{(n)} - \widetilde{X}_s^{(n)}\big)^2.$$

8 / 14
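The log-ratio estimator of $H_{t_0}$ can be tried on simulated fractional Brownian motion, where the true local regularity is the Hurst index. In this sketch the exact path values stand in for the smoothed $\widetilde{X}_t$ (i.e., perfect smoothing is assumed), and the function names are illustrative, not the paper's:

```python
import numpy as np

def fbm_paths(times, H, n_paths, rng):
    """Exact fBm simulation via Cholesky factorization of the covariance."""
    s, t = times[:, None], times[None, :]
    cov = 0.5 * (s**(2 * H) + t**(2 * H) - np.abs(s - t)**(2 * H))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(len(times)))
    return rng.standard_normal((n_paths, len(times))) @ L.T

def estimate_H(X, i1, i2, i3):
    """Log-ratio estimator of the local regularity, t2 midpoint of [t1, t3]."""
    theta_13 = np.mean((X[:, i3] - X[:, i1])**2)
    theta_12 = np.mean((X[:, i2] - X[:, i1])**2)
    return (np.log(theta_13) - np.log(theta_12)) / (2 * np.log(2))

rng = np.random.default_rng(1)
times = np.linspace(0.2, 0.3, 41)   # small window [t1, t3] around t2 = 0.25
X = fbm_paths(times, H=0.6, n_paths=5000, rng=rng)
H_hat = estimate_H(X, i1=0, i2=20, i3=40)
print(H_hat)  # close to the true Hurst index 0.6
```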

Estimator of the mean function

  • For any $t \in \mathcal{T}$, let
    $$w_n(t; h) = 1 \quad \text{if } \sum_{m=1}^{M_n} \mathbf{1}\{|T_m^{(n)} - t| \leq h\} \geq k_0, \qquad n \in \{1, \ldots, N\},$$
    and $w_n(t; h) = 0$ otherwise.

  • Then,
    $$W_N(t, h) = \sum_{n=1}^{N} w_n(t, h).$$

  • The estimator of the mean function is
    $$\widehat{\mu}_N(t; h) = \frac{1}{W_N(t, h)} \sum_{n=1}^{N} w_n(t, h)\, \widehat{X}^{(n)}(t), \qquad t \in \mathcal{T}.$$

  • An adaptive optimal bandwidth is
    $$\widehat{h}_\mu = C_\mu (Nm)^{-1/(1 + 2\widehat{H}_t)}.$$

9 / 14
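A toy implementation of a mean estimator of this form, assuming a simple local-average smoother for $\widehat{X}^{(n)}$ (an illustrative choice, not necessarily the paper's reconstruction step; all names are hypothetical):

```python
import numpy as np

def smooth_curve(T, Y, t_grid, h):
    """Local-average reconstruction X_hat(t) from noisy points (T, Y)."""
    X_hat = np.full(len(t_grid), np.nan)
    for i, t in enumerate(t_grid):
        mask = np.abs(T - t) <= h
        if mask.any():
            X_hat[i] = Y[mask].mean()
    return X_hat

def mean_estimator(curves, t_grid, h, k0=1):
    """mu_hat(t) = sum_n w_n(t) X_hat^(n)(t) / W_N(t), w_n = 1 iff >= k0 points near t."""
    num = np.zeros(len(t_grid))
    W = np.zeros(len(t_grid))
    for T, Y in curves:
        w = np.array([(np.abs(T - t) <= h).sum() >= k0 for t in t_grid])
        X_hat = smooth_curve(T, Y, t_grid, h)
        num[w] += X_hat[w]
        W[w] += 1.0
    return num / np.maximum(W, 1.0)

# Toy data: noisy sine curves observed at random design points.
rng = np.random.default_rng(2)
N, m = 100, 50
curves = []
for _ in range(N):
    T = np.sort(rng.uniform(0, 1, m))
    Y = np.sin(2 * np.pi * T) + 0.1 * rng.standard_normal(m)
    curves.append((T, Y))

H_hat = 1.0                                 # plug-in regularity (smooth paths)
h = (N * m) ** (-1 / (1 + 2 * H_hat))       # bandwidth of the stated form
t_grid = np.linspace(0.05, 0.95, 19)
mu_hat = mean_estimator(curves, t_grid, h)
err = np.max(np.abs(mu_hat - np.sin(2 * np.pi * t_grid)))
print(err)  # small: the estimator tracks the true mean sin(2*pi*t)
```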

Results for mean estimation

[Figure: exp1_mu]
ISE with respect to the true mean function $\mu$.
10 / 14

Estimator of the covariance function

  • For any $s, t \in \mathcal{T}$, let
    $$W_N(s, t, h) = \sum_{n=1}^{N} w_n(s, h)\, w_n(t, h).$$

  • The estimator of the covariance function is, for $|s - t| > \delta$,
    $$\widehat{\Gamma}_N(s, t; h) = \frac{1}{W_N(s, t, h)} \sum_{n=1}^{N} w_n(s, h)\, \widehat{X}^{(n)}(s)\, w_n(t, h)\, \widehat{X}^{(n)}(t) - \widehat{\mu}(s)\, \widehat{\mu}(t).$$

  • An adaptive optimal bandwidth is
    $$\widehat{h}_\Gamma = C_\Gamma (Nm)^{-1/(1 + 2\min(\widehat{H}_s, \widehat{H}_t))}.$$

11 / 14
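A small sketch of how the two adaptive bandwidths compare; the constants $C_\mu$, $C_\Gamma$ and the plug-in regularity values below are illustrative placeholders, not outputs of the paper's procedure:

```python
N, m = 100, 50            # number of curves, points per curve
H_s, H_t = 0.45, 0.60     # estimated local regularities at s and t (illustrative)
C_mu = C_Gamma = 1.0      # placeholder constants

h_mu = C_mu * (N * m) ** (-1.0 / (1 + 2 * H_t))
h_Gamma = C_Gamma * (N * m) ** (-1.0 / (1 + 2 * min(H_s, H_t)))

# The covariance bandwidth is driven by the rougher of the two directions:
# min(H_s, H_t) <= H_t gives a more negative exponent, hence h_Gamma <= h_mu.
print(h_mu, h_Gamma)
```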

Results for covariance estimation

[Figure: exp1_gamma]
ISE with respect to the true covariance function $\Gamma$.
12 / 14

Takeaway ideas

  • The available data in FDA are usually noisy measurements at discrete, possibly random, design points.

  • The usual FDA methods require the reconstruction of the curves.

  • The optimal curve recovery depends on the purpose but, in most cases, also on the regularity of the sample paths.

  • We formalize the concept of local regularity of the process, propose a simple first estimator of it, and build adaptive estimators of the mean and covariance functions.

  • The paper concerning the estimation of the regularity is available at

    DOI: 10.1214/22-EJS1997
  • A preprint of the paper for the estimation of the mean and covariance is available at

    https://arxiv.org/abs/2108.06507

Thank you for your attention!

13 / 14

References

Cai, T. T. and M. Yuan (2010). “Nonparametric Covariance Function Estimation for Functional and Longitudinal Data”. In: University of Pennsylvania and Georgia Institute of Technology, p. 36.

Cai, T. T. and M. Yuan (2011). “Optimal estimation of the mean function based on discretely sampled functional data: Phase transition”. In: The Annals of Statistics 39.5, pp. 2330–2355.

Cai, T. T. and M. Yuan (2016). “Minimax and Adaptive Estimation of Covariance Operator for Random Variables Observed on a Lattice Graph”. In: Journal of the American Statistical Association 111.513, pp. 253–265.

Hall, P. and J. L. Horowitz (2007). “Methodology and convergence rates for functional linear regression”. In: The Annals of Statistics 35.1, pp. 70–91.

Ramsay, J. and B. W. Silverman (2005). Functional Data Analysis. 2nd ed. Springer Series in Statistics. New York: Springer-Verlag.

Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer Series in Statistics. New York: Springer. DOI: 10.1007/b13794.

14 / 14
