class: center, middle, inverse, title-slide .title[ # Learning the regularity of the curves in FDA ] .subtitle[ ## with applications ] .author[ ###
Steven Golovkine
· Nicolas Klutchnikoff · Valentin Patilea ] .institute[ ### CSDA & EcoSta Workshop on Statistical Data Science ] .date[ ### August 27, 2022 ]

---

# Functional Data Analysis

</br></br>

<figure> <center> <img src="img/temperature.png" alt="temperature" width="500"/> <img src="img/precipitation.png" alt="precipitation" width="500"/> <figcaption>Canadian weather dataset <a id='cite-ramsay2005'></a>(<a href='#bib-ramsay2005'>Ramsay and Silverman, 2005</a>).</figcaption> </center> </figure>

---

# Functional Data Analysis

<figure> <center> <img src="img/power.png" alt="power" width="700"/> <figcaption>Household Active Power Consumption (UC Irvine ML Repository).</figcaption> </center> </figure>

---

# Sample path regularity — a key concept

* Data: noisy measurements of the curves at (possibly random) discrete time points.
* Recover the curves
> A nonparametric regression problem for which optimality depends on the curve regularity;
> see, *e.g.*, <a id='cite-tsybakov2009'></a>(<a href='https://doi.org/10.1007/b13794'>Tsybakov, 2009</a>).
* Estimate the mean and the covariance
> The optimal rates depend on the curve regularity;
> see, *e.g.*, <a id='cite-cai2010'></a>(<a href='#bib-cai2010'>Cai and Yuan, 2010</a>), <a id='cite-cai2011'></a>(<a href='#bib-cai2011'>Cai and Yuan, 2011</a>), <a id='cite-cai2016'></a>(<a href='#bib-cai2016'>Cai and Yuan, 2016</a>).
* Fit usual predictive models (linear, ...)
> The optimal convergence rates depend on the predictor curve regularity;
> see, *e.g.*, <a id='cite-hall2007'></a>(<a href='#bib-hall2007'>Hall and Horowitz, 2007</a>).

---

# Sample path regularity

</br>

* Let `\(\mathcal{O}_\star\)` be a neighborhood of `\(t_0\)`.
* For `\(H_{t_0} \in (0, 1)\)`, `\(L_{t_0} > 0\)`, and `\(u, v \in \mathcal{O}_\star\)`, assume that the stochastic process `\(X\)` satisfies the condition: `$$\mathbb{E}\left[(X_u - X_v)^2\right] \asymp L_{t_0}^2\lvert v - u \rvert^{2H_{t_0}}.$$`
* `\(H_{t_0}\)` is called **the local regularity of the process `\(X\)`** on `\(\mathcal{O}_\star\)`.
* Our parameter `\(H_{t_0}\)` is related to the Hurst exponent.
* The definition extends to smoother sample paths by using the derivatives of `\(X_u\)` and `\(X_v\)` instead.

---

# Local regularity vs. eigenvalue decrease rate

</br>

* The (local) regularity is related to the decrease rate of the eigenvalues of the covariance operator of the process.
* Under some conditions, if `$$\lambda_j \sim j^{-\nu}, \qquad j \geq 1,$$` for some `\(\nu > 1\)`, then `$$2(H + \delta) = \nu - 1$$` when the sample paths admit derivatives up to order `\(\delta\)` and `\(H\)` is the regularity of the derivatives of order `\(\delta\)`.

--

</br></br></br>

<h3 style="color:#005844;"><center>The value of `\(\nu\)` is usually assumed to be given!</center></h3>

---

# Observed data

* Let `\(X^{(1)}, \dots, X^{(N)}\)` be an independent sample of a random process `\(X = (X(t) : t \in [0, 1])\)` with continuous trajectories.
* For each `\(1 \leq n \leq N\)`,
> `\(M_n\)` is a random positive integer;
> `\(T_m^{(n)} \in [0, 1]\)`, `\(1 \leq m \leq M_n\)`, are the (random) observation times, *i.e.* the design points, for the curve `\(X^{(n)}\)`.
* The observations are `\((Y_m^{(n)}, T_m^{(n)})\)`, `\(1 \leq m \leq M_n\)`, `\(1 \leq n \leq N\)`, where `$$Y_m^{(n)} = X^{(n)}(T_m^{(n)}) + \sigma(T_m^{(n)}, X^{(n)}(T_m^{(n)})) e_m^{(n)}.$$`
* `\(e_m^{(n)}\)` are independent copies of a standardized error term.
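
---

# Observed data: a simulation sketch

A minimal sketch of the observation scheme above, assuming `\(X\)` is standard Brownian motion (local regularity `\(H = 1/2\)`) and a constant noise level in place of `\(\sigma(\cdot, \cdot)\)`; all names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_curve(m, sigma=0.05):
    """One noisy, discretely observed curve: Y_m = X(T_m) + sigma * e_m."""
    t = np.sort(rng.uniform(0.0, 1.0, size=m))   # random design points T_m
    # Brownian motion: independent Gaussian increments, variance = time gaps.
    dt = np.diff(t, prepend=0.0)
    x = np.cumsum(rng.normal(0.0, np.sqrt(dt)))  # X(T_m)
    y = x + sigma * rng.normal(size=m)           # noisy observations Y_m
    return t, y

# N curves, each observed at a random number M_n of design points.
N = 100
curves = [simulate_curve(m=int(rng.integers(50, 101))) for _ in range(N)]
```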
---

# Estimation

* For `\(s, t \in \mathcal{O}_\star\)`, let `$$\theta(s, t) = \mathbb{E}\left[(X_t - X_s)^2\right] \approx L_{t_0}^2 \lvert t - s \rvert^{2H_{t_0}}.$$`
* Let `\(t_1\)` and `\(t_3\)` be such that `\([t_1, t_3] \subset \mathcal{O}_\star\)`, and denote by `\(t_2\)` the midpoint of `\([t_1, t_3]\)`.
* A natural proxy of `\(H_{t_0}\)` is given by `$$\frac{\log(\theta(t_1, t_3)) - \log(\theta(t_1, t_2))}{2\log 2}, \quad\text{if}~ t_3 - t_1 ~\text{is small.}$$`

--

* An estimator of `\(H_{t_0}\)` is given by `$$\frac{\log(\widehat \theta(t_1, t_3)) - \log(\widehat \theta(t_1, t_2))}{2\log 2}, \quad\text{if}~ t_3 - t_1 ~\text{is small,}$$` where, given a nonparametric estimator `\(\widetilde{X}_{t}\)` of `\(X_{t}\)`, `$$\widehat\theta(s, t) = \frac{1}{N}\sum_{n = 1}^N \left(\widetilde{X}^{(n)}_{t} - \widetilde{X}^{(n)}_{s}\right)^2.$$`
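
---

# Estimation: a code sketch

A minimal sketch of the log-ratio estimator of `\(H_{t_0}\)`, applied to the simulated curves of the earlier slide; a crude moving-window average stands in for the nonparametric smoother `\(\widetilde{X}\)`, and all names and tuning values are illustrative.

```python
import numpy as np

def smooth_at(t_obs, y_obs, t, h=0.05):
    """Crude stand-in for the presmoother X~: average of the
    observations falling in the window [t - h, t + h]."""
    mask = np.abs(t_obs - t) <= h
    return y_obs[mask].mean()

def estimate_H(curves, t0, delta=0.1, h=0.05):
    """Log-ratio estimator of the local regularity H_{t0}."""
    # t2 is the midpoint of [t1, t3], a small interval around t0.
    t1, t2, t3 = t0 - delta, t0, t0 + delta

    def theta_hat(s, t):
        # Empirical mean of (X~_t - X~_s)^2 over the N curves.
        diffs = [smooth_at(tt, yy, t, h) - smooth_at(tt, yy, s, h)
                 for tt, yy in curves]
        return np.mean(np.square(diffs))

    return (np.log(theta_hat(t1, t3))
            - np.log(theta_hat(t1, t2))) / (2 * np.log(2))
```

For the simulated Brownian curves, `estimate_H(curves, t0=0.5)` should return a value close to the true regularity `\(H = 1/2\)`.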
---

# Estimator of the mean function

* For any `\(t \in \mathcal{T}\)`, let `$$w_n(t; h) = 1 \quad \text{if}\quad \sum_{m = 1}^{M_n} \mathbf{1}\{\vert T_m^{(n)} - t \vert \leq h\} \geq k_0, \quad n \in \{1, \dots, N\},$$` and `\(w_n(t; h) = 0\)` otherwise.
* Then, `$$W_N(t; h) = \sum_{n = 1}^N w_n(t; h).$$`
* The estimator of the mean function is `$$\widehat{\mu}_N(t; h) = \frac{1}{W_N(t; h)}\sum_{n = 1}^N w_n(t; h)\widehat{X}^{(n)}(t), \quad t \in \mathcal{T}.$$`
* An adaptive optimal bandwidth is `$$\widehat{h}_{\mu}^\star = C_\mu (N\mathfrak{m})^{-1 / (1 + 2\widehat{H}_t)},$$` with `\(\mathfrak{m}\)` the expected number of observation points per curve.

---

# Results for mean estimation

<figure> <center> <img src="img/exp1_mu.png" alt="exp1_mu" width="800"/> <figcaption>ISE with respect to the true mean function `\(\mu\)`.</figcaption> </center> </figure>

---

# Estimator of the covariance function

* For any `\(s \neq t\)`, let `$$W_N(s, t; h) = \sum_{n = 1}^N w_n(s; h)w_n(t; h).$$`
* The estimator of the covariance function is, for `\(\lvert s - t \rvert > \delta\)`, `$$\widehat{\Gamma}_N(s, t; h) = \frac{1}{W_N(s, t; h)}\sum_{n = 1}^N w_n(s; h)\widehat{X}^{(n)}(s)\, w_n(t; h)\widehat{X}^{(n)}(t) - \widehat{\mu}^\star(s)\widehat{\mu}^\star(t).$$`
* An adaptive optimal bandwidth is `$$\widehat{h}_{\Gamma}^\star = C_\Gamma (N\mathfrak{m})^{-1 / (1 + 2\min(\widehat{H}_s, \widehat{H}_t))}.$$`

---

# Results for covariance estimation

<figure> <center> <img src="img/exp1_gamma.png" alt="exp1_gamma" width="800"/> <figcaption>ISE with respect to the true covariance function `\(\Gamma\)`.</figcaption> </center> </figure>

---

# Takeaway ideas

* The available data in FDA are usually **noisy** measurements at **discrete, possibly random, design points**.
* The usual FDA methods require the **reconstruction** of the curves.
* The optimal curve recovery depends on the purpose, but in most cases it **depends on the regularity** of the sample paths.
* We formalize the concept of local regularity of the process and propose a **simple** estimator of it, together with adaptive estimators of the mean and covariance functions.
* The paper on the estimation of the regularity is available at <center> <a href="https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-16/issue-1/Learning-the-smoothness-of-noisy-curves-with-application-to-online/10.1214/22-EJS1997.full">DOI: 10.1214/22-EJS1997</a> </center>
* A preprint of the paper on the estimation of the mean and covariance is available at <center> <a href="https://arxiv.org/abs/2108.06507">https://arxiv.org/abs/2108.06507</a> </center>

<h2 style="color:#005844;"><center>Thank you for your attention!</center></h2>

---

# References

<p><cite><a id='bib-cai2010'></a><a href="#cite-cai2010">Cai, T. T. and M. Yuan</a> (2010). “Nonparametric Covariance Function Estimation for Functional and Longitudinal Data”. Technical report. University of Pennsylvania and Georgia Institute of Technology.</cite></p>
<p><cite><a id='bib-cai2011'></a><a href="#cite-cai2011">Cai, T. T. and M. Yuan</a> (2011). “Optimal estimation of the mean function based on discretely sampled functional data: Phase transition”. In: <em>Annals of Statistics</em> 39.5, pp. 2330–2355.</cite></p>
<p><cite><a id='bib-cai2016'></a><a href="#cite-cai2016">Cai, T. T. and M. Yuan</a> (2016). “Minimax and Adaptive Estimation of Covariance Operator for Random Variables Observed on a Lattice Graph”. In: <em>Journal of the American Statistical Association</em> 111.513, pp. 253–265.</cite></p>
<p><cite><a id='bib-hall2007'></a><a href="#cite-hall2007">Hall, P. and J. L. Horowitz</a> (2007). “Methodology and convergence rates for functional linear regression”. In: <em>The Annals of Statistics</em> 35.1, pp. 70–91.</cite></p>
<p><cite><a id='bib-ramsay2005'></a><a href="#cite-ramsay2005">Ramsay, J. and B. W. Silverman</a> (2005). <em>Functional Data Analysis</em>. 2nd ed. Springer Series in Statistics. New York: Springer-Verlag.</cite></p>
<p><cite><a id='bib-tsybakov2009'></a><a href="#cite-tsybakov2009">Tsybakov, A. B.</a> (2009). <em>Introduction to Nonparametric Estimation</em>. Springer Series in Statistics. New York, NY: Springer. DOI: <a href="https://doi.org/10.1007/b13794">10.1007/b13794</a>.</cite></p>
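
---

# Appendix: mean estimation — a code sketch

A sketch of the mean estimator with its adaptive bandwidth, continuing the earlier sketches (it reuses the simulated `curves` and `estimate_H`); the window average standing in for `\(\widehat{X}^{(n)}(t)\)`, and the choices of `\(C_\mu\)` and `\(k_0\)`, are illustrative.

```python
import numpy as np

def mean_estimate(curves, t, h, k0=2):
    """Sketch of mu_hat_N(t; h): average a crude estimate of X^(n)(t)
    over the curves with at least k0 design points in [t - h, t + h]."""
    values = []
    for t_obs, y_obs in curves:
        in_window = np.abs(t_obs - t) <= h
        if in_window.sum() >= k0:                    # w_n(t; h) = 1
            values.append(y_obs[in_window].mean())   # stand-in for X^(n)(t)
    return np.mean(values)  # (1 / W_N) * sum of w_n(t; h) X^(n)(t)

# Adaptive bandwidth: h_mu_star = C_mu * (N * m)^(-1 / (1 + 2 * H_hat)).
N = len(curves)
m_bar = np.mean([len(t_obs) for t_obs, _ in curves])  # proxy for m
H_hat = estimate_H(curves, t0=0.5)                    # earlier sketch
h_star = 1.0 * (N * m_bar) ** (-1.0 / (1.0 + 2.0 * H_hat))  # C_mu = 1
mu_hat = mean_estimate(curves, t=0.5, h=h_star)
```

For the simulated Brownian curves, `mu_hat` should be close to the true mean `\(\mu(t) = 0\)`.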