class: center, middle, inverse, title-slide .title[ # On the geometric interpretation of MFPCA ] .subtitle[ ## and the usage of the Gram matrix ] .author[ ###
Steven Golovkine
· Edward Gunning · Andrew J. Simpkin · Norma Bargary ] .institute[ ### 43rd Conference on Applied Statistics in Ireland ] .date[ ### May 16th, 2023 ]

---

# Multivariate functional data

<figure> <center> <img src="data:image/png;base64,#./img/data_matrix.svg" alt="observation" width="60%"/> <figcaption>Data matrix.</figcaption> </center> </figure>

---

# Some notation

- Observation space: `$$\mathcal{H} = \underbrace{\mathcal{L}^2(\mathcal{T}_1) \times \cdots \times \mathcal{L}^2(\mathcal{T}_P)}_{P \text{ terms}}.$$`
- Inner product in `\(\mathcal{H}\)`: `$$\langle\!\langle f, g \rangle\!\rangle = \sum_{p = 1}^P \int_{\mathcal{T}_p} f^{(p)}(t_p)g^{(p)}(t_p)\mathrm{d}t_p.$$`
- For `\(N\)` realizations of a process `\(X\)`, we denote the mean function by `\(\mu\)`, the covariance operator by `\(\Gamma\)` and the Gram (inner-product) matrix by `\(M\)`.
- Each feature of each observation is sampled on a regular grid of `\(M_p\)` points.

---

# Cloud of individuals

<br>

<figure> <center> <img src="data:image/png;base64,#./img/cloud_obs.svg" alt="cloud_obs" width="45%"/> <img src="data:image/png;base64,#./img/cloud_obs_proj.svg" alt="cloud_obs_proj" width="45%"/> <figcaption>Cloud of observations.</figcaption> </center> </figure>

---

# Cloud of individuals

* Let `\(p_n, n \in \{1, \dots, N\}\)` be a weight on each observation such that `\(\sum_n p_n = 1\)`.
* Distance between observations `$$d^2(\mathrm{M}_f, \mathrm{M}_g) = \langle\!\langle f - g, f - g \rangle\!\rangle, \quad f, g \in \mathcal{H}.$$`
* Inertia of the cloud `\(\mathcal{C}_{\!N}\)` `$$\sum_{n = 1}^N p_n d^2(\mathrm{M}_n, \mathrm{G}_{\!N}) = \frac{1}{2}\sum_{n = 1}^N \sum_{m = 1}^N p_np_m d^2(\mathrm{M}_n, \mathrm{M}_m) = \sum_{p = 1}^P \int_{\mathcal{T}_p} \text{Var} X^{(p)}(t_p)\mathrm{d}t_p.$$`

---

# Cloud of features

<figure> <center> <img src="data:image/png;base64,#./img/cloud_features.svg" alt="cloud_features" width="45%"/> <img src="data:image/png;base64,#./img/cloud_features_proj.svg" alt="cloud_features_proj" width="45%"/> <figcaption>Cloud of features.</figcaption> </center> </figure>

---

# Cloud of features

* Distance between features `$$d^2(\mathrm{M}_f, \mathrm{M}_g) = \sum_{n = 1}^N p_n \langle\!\langle X_n - \mu, f - g\rangle\!\rangle^2, \quad f, g \in \mathcal{H}.$$`
* Inertia of the cloud `\(\mathcal{C}_{\!P}\)` `$$\sum_{n = 1}^N p_n d^2(\mathrm{M}_n, \mathrm{O}_{\mathbb{R}}) = \sum_{p = 1}^P \int_{\mathcal{T}_p} \text{Var} X^{(p)}(t_p)\mathrm{d}t_p.$$`
* Correlation coefficient `$$\cos \theta_{fg} = \frac{\sum_{n = 1}^N p_n \langle\!\langle X_n - \mu, f \rangle\!\rangle \langle\!\langle X_n - \mu, g \rangle\!\rangle}{\left(\sum_{n = 1}^N p_n \langle\!\langle X_n - \mu, f \rangle\!\rangle^2\right)^{1/2}\left(\sum_{n = 1}^N p_n \langle\!\langle X_n - \mu, g \rangle\!\rangle^2\right)^{1/2}}.$$`

---

# Duality diagram

<figure> <center> <img src="data:image/png;base64,#img/duality_diagram.svg" alt="diagram" width="50%"/> <figcaption>Duality diagram (extended from <a id='cite-delacruzDualityDiagramData2011'></a><a href='#bib-delacruzDualityDiagramData2011'>De la Cruz and Holmes (2011)</a>).</figcaption> </center> </figure>

---

# MFPCA

* Consider the matrix `\(M\)` of size `\(N \times N\)` with entries `$$M_{ij} = \langle\!\langle X_i - \mu, X_j - \mu\rangle\!\rangle, \quad i, j \in \{1, \dots, N\}.$$`
* Eigenvalues of `\(\Gamma\)` and
`\(M\)` are related by `$$\lambda_k = l_k / N, \quad k = 1, 2, \dots$$`
* Eigenfunctions `\(\phi_k\)` of `\(\Gamma\)` and eigenvectors `\(v_k = (v_{1k}, \dots, v_{Nk})^\top\)` of `\(M\)` are related by `$$\phi_k(t) = \frac{1}{\sqrt{l_k}}\sum_{n = 1}^N v_{nk}(X_n(t) - \mu(t)), \quad k = 1, 2, \dots$$`
* Scores are given by `$$c_{nk} = \sqrt{l_k}v_{nk}, \quad n = 1, \dots, N, \quad k = 1, 2, \dots$$`

---

# Computational complexity

* Let `\(M^a = \sum_{p = 1}^P M_p^a\)` and `\(K = \sum_{p = 1}^P K_p\)`, where `\(K_p\)` is the number of univariate components retained for feature `\(p\)`.
* Using the diagonalization of the covariance operator (<a id='cite-happMultivariateFunctionalPrincipal2015'></a><a href='https://doi.org/10.1080/01621459.2016.1273115'>Happ and Greven (2018)</a>) `$$\mathcal{O}\left(\underbrace{NM^2 + M^3 + N\sum_{p = 1}^P M_pK_p}_{\substack{\text{Univariate covariance} \\ \text{decomposition}}} + \underbrace{NK^2 + K^3}_{\substack{\text{Univariate scores} \\ \text{decomposition}}} + \underbrace{K\sum_{p = 1}^P M_pK_p + NK^2}_{\substack{\text{Multivariate eigencomponents} \\ \text{and scores estimation}}}\right).$$`
* Using the diagonalization of the inner product matrix `$$\mathcal{O}\left(\underbrace{N^2M^1 + N^3}_{\substack{\text{Gram matrix} \\ \text{decomposition}}} + \underbrace{KPN + KN}_{\substack{\text{Multivariate eigencomponents} \\ \text{and scores estimation}}}\right).$$`
* Note that the smoothing step is not included in these complexity estimates.
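
---

# MFPCA from the Gram matrix: a numerical sketch

The relations between the eigenelements of `\(\Gamma\)` and `\(M\)` can be checked numerically. The block below is a minimal sketch, not the authors' implementation: it assumes uniform weights `\(p_n = 1/N\)`, approximates the inner product with a trapezoidal rule, and uses simulated data. The names `trapezoid_weights` and `gram_mfpca` are hypothetical.

```python
import numpy as np


def trapezoid_weights(grid):
    """Trapezoidal quadrature weights on a regular grid (an assumption)."""
    w = np.full(grid.size, grid[1] - grid[0])
    w[[0, -1]] /= 2
    return w


def gram_mfpca(data, grids, n_components):
    """MFPCA through the Gram matrix, assuming uniform weights 1/N.

    data: list of P arrays, the p-th of shape (N, M_p).
    """
    n_obs = data[0].shape[0]
    weights = [trapezoid_weights(t) for t in grids]
    centered = [x - x.mean(axis=0) for x in data]

    # M_ij = <<X_i - mu, X_j - mu>>, summed over the P features.
    gram = sum((x * w) @ x.T for x, w in zip(centered, weights))

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    l_vals, v = np.linalg.eigh(gram)
    order = np.argsort(l_vals)[::-1][:n_components]
    l_vals, v = l_vals[order], v[:, order]

    eigenvalues = l_vals / n_obs  # lambda_k = l_k / N
    # phi_k = l_k^{-1/2} sum_n v_nk (X_n - mu), one (K, M_p) array per feature
    eigenfunctions = [(v.T @ x) / np.sqrt(l_vals)[:, None] for x in centered]
    scores = v * np.sqrt(l_vals)  # c_nk = sqrt(l_k) v_nk
    return eigenvalues, eigenfunctions, scores, weights, centered


# Simulated example: P = 2 features on different grids, rank-3 signal.
rng = np.random.default_rng(42)
N, K = 50, 3
grids = [np.linspace(0, 1, 101), np.linspace(0, 1, 51)]
coef = rng.normal(size=(N, K)) * np.array([2.0, 1.0, 0.5])
data = [
    coef @ np.stack([np.sin((k + 1) * np.pi * t) for k in range(K)])
    for t in grids
]

lam, phi, scores, weights, centered = gram_mfpca(data, grids, K)

# Eigenfunctions should be orthonormal for the discretized inner product.
gram_phi = sum((f * w) @ f.T for f, w in zip(phi, weights))
# Scores should equal the projections <<X_n - mu, phi_k>>.
proj = sum((x * w) @ f.T for x, f, w in zip(centered, phi, weights))
# The data have rank 3, so the K = 3 eigenvalues sum to the inertia.
inertia = sum((x.var(axis=0) * w).sum() for x, w in zip(data, weights))
```

Here `gram_phi` should be close to the identity, `proj` close to `scores`, and `lam.sum()` close to `inertia`, which ties the eigenvalues back to the inertia of the cloud of individuals.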
---

# Simulation of multivariate functional data

<figure> <center> <img src="data:image/png;base64,#./img/computation_time_1.svg" alt="comput_time_image_1" width="100%"/> </center> </figure>

---

# Simulation of multivariate functional data

<figure> <center> <img src="data:image/png;base64,#./img/mise_1.svg" alt="reconst_error_image_1" width="100%"/> </center> </figure>

---

# Simulation of image data

<figure> <center> <img src="data:image/png;base64,#./img/computation_time.svg" alt="comput_time_image" width="100%"/> </center> </figure>

---

# Simulation of image data

<figure> <center> <img src="data:image/png;base64,#./img/mise.svg" alt="reconst_error_image" width="100%"/> </center> </figure>

---

# Takeaway ideas

* We gave a geometric interpretation of the duality between rows and columns of a functional data matrix.
* We provided relationships between the eigenelements of the covariance operator and those of the Gram matrix.
* When to use the covariance operator?
  - Only one-dimensional curves / For sparse to relatively dense functional data
* When to use the Gram matrix?
  - For two-dimensional functional data (images) / For ultra-dense functional data

<br>

<h2 style="color:#005844;"><center>Thank you for your attention!</center></h2>

---

# References

<p><cite><a id='bib-delacruzDualityDiagramData2011'></a><a href="#cite-delacruzDualityDiagramData2011">De la Cruz, O. and S. Holmes</a> (2011). “The Duality Diagram in Data Analysis: Examples of Modern Applications”. In: <em>The Annals of Applied Statistics</em> 5.4, pp. 2266–2277. ISSN: 1932-6157.</cite></p>

<p><cite><a id='bib-happMultivariateFunctionalPrincipal2015'></a><a href="#cite-happMultivariateFunctionalPrincipal2015">Happ, C. and S. Greven</a> (2018). “Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains”. In: <em>Journal of the American Statistical Association</em> 113.522, pp. 649–659.
DOI: <a href="https://doi.org/10.1080/01621459.2016.1273115">10.1080/01621459.2016.1273115</a>.</cite></p>