class: center, middle, inverse, title-slide

.title[
# On the geometric interpretation of MFPCA
]
.subtitle[
## and the use of the Gram matrix
]
.author[
### Steven Golovkine · Edward Gunning · Andrew J. Simpkin · Norma Bargary
]
.institute[
### 54es Journées de Statistique de la SFdS
]
.date[
### July 5th, 2023
]

---

# Multivariate functional data

<figure>
<center>
<img src="data:image/png;base64,#./img/data_matrix.svg" alt="observation" width="70%"/>
</center>
</figure>

---

# Some notations

- Observation space:
`$$\mathcal{H} = \underbrace{\mathcal{L}^2(\mathcal{T}_1) \times \cdots \times \mathcal{L}^2(\mathcal{T}_P)}_{P \text{ terms}}.$$`

- Inner product in `\(\mathcal{H}\)`:
`$$\langle\!\langle f, g \rangle\!\rangle = \sum_{p = 1}^P \int_{\mathcal{T}_p} f^{(p)}(t_p)g^{(p)}(t_p)\mathrm{d}t_p.$$`

- For `\(N\)` realizations of a process `\(X\)`, we denote
  * the mean function `\(\mu\)`,
  * the covariance operator `\(\Gamma\)`, with covariance kernel `\(C\)`,
  * the Gram (inner-product) matrix `\(\mathbf{M}\)`.

- Each feature of each observation is sampled on a regular grid of `\(M_p\)` points.
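
---

# Some notations

In practice, the inner product is approximated by quadrature on the sampling grids. A minimal `numpy` sketch (a sketch only; each feature is assumed to be stored as an array of values on its regular grid, and the function name is illustrative, not taken from any package):

```python
import numpy as np

def inner_product(f, g, grids):
    """Approximate <<f, g>> = sum_p of the integral of f^(p) g^(p) over T_p
    by trapezoidal quadrature on each feature's sampling grid."""
    return sum(np.trapz(f_p * g_p, t_p) for f_p, g_p, t_p in zip(f, g, grids))

# Toy example with P = 2 features observed on [0, 1].
t1, t2 = np.linspace(0, 1, 101), np.linspace(0, 1, 51)
f = [np.sin(2 * np.pi * t1), t2**2]
g = [np.cos(2 * np.pi * t1), 1 - t2]
print(inner_product(f, g, [t1, t2]))
```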

---

# Cloud of individuals

<br>
<figure>
<center>
<img src="data:image/png;base64,#./img/cloud_obs.svg" alt="cloud_obs" width="47%"/>
<img src="data:image/png;base64,#./img/cloud_obs_proj.svg" alt="cloud_obs_proj" width="47%"/>
</center>
</figure>

---

# Cloud of individuals

* Let `\(\pi_n, n \in \{1, \dots, N\}\)`, be weights on the observations such that `\(\sum_n \pi_n = 1\)`.

* Distance between observations
`$$d^2(\mathrm{M}_f, \mathrm{M}_g) = \langle\!\langle f - g, f - g \rangle\!\rangle, \quad f, g \in \mathcal{H}.$$`

* Inertia of the cloud `\(\mathcal{C}_{\!N}\)` using `\(d\)`
`$$\sum_{n = 1}^N \pi_n d^2(\mathrm{M}_n, \mathrm{G}_{\mu}) = \frac{1}{2}\sum_{n = 1}^N \sum_{m = 1}^N \pi_n \pi_m d^2(\mathrm{M}_n, \mathrm{M}_m) = \sum_{p = 1}^P \int_{\mathcal{T}_p} \text{Var} X^{(p)}(t_p)\mathrm{d}t_p.$$`

--

* Another distance between observations
`$$d^2_{\Gamma}(\mathrm{M}_f, \mathrm{M}_g) = \langle\!\langle f - g, \Gamma(f - g) \rangle\!\rangle, \quad f, g \in \mathcal{H}.$$`

* Inertia of the cloud `\(\mathcal{C}_{\!N}\)` using `\(d_{\Gamma}\)`
`$$\sum_{n = 1}^N \pi_n d^2_{\Gamma}(\mathrm{M}_n, \mathrm{G}_{\mu}) = \frac{1}{2}\sum_{n = 1}^N \sum_{m = 1}^N \pi_n \pi_m d^2_{\Gamma}(\mathrm{M}_n, \mathrm{M}_m) = \sum_{p = 1}^P \int_{\mathcal{T}_p} \lvert\!\lvert\!\lvert C_{p \cdot}(t_p, \cdot) \rvert\!\rvert\!\rvert^2 \mathrm{d}t_p.$$`

---

# Cloud of features

<figure>
<center>
<img src="data:image/png;base64,#./img/cloud_features.svg" alt="cloud_features" width="47%"/>
<img src="data:image/png;base64,#./img/cloud_features_proj.svg" alt="cloud_features_proj" width="47%"/>
</center>
</figure>

---

# Cloud of features

* Distance between features
`$$\mathrm{d}^2(\mathrm{M}_f, \mathrm{M}_g) = \sum_{n = 1}^N \pi_n \langle\!\langle X_n - \mu, f - g\rangle\!\rangle^2, \quad f, g \in \mathcal{H}.$$`

* Inertia of the cloud `\(\mathcal{C}_{\!P}\)`
`$$\sum_{n = 1}^N \pi_n \mathrm{d}^2(\mathrm{M}_n, \mathrm{G}_{\mu}) = \frac{1}{2}\sum_{n = 1}^N \sum_{m = 1}^N \pi_n \pi_m d^2_{\Gamma}(\mathrm{M}_n, \mathrm{M}_m) = \sum_{p = 1}^P \int_{\mathcal{T}_p} \lvert\!\lvert\!\lvert C_{p \cdot}(t_p, \cdot) \rvert\!\rvert\!\rvert^2 \mathrm{d}t_p.$$`

* Correlation coefficient
`$$\cos \theta_{fg} = \frac{\sum_{n = 1}^N \pi_n \langle\!\langle X_n - \mu, f \rangle\!\rangle \langle\!\langle X_n - \mu, g \rangle\!\rangle}{\left(\sum_{n = 1}^N \pi_n \langle\!\langle X_n - \mu, f \rangle\!\rangle^2\right)^{1/2}\left(\sum_{n = 1}^N \pi_n \langle\!\langle X_n - \mu, g \rangle\!\rangle^2\right)^{1/2}} = \frac{\langle\!\langle f, \Gamma g \rangle\!\rangle}{\langle\!\langle f, \Gamma f \rangle\!\rangle^{1/2}\langle\!\langle g, \Gamma g \rangle\!\rangle^{1/2}}.$$`

---

# Duality diagram

<figure>
<center>
<img src="data:image/png;base64,#img/duality_diagram.svg" alt="diagram" width="50%"/>
<figcaption>Duality diagram (extended from <a id='cite-delacruzDualityDiagramData2011'></a><a href='#bib-delacruzDualityDiagramData2011'>De la Cruz and Holmes (2011)</a>).</figcaption>
</center>
</figure>

---

# MFPCA

* Consider the matrix `\(\mathbf{M}\)` of size `\(N \times N\)` with entries
`$$\mathbf{M}_{ij} = \sqrt{\pi_i \pi_j}\langle\!\langle X_i - \mu, X_j - \mu\rangle\!\rangle, \quad i, j = 1, \dots, N.$$`

* Eigenvalues of `\(\Gamma\)` and `\(\mathbf{M}\)` are related by
`$$\lambda_k = l_k, \quad k = 1, 2, \dots$$`

* Eigenfunctions of `\(\Gamma\)` and eigenvectors of `\(\mathbf{M}\)` are related (taking uniform weights `\(\pi_n = 1/N\)`) by
`$$\phi_k(t) = \frac{1}{\sqrt{Nl_k}}\sum_{n = 1}^N v_{nk}\{X_n(t) - \mu(t)\}, \quad k = 1, 2, \dots$$`

* Scores are given by
`$$c_{nk} = \sqrt{Nl_k}v_{nk}, \quad n = 1, \dots, N, \quad k = 1, 2, \dots$$`
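
---

# MFPCA

These relations translate directly into code. A minimal `numpy` sketch of the Gram-matrix route (a sketch only, assuming uniform weights `\(\pi_n = 1/N\)`, features stored as `\((N, M_p)\)` arrays on regular grids, and illustrative names not taken from any package):

```python
import numpy as np

def trapz_weights(t):
    """Trapezoidal quadrature weights for a sampling grid t."""
    w = np.zeros_like(t, dtype=float)
    dt = np.diff(t)
    w[:-1] += dt / 2
    w[1:] += dt / 2
    return w

def mfpca_gram(X, grids, n_components=3):
    """MFPCA via the Gram matrix, with uniform weights pi_n = 1/N.

    X     : list of length P; the p-th entry is an (N, M_p) array holding
            the p-th feature of the N observations on a regular grid.
    grids : list of the P sampling grids (1-d arrays of length M_p).
    """
    N = X[0].shape[0]
    Xc = [x - x.mean(axis=0) for x in X]               # centre each feature

    # Gram matrix M_ij = (1/N) <<X_i - mu, X_j - mu>>.
    M = sum((xc * trapz_weights(t)) @ xc.T for xc, t in zip(Xc, grids)) / N

    # Eigendecomposition of the symmetric N x N matrix, largest values first.
    l, v = np.linalg.eigh(M)
    l, v = l[::-1][:n_components], v[:, ::-1][:, :n_components]

    # phi_k = (1 / sqrt(N l_k)) sum_n v_nk (X_n - mu), one block per feature.
    phis = [xc.T @ (v / np.sqrt(N * l)) for xc in Xc]  # list of (M_p, K)
    # c_nk = sqrt(N l_k) v_nk.
    scores = v * np.sqrt(N * l)                        # (N, K)
    return l, phis, scores
```

Whatever the grid sizes `\(M_p\)`, the eigenproblem solved here is only `\(N \times N\)`, which is what makes the Gram-matrix route attractive for very dense grids and images.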

---

# Computational complexity

* Let `\(M^a = \sum_{p = 1}^P M_p^a\)` and `\(K = \sum_{p = 1}^P K_p\)`.

* Using the diagonalization of the covariance operator (<a id='cite-happMultivariateFunctionalPrincipal2015'></a><a href='https://doi.org/10.1080/01621459.2016.1273115'>Happ and Greven (2018)</a>)
`$$\mathcal{O}\left(\underbrace{NM^2 + M^3 + N\sum_{p = 1}^P M_pK_p}_{\substack{\text{Univariate covariance} \\ \text{decomposition}}} + \underbrace{NK^2 + K^3}_{\substack{\text{Univariate scores} \\ \text{decomposition}}} + \underbrace{K\sum_{p = 1}^P M_pK_p + NK^2}_{\substack{\text{Multivariate eigencomponents} \\ \text{and scores estimation}}}\right).$$`

* Using the diagonalization of the inner product matrix
`$$\mathcal{O}\left(\underbrace{N^2M^1 + N^3}_{\substack{\text{Gram matrix} \\ \text{decomposition}}} + \underbrace{KPN + KN}_{\substack{\text{Multivariate eigencomponents} \\ \text{and scores estimation}}}\right).$$`

* Note that the smoothing step is not included in these computational complexities.

---

# Simulation of multivariate functional data

<figure>
<center>
<img src="data:image/png;base64,#./img/computation_time_1.svg" alt="comput_time_image_1" width="100%"/>
</center>
</figure>

---

# Simulation of multivariate functional data

<figure>
<center>
<img src="data:image/png;base64,#./img/mise_1.svg" alt="reconst_error_image_1" width="100%"/>
</center>
</figure>

---

# Simulation of image data

<figure>
<center>
<img src="data:image/png;base64,#./img/computation_time.svg" alt="comput_time_image" width="100%"/>
</center>
</figure>

---

# Simulation of image data

<figure>
<center>
<img src="data:image/png;base64,#./img/mise.svg" alt="reconst_error_image" width="100%"/>
</center>
</figure>

---

# Takeaway ideas

* We gave a geometric interpretation of the duality between the rows and columns of a functional data matrix.

* We provided relationships between the eigenelements of the covariance operator and those of the Gram matrix (a toy numerical check is sketched in the appendix slide).

* When to use the covariance operator?
  - For one-dimensional curves only.
  - For sparse to relatively dense functional data.

* When to use the Gram matrix?
  - For two-dimensional (or higher-dimensional) functional data, e.g., images.
  - For ultra-dense functional data.

* The paper is available on arXiv: [arXiv:2306.12949](https://arxiv.org/abs/2306.12949)

<h2 style="color:#005844;"><center>Thank you for your attention!</center></h2>

---

# References

<p><cite><a id='bib-delacruzDualityDiagramData2011'></a><a href="#cite-delacruzDualityDiagramData2011">De la Cruz, O. and S. Holmes</a> (2011). “The Duality Diagram in Data Analysis: Examples of Modern Applications”. In: <em>The Annals of Applied Statistics</em> 5.4, pp. 2266–2277. ISSN: 1932-6157.</cite></p>

<p><cite><a id='bib-happMultivariateFunctionalPrincipal2015'></a><a href="#cite-happMultivariateFunctionalPrincipal2015">Happ, C. and S. Greven</a> (2018). “Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains”. In: <em>Journal of the American Statistical Association</em> 113.522, pp. 649–659. DOI: <a href="https://doi.org/10.1080/01621459.2016.1273115">10.1080/01621459.2016.1273115</a>.</cite></p>
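
---

# Appendix: a toy check of the duality

A small simulated check (illustrative only: a single one-dimensional feature, uniform weights, rank-two toy data) that the discretized covariance operator and the Gram matrix share their nonzero eigenvalues, i.e. `\(\lambda_k = l_k\)`:

```python
import numpy as np

rng = np.random.default_rng(42)
N, M = 20, 101
t = np.linspace(0, 1, M)
dt = t[1] - t[0]

# Toy sample: random combinations of two Fourier basis functions.
basis = np.vstack([np.sqrt(2) * np.sin(2 * np.pi * t),
                   np.sqrt(2) * np.cos(2 * np.pi * t)])
X = (rng.normal(size=(N, 2)) * np.array([2.0, 1.0])) @ basis
Xc = X - X.mean(axis=0)

# Covariance route: eigenvalues of the discretized covariance kernel.
lam_cov = np.linalg.eigvalsh(Xc.T @ Xc / N * dt)[::-1][:2]

# Gram route: eigenvalues of M_ij = (1/N) <X_i - mu, X_j - mu>.
lam_gram = np.linalg.eigvalsh(Xc @ Xc.T / N * dt)[::-1][:2]

print(np.allclose(lam_cov, lam_gram))   # True: the nonzero spectra coincide
```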