Mathematical Foundations¶
Canonical Correlation Analysis (CCA) and Partial Least Squares (PLS) models are effective ways of finding associations between multiple views of data.
PCA¶
It is helpful to start by formulating PCA mathematically. The first principal component can be written as the solution to the optimisation problem:
\[ \begin{align}\begin{aligned}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_1w_1 \}\\\text{subject to:}\\w_1^Tw_1=1\end{aligned}\end{align} \]
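That is, \(w_{opt}\) is the leading eigenvector of the covariance matrix \(X_1^TX_1\) (assuming the columns of \(X_1\) are centred, and up to a scaling factor), or equivalently the first right singular vector of \(X_1\).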
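As a minimal sketch of the PCA problem above, the following NumPy snippet computes the first principal component in two ways and checks that they agree. The toy data, the random seed and the variable names (`w_eig`, `w_svd`, `objective`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (an assumption): 200 samples, 5 features with decreasing variance.
X1 = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])
X1 = X1 - X1.mean(axis=0)  # centre so X1.T @ X1 is proportional to the covariance

# Route 1: leading eigenvector of X1^T X1 (eigh returns ascending eigenvalues).
eigvals, eigvecs = np.linalg.eigh(X1.T @ X1)
w_eig = eigvecs[:, -1]

# Route 2: first right singular vector of X1.
_, s, Vt = np.linalg.svd(X1, full_matrices=False)
w_svd = Vt[0]

# Value of the objective w^T X1^T X1 w at the solution.
objective = w_eig @ X1.T @ X1 @ w_eig
```

Both routes return a unit-norm vector, and the two vectors agree up to sign. The objective value equals the largest eigenvalue of \(X_1^TX_1\), which is the square of the largest singular value of \(X_1\).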
PLS¶
Now consider two data matrices, \(X_1\) and \(X_2\), with the same number of samples. It is tempting to write a slightly different optimisation problem:
\[ \begin{align}\begin{aligned}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\text{subject to:}\\w_1^Tw_1=1\\w_2^Tw_2=1\end{aligned}\end{align} \]
This problem is solved by the leading left and right singular vectors of the cross-covariance matrix \(X_1^TX_2\).
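The PLS problem above can be sketched with a single SVD of the cross-covariance matrix. The synthetic two-view data with a shared latent signal `z`, and the names `w1`, `w2` and `pls_objective`, are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Two toy views (an assumption) driven by one shared latent signal z plus noise.
z = rng.normal(size=(n, 1))
X1 = z @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(n, 4))
X2 = z @ rng.normal(size=(1, 3)) + 0.5 * rng.normal(size=(n, 3))
X1 = X1 - X1.mean(axis=0)
X2 = X2 - X2.mean(axis=0)

# SVD of the cross-covariance matrix X1^T X2 (up to a scaling factor).
U, s, Vt = np.linalg.svd(X1.T @ X2, full_matrices=False)
w1 = U[:, 0]   # leading left singular vector, unit norm
w2 = Vt[0]     # leading right singular vector, unit norm

# Value of the objective w1^T X1^T X2 w2 at the solution.
pls_objective = w1 @ X1.T @ X2 @ w2
```

The objective value at the solution equals the largest singular value `s[0]`, and no other pair of unit-norm weight vectors can exceed it.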
CCA¶
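To arrive at Canonical Correlation Analysis, we replace the unit-norm constraints on the weights with unit-variance constraints on the projections \(X_1w_1\) and \(X_2w_2\):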
\[ \begin{align}\begin{aligned}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\text{subject to:}\\w_1^TX_1^TX_1w_1=1\\w_2^TX_2^TX_2w_2=1\end{aligned}\end{align} \]
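One common way to solve the CCA problem above is to whiten each view and then take an SVD, which reduces it to the PLS form. The sketch below follows that route; it assumes both within-view matrices are invertible (more samples than features), and the helper `inv_sqrt` and the toy data are hypothetical names introduced for illustration.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

rng = np.random.default_rng(0)
n = 300

# Two toy views (an assumption) sharing one latent signal z.
z = rng.normal(size=(n, 1))
X1 = z @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(n, 4))
X2 = z @ rng.normal(size=(1, 3)) + 0.5 * rng.normal(size=(n, 3))
X1 = X1 - X1.mean(axis=0)
X2 = X2 - X2.mean(axis=0)

# Whitening matrices for each view.
S11_inv_sqrt = inv_sqrt(X1.T @ X1)
S22_inv_sqrt = inv_sqrt(X2.T @ X2)

# SVD of the whitened cross-covariance matrix.
T = S11_inv_sqrt @ X1.T @ X2 @ S22_inv_sqrt
U, s, Vt = np.linalg.svd(T, full_matrices=False)

# Map the singular vectors back to weights in the original feature spaces.
w1 = S11_inv_sqrt @ U[:, 0]
w2 = S22_inv_sqrt @ Vt[0]
rho = s[0]  # first canonical correlation
```

With these weights both constraints hold, so the objective value `rho` is exactly the Pearson correlation between the projections `X1 @ w1` and `X2 @ w2`.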
Deep CCA¶
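To arrive at Deep CCA, we replace the linear projections \(X_1w_1\) and \(X_2w_2\) with nonlinear functions \(f(X_1)\) and \(f(X_2)\), typically neural networks, where \(w\) now denotes their parameters: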
\[ \begin{align}\begin{aligned}w_{opt}=\underset{w}{\mathrm{argmax}}\{ f(X_1)^Tf(X_2) \}\\\text{subject to:}\\f(X_1)^Tf(X_1)=1\\f(X_2)^Tf(X_2)=1\end{aligned}\end{align} \]
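The snippet below is a rough sketch that only evaluates the Deep CCA objective for fixed networks; it does not train them. The one-hidden-layer network `mlp`, the helper `normalise`, the random weights and the toy data are all hypothetical. The constraints are enforced here by centring and rescaling the outputs, which is one simple choice rather than the method of any particular library.

```python
import numpy as np

def mlp(X, W, v):
    """One-hidden-layer network with tanh units and one scalar output per sample."""
    return np.tanh(X @ W) @ v

def normalise(h):
    """Centre the outputs and scale them to unit norm so that h^T h = 1."""
    h = h - h.mean()
    return h / np.linalg.norm(h)

rng = np.random.default_rng(0)
n = 500

# Two toy views (an assumption) that depend nonlinearly on a shared signal z.
z = rng.normal(size=n)
X1 = np.column_stack([np.sin(z), rng.normal(size=n)])
X2 = np.column_stack([z ** 2, z, rng.normal(size=n)])

# Random, untrained parameters for one network per view.
W1, v1 = rng.normal(size=(2, 8)), rng.normal(size=8)
W2, v2 = rng.normal(size=(3, 8)), rng.normal(size=8)

h1 = normalise(mlp(X1, W1, v1))
h2 = normalise(mlp(X2, W2, v2))

# Objective f(X1)^T f(X2) under the unit-norm constraints.
dcca_objective = h1 @ h2
```

Because both outputs are centred and have unit norm, the objective is the Pearson correlation between them and lies in \([-1, 1]\). Training would adjust the network parameters, typically by gradient ascent, to increase this value.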