Models
Regularized Canonical Correlation Analysis and Partial Least Squares
Canonical Correlation Analysis
- class cca_zoo.models.rcca.CCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None)[source]
A class used to fit a simple CCA model
Implements CCA by inheriting regularised CCA with 0 regularisation
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_1^TX_1^TX_1w_1=n\\w_2^TX_2^TX_2w_2=n\end{aligned}\end{align} \]
- Citation
Hotelling, Harold. “Relations between two sets of variates.” Breakthroughs in statistics. Springer, New York, NY, 1992. 162-190.
- Example
>>> from cca_zoo.models import CCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> model = CCA()
>>> model.fit((X1, X2)).score((X1, X2))
array([1.])
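To make the optimisation above concrete, here is a minimal NumPy sketch of the first canonical pair computed with a thin QR decomposition followed by an SVD. It is an illustration of the maths only, not the library's implementation; the function name `cca_first_pair` and the synthetic data are assumptions introduced for this example.

```python
import numpy as np

def cca_first_pair(X1, X2):
    """Return (w1, w2, rho) for the first canonical pair of two views."""
    # Centre each column; the constraints above assume zero-mean data.
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = X1.shape[0]
    # Thin QR gives an orthonormal basis Q for each view's column space.
    Q1, R1 = np.linalg.qr(X1)
    Q2, R2 = np.linalg.qr(X2)
    # The singular values of Q1^T Q2 are the canonical correlations.
    U, s, Vt = np.linalg.svd(Q1.T @ Q2)
    # Map back to weights and rescale so that w^T X^T X w = n.
    w1 = np.linalg.solve(R1, U[:, 0]) * np.sqrt(n)
    w2 = np.linalg.solve(R2, Vt[0]) * np.sqrt(n)
    return w1, w2, s[0]

# Synthetic two-view data sharing one latent signal (illustrative only).
rng = np.random.default_rng(0)
z = rng.standard_normal(200)
X1 = np.outer(z, rng.standard_normal(5)) + rng.standard_normal((200, 5))
X2 = np.outer(z, rng.standard_normal(4)) + rng.standard_normal((200, 4))

w1, w2, rho = cca_first_pair(X1, X2)
score1 = (X1 - X1.mean(axis=0)) @ w1
score2 = (X2 - X2.mean(axis=0)) @ w2
```

Here `rho` should match the sample correlation between `score1` and `score2`, and each score vector has squared norm `n`, which is the constraint in the formula above.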
Constructor for CCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
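The `centre` and `scale` options can be pictured as a standardisation step whose statistics are learned on the training data and reused for out-of-sample data. The sketch below shows that idea with plain NumPy; the helper names and the choice of `ddof=1` for the standard deviation are assumptions for illustration, not details confirmed by this page.

```python
import numpy as np

def fit_standardiser(X, centre=True, scale=True):
    """Learn per-column mean and standard deviation from training data."""
    mean = X.mean(axis=0) if centre else np.zeros(X.shape[1])
    # ddof=1 (sample standard deviation) is an assumption of this sketch.
    std = X.std(axis=0, ddof=1) if scale else np.ones(X.shape[1])
    return mean, std

def apply_standardiser(X, mean, std):
    """Apply previously learned statistics, e.g. to out-of-sample data."""
    return (X - mean) / std

rng = np.random.default_rng(0)
X_train = rng.random((50, 3)) * [1.0, 10.0, 100.0]
X_test = rng.random((20, 3)) * [1.0, 10.0, 100.0]

mean, std = fit_standardiser(X_train)
Z_train = apply_standardiser(X_train, mean, std)
Z_test = apply_standardiser(X_test, mean, std)  # reuses training statistics
```

After this step the training columns have zero mean and unit variance, while the test columns are only approximately standardised because they reuse the training statistics.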
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
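The effect of `normalize` can be sketched as the difference between a raw feature-by-score cross-product and a feature-by-score correlation. The function below is a hypothetical stand-in written for this illustration; it assumes the scores are projections of the same view and does not claim to reproduce the library's exact computation.

```python
import numpy as np

def loadings_sketch(X, scores, normalize=False):
    """X: (n, p) view; scores: (n, d) projections of that view."""
    Xc = X - X.mean(axis=0)
    Zc = scores - scores.mean(axis=0)
    cross = Xc.T @ Zc  # unnormalised loadings, shape (p, d)
    if not normalize:
        return cross
    # Dividing by the column norms turns each entry into a correlation.
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Zc, axis=0))
    return cross / norms

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
weights = rng.standard_normal((4, 2))  # stand-in for fitted weights
scores = X @ weights

raw = loadings_sketch(X, scores)
corr = loadings_sketch(X, scores, normalize=True)
```

With `normalize=True` every entry lies in [-1, 1] and can be read as the correlation between one feature and one latent dimension.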
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
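The relationship between `pairwise_correlations` (an array of shape `(k, k, latent_dims)`) and `score` (an average over all pairs of views) can be sketched as follows. The function names are hypothetical and the inputs are already-transformed views; this is an illustration of the described behaviour, not the library's code.

```python
import numpy as np

def pairwise_correlations_sketch(scores):
    """scores: list of k arrays of shape (n, d). Returns a (k, k, d) array."""
    k, d = len(scores), scores[0].shape[1]
    out = np.empty((k, k, d))
    for i, zi in enumerate(scores):
        for j, zj in enumerate(scores):
            for t in range(d):
                out[i, j, t] = np.corrcoef(zi[:, t], zj[:, t])[0, 1]
    return out

def average_offdiagonal(all_corrs):
    """Average over the k*(k-1) ordered pairs of distinct views."""
    k = all_corrs.shape[0]
    # The k diagonal entries are self-correlations equal to 1; remove them.
    return (all_corrs.sum(axis=(0, 1)) - k) / (k * k - k)

rng = np.random.default_rng(0)
shared = rng.standard_normal((100, 2))
scores = [shared + 0.5 * rng.standard_normal((100, 2)) for _ in range(3)]

all_corrs = pairwise_correlations_sketch(scores)
avg = average_offdiagonal(all_corrs)  # one value per latent dimension
```

For two views the average reduces to the single correlation between the two sets of scores in each dimension.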
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Partial Least Squares
- class cca_zoo.models.rcca.PLS(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None)[source]
A class used to fit a simple PLS model
Implements PLS by inheriting regularised CCA with maximal regularisation
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_1^Tw_1=1\\w_2^Tw_2=1\end{aligned}\end{align} \]
- Example
>>> from cca_zoo.models import PLS
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> model = PLS()
>>> model.fit((X1, X2)).score((X1, X2))
array([0.81796873])
Constructor for PLS
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Ridge Regularized Canonical Correlation Analysis
- class cca_zoo.models.rcca.rCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None, eps=0.001, accept_sparse=None)[source]
A class used to fit a Regularised CCA (canonical ridge) model. Uses PCA to perform the optimization efficiently for high-dimensional data.
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\(1-c_1)w_1^TX_1^TX_1w_1+c_1w_1^Tw_1=n\\(1-c_2)w_2^TX_2^TX_2w_2+c_2w_2^Tw_2=n\end{aligned}\end{align} \]
- Citation
Vinod, Hrishikesh D. “Canonical ridge and econometrics of joint production.” Journal of econometrics 4.2 (1976): 147-166.
- Example
>>> from cca_zoo.models import rCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> model = rCCA(c=[0.1, 0.1])
>>> model.fit((X1, X2)).score((X1, X2))
array([0.95222128])
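The constraint above interpolates between the CCA constraint (c = 0) and a norm constraint on the weights (c = 1). The following NumPy sketch solves the problem for the first pair by whitening with a Cholesky factor and taking an SVD. It is a direct dense solution written for illustration, so it does not use the PCA-based speed-up mentioned above, and the function names are assumptions of this example.

```python
import numpy as np

def ridge_cca_first_pair(X1, X2, c1, c2):
    """First weight pair under (1-c) w^T X^T X w + c w^T w = n."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = X1.shape[0]
    # Regularised constraint matrices, one per view.
    A1 = (1 - c1) * X1.T @ X1 + c1 * np.eye(X1.shape[1])
    A2 = (1 - c2) * X2.T @ X2 + c2 * np.eye(X2.shape[1])
    L1 = np.linalg.cholesky(A1)  # A1 = L1 L1^T
    L2 = np.linalg.cholesky(A2)
    # Whitened cross-product matrix L1^{-1} X1^T X2 L2^{-T}.
    M = np.linalg.solve(L1, X1.T @ X2) @ np.linalg.inv(L2).T
    U, s, Vt = np.linalg.svd(M)
    w1 = np.sqrt(n) * np.linalg.solve(L1.T, U[:, 0])
    w2 = np.sqrt(n) * np.linalg.solve(L2.T, Vt[0])
    return w1, w2

def constraint_value(X, w, c):
    """Left-hand side of the constraint for one view."""
    Xc = X - X.mean(axis=0)
    return (1 - c) * (Xc @ w) @ (Xc @ w) + c * (w @ w)

rng = np.random.default_rng(0)
X1 = rng.standard_normal((50, 5))
X2 = rng.standard_normal((50, 4))

w1, w2 = ridge_cca_first_pair(X1, X2, 0.1, 0.1)
w1_pls, w2_pls = ridge_cca_first_pair(X1, X2, 1.0, 1.0)
```

With c = 1 the constraint matrices become the identity, so the weights are the leading singular vectors of `X1^T X2` scaled by `sqrt(n)`, which is the PLS direction.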
Constructor for rCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
eps – epsilon for stability
accept_sparse – which forms are accepted for sparse data
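Because `c` may be an iterable, a single float, or None, a model needs one value per view before fitting. The sketch below shows one plausible way to expand it. `expand_c` is a hypothetical helper, and both the default of 0 for None and the range check are assumptions drawn from the "between 0:CCA and 1:PLS" description rather than confirmed library behaviour.

```python
from collections.abc import Iterable

def expand_c(c, n_views, default=0.0):
    """Hypothetical helper: return one regularisation value per view."""
    if c is None:
        values = [default] * n_views       # assumption: None means no ridge
    elif isinstance(c, Iterable):
        values = [float(v) for v in c]     # one value per view, as given
    else:
        values = [float(c)] * n_views      # broadcast a single float
    if len(values) != n_views:
        raise ValueError(f"expected {n_views} values of c, got {len(values)}")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("each c must lie between 0 (CCA) and 1 (PLS)")
    return values

per_view_default = expand_c(None, 2)
per_view_scalar = expand_c(0.1, 3)
per_view_list = expand_c([0.1, 0.9], 2)
```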
- fit(views, y=None, **kwargs)[source]
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
GCCA and KGCCA
Generalized (MAXVAR) Canonical Correlation Analysis
- class cca_zoo.models.gcca.GCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None, view_weights=None)[source]
A class used to fit a GCCA model. For more than 2 views, GCCA optimizes the sum of correlations with a shared auxiliary vector.
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ \sum_iw_i^TX_i^TT \}\\\end{split}\\\text{subject to:}\\T^TT=1\end{aligned}\end{align} \]
- Citation
Tenenhaus, Arthur, and Michel Tenenhaus. “Regularized generalized canonical correlation analysis.” Psychometrika 76.2 (2011): 257.
- Example
>>> from cca_zoo.models import GCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> X3 = rng.random((10, 5))
>>> model = GCCA()
>>> model.fit((X1, X2, X3)).score((X1, X2, X3))
array([0.97229856])
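One standard way to obtain the shared auxiliary vector T in the MAXVAR formulation is as the leading eigenvector of the sum of the views' projection matrices. The sketch below illustrates that textbook solution for the unregularised, equally weighted case; it ignores `c` and `view_weights`, uses a hypothetical function name, and may differ from how the library computes its solution.

```python
import numpy as np

def maxvar_first_component(views):
    """Return (T, top_eigenvalue, projectors) for a list of (n, p_i) views."""
    n = views[0].shape[0]
    S = np.zeros((n, n))
    projectors = []
    for X in views:
        Xc = X - X.mean(axis=0)
        # Q spans the column space of the centred view, so Q Q^T projects onto it.
        Q, _ = np.linalg.qr(Xc)
        P = Q @ Q.T
        projectors.append(P)
        S += P
    # eigh returns eigenvalues in ascending order with unit-norm eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(S)
    return eigvecs[:, -1], eigvals[-1], projectors

rng = np.random.default_rng(0)
z = rng.standard_normal(60)
views = [np.outer(z, rng.standard_normal(4)) + rng.standard_normal((60, 4))
         for _ in range(3)]

T, top_eigval, projectors = maxvar_first_component(views)
```

Each view's weights then follow from a least-squares regression of T on that view, and the top eigenvalue equals the sum over views of `T^T P_i T`, which is at most the number of views.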
Constructor for GCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – regularisation between 0 (CCA) and 1 (PLS)
view_weights (Optional[Iterable[float]]) – list of weights of each view
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Kernel Generalized (MAXVAR) Canonical Correlation Analysis
- class cca_zoo.models.gcca.KGCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None, eps=0.001, kernel=None, gamma=None, degree=None, coef0=None, kernel_params=None)[source]
A class used to fit a KGCCA model. For more than 2 views, KGCCA optimizes the sum of correlations with a shared auxiliary vector.
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ \sum_i\alpha_i^TK_i^TT \}\\\end{split}\\\text{subject to:}\\T^TT=1\end{aligned}\end{align} \]
- Citation
Tenenhaus, Arthur, Cathy Philippe, and Vincent Frouin. “Kernel generalized canonical correlation analysis.” Computational Statistics & Data Analysis 90 (2015): 114-131.
- Example
>>> from cca_zoo.models import KGCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> X3 = rng.random((10, 5))
>>> model = KGCCA()
>>> model.fit((X1, X2, X3)).score((X1, X2, X3))
array([0.97019284])
Constructor for KGCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
eps – epsilon value to ensure stability of smallest eigenvalues
kernel (Optional[Iterable[Union[float, callable]]]) – Iterable of kernel mappings used internally. This parameter is directly passed to pairwise_kernel. If an element of kernel is a string, it must be one of the metrics in pairwise.PAIRWISE_KERNEL_FUNCTIONS. Alternatively, if an element of kernel is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two rows from X as input and return the corresponding kernel value as a single number. This means that callables from sklearn.metrics.pairwise are not allowed, as they operate on matrices, not single samples. Use the string identifying the kernel instead.
gamma (Optional[Iterable[float]]) – Iterable of gamma parameters for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
degree (Optional[Iterable[float]]) – Iterable of degree parameters of the polynomial kernel. Ignored by other kernels.
coef0 (Optional[Iterable[float]]) – Iterable of zero coefficients for polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params (Optional[Iterable[dict]]) – Iterable of additional parameters (keyword arguments) for kernel function passed as callable object.
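The requirement that a callable kernel takes two rows and returns a single number can be illustrated without any library code. In the sketch below, `rbf_rows` and `gram_matrix` are hypothetical names; the Gram matrix is built by calling the kernel on every pair of rows, which mirrors the behaviour described for callable kernels.

```python
import math

def rbf_rows(x, y, gamma=0.5):
    """Kernel on two single rows: returns one float, not a matrix."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def gram_matrix(X, kernel, **kernel_params):
    """Call the row-wise kernel on every pair of rows of X."""
    return [[kernel(xi, xj, **kernel_params) for xj in X] for xi in X]

X = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
K = gram_matrix(X, rbf_rows, gamma=0.5)
```

The extra keyword arguments play the role described for `kernel_params`: they are forwarded unchanged to the callable for every pair of rows.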
- transform(views, y=None, **kwargs)[source]
Transforms data given a fit KGCCA model
- Parameters
views (ndarray) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
MCCA and KCCA
Multiset (SUMCOR) Canonical Correlation Analysis
- class cca_zoo.models.mcca.MCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None, eps=0.001)[source]
A class used to fit an MCCA model. For more than 2 views, MCCA optimizes the sum of pairwise correlations.
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} w_i^TX_i^TX_jw_j \}\\\end{split}\\\text{subject to:}\\(1-c_i)w_i^TX_i^TX_iw_i+c_iw_i^Tw_i=1\end{aligned}\end{align} \]
- Citation
Kettenring, Jon R. “Canonical analysis of several sets of variables.” Biometrika 58.3 (1971): 433-451.
- Example
>>> from cca_zoo.models import MCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> X3 = rng.random((10, 5))
>>> model = MCCA()
>>> model.fit((X1, X2, X3)).score((X1, X2, X3))
array([0.97200847])
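A common way to make the multiset objective above tractable is to replace the per-view constraints with a single summed constraint, which turns the problem into a generalized eigenvalue problem. The sketch below takes that route with NumPy only. The summed constraint is an assumption of this illustration (a relaxation of the formula above), and `mcca_first_weights` is a hypothetical name, not the library's API.

```python
import numpy as np

def mcca_first_weights(views, c):
    """Leading weights for centred views under a summed ridge constraint."""
    dims = [X.shape[1] for X in views]
    stacked = np.hstack(views)
    C = stacked.T @ stacked          # all within- and between-view blocks
    D = np.zeros_like(C)             # block-diagonal constraint matrix
    start = 0
    for X, p, ci in zip(views, dims, c):
        block = slice(start, start + p)
        D[block, block] = (1 - ci) * X.T @ X + ci * np.eye(p)
        C[block, block] = 0.0        # keep only the j != i terms
        start += p
    # Solve C w = lambda D w by whitening with the Cholesky factor of D.
    L = np.linalg.cholesky(D)
    L_inv = np.linalg.inv(L)
    eigvals, eigvecs = np.linalg.eigh(L_inv @ C @ L_inv.T)
    w = L_inv.T @ eigvecs[:, -1]
    return np.split(w, np.cumsum(dims)[:-1]), eigvals[-1]

rng = np.random.default_rng(0)
z = rng.standard_normal(80)
raw = [np.outer(z, rng.standard_normal(3)) + rng.standard_normal((80, 3))
       for _ in range(3)]
centred = [X - X.mean(axis=0) for X in raw]
c = [0.1, 0.1, 0.1]

weights, top_eigval = mcca_first_weights(centred, c)
```

The top eigenvalue equals the objective value, that is, the sum over ordered pairs of distinct views of `w_i^T X_i^T X_j w_j`, while the constraint terms sum to one across views.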
Constructor for MCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
eps – epsilon for stability
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Kernel Multiset (SUMCOR) Canonical Correlation Analysis
- class cca_zoo.models.mcca.KCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None, eps=0.001, kernel=None, gamma=None, degree=None, coef0=None, kernel_params=None)[source]
A class used to fit a KCCA model.
- Maths
\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\c_i\alpha_i^TK_i\alpha_i + (1-c_i)\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
- Example
>>> from cca_zoo.models import KCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> X3 = rng.random((10, 5))
>>> model = KCCA()
>>> model.fit((X1, X2, X3)).score((X1, X2, X3))
array([0.96893666])
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
eps – epsilon value to ensure stability of smallest eigenvalues
kernel (Optional[Iterable[Union[float, callable]]]) – Iterable of kernel mappings used internally. This parameter is directly passed to pairwise_kernel. If an element of kernel is a string, it must be one of the metrics in pairwise.PAIRWISE_KERNEL_FUNCTIONS. Alternatively, if an element of kernel is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two rows from X as input and return the corresponding kernel value as a single number. This means that callables from sklearn.metrics.pairwise are not allowed, as they operate on matrices, not single samples. Use the string identifying the kernel instead.
gamma (Optional[Iterable[float]]) – Iterable of gamma parameters for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
degree (Optional[Iterable[float]]) – Iterable of degree parameters of the polynomial kernel. Ignored by other kernels.
coef0 (Optional[Iterable[float]]) – Iterable of zero coefficients for polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params (Optional[Iterable[dict]]) – Iterable of additional parameters (keyword arguments) for kernel function passed as callable object.
- transform(views, **kwargs)[source]
Transforms data given a fit KCCA model
- Parameters
views (ndarray) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
Tensor Canonical Correlation Analysis
Tensor Canonical Correlation Analysis
- class cca_zoo.models.tcca.TCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None)[source]
Fits a Tensor CCA model. Tensor CCA maximises higher order correlations
- Maths
\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
- Citation
Kim, Tae-Kyun, Shu-Fai Wong, and Roberto Cipolla. “Tensor canonical correlation analysis for action classification.” 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2007 https://github.com/rciszek/mdr_tcca
- Example
>>> from cca_zoo.models import TCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> X3 = rng.random((10, 5))
>>> model = TCCA()
>>> model.fit((X1, X2, X3)).score((X1, X2, X3))
array([1.14595755])
Constructor for TCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
- fit(views, y=None, **kwargs)[source]
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- correlations(views, **kwargs)[source]
Predicts the correlation for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- score(views, **kwargs)[source]
Returns the higher order correlations in each dimension
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Kernel Tensor Canonical Correlation Analysis
- class cca_zoo.models.tcca.KTCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, eps=0.001, c=None, kernel=None, gamma=None, degree=None, coef0=None, kernel_params=None)[source]
Fits a Kernel Tensor CCA model. Tensor CCA maximises higher order correlations
- Maths
\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
- Citation
Kim, Tae-Kyun, Shu-Fai Wong, and Roberto Cipolla. “Tensor canonical correlation analysis for action classification.” 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2007
- Example
>>> from cca_zoo.models import KTCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10, 5))
>>> X2 = rng.random((10, 5))
>>> X3 = rng.random((10, 5))
>>> model = KTCCA()
>>> model.fit((X1, X2, X3)).score((X1, X2, X3))
array([1.69896269])
Constructor for KTCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
kernel (Optional[Iterable[Union[float, callable]]]) – Iterable of kernel mappings used internally. This parameter is directly passed to pairwise_kernel. If element of kernel is a string, it must be one of the metrics in pairwise.PAIRWISE_KERNEL_FUNCTIONS. Alternatively, if element of kernel is a callable function, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two rows from X as input and return the corresponding kernel value as a single number. This means that callables from sklearn.metrics.pairwise are not allowed, as they operate on matrices, not single samples. Use the string identifying the kernel instead.
gamma (Optional[Iterable[float]]) – Iterable of gamma parameters for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels. Interpretation of the default value is left to the kernel; see the documentation for sklearn.metrics.pairwise. Ignored by other kernels.
degree (Optional[Iterable[float]]) – Iterable of degree parameters of the polynomial kernel. Ignored by other kernels.
coef0 (Optional[Iterable[float]]) – Iterable of zero coefficients for polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params (Optional[Iterable[dict]]) – Iterable of additional parameters (keyword arguments) for kernel function passed as callable object.
eps – epsilon value to ensure stability
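The callable form of kernel described above receives two single rows and returns one number. The following is a minimal NumPy-only sketch of how such a callable can be turned into a Gram matrix by calling it on every pair of rows. The names row_kernel and gram_matrix are hypothetical helpers written for illustration; they are not part of cca_zoo or scikit-learn.

```python
import numpy as np


def row_kernel(x, y, gamma=0.5):
    """Hypothetical RBF-style kernel acting on two single samples (1-D arrays)."""
    diff = x - y
    return float(np.exp(-gamma * np.dot(diff, diff)))


def gram_matrix(X, kernel_fn):
    """Call kernel_fn on each pair of rows of X and record the resulting value."""
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel_fn(X[i], X[j])
    return K


rng = np.random.RandomState(0)
X = rng.random((10, 5))
K = gram_matrix(X, row_kernel)
print(K.shape)  # (10, 10)
```

A matrix-valued function would not fit this calling pattern, which is consistent with the note that callables operating on whole matrices are not allowed.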
- transform(views, **kwargs)[source]
Transforms data given a fit kernel CCA model
- Parameters
views (ndarray) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- correlations(views, **kwargs)
Predicts the correlation for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- fit(views, y=None, **kwargs)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, **kwargs)
Returns the higher order correlations in each dimension
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
More Complex Regularisation using Iterative Models
Normal CCA and PLS by alternating least squares
Quicker and more memory efficient for very large data
CCA by Alternating Least Squares
- class cca_zoo.models.CCA_ALS(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, deflation='cca', max_iter=100, initialization='random', tol=1e-09, stochastic=True, positive=None)[source]
Fits a CCA model with CCA deflation by NIPALS algorithm. Implemented by ElasticCCA with no regularisation
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]
- Citation
Golub, Gene H., and Hongyuan Zha. “The canonical correlations of matrix pairs and their numerical computation.” Linear algebra for signal processing. Springer, New York, NY, 1995. 27-49.
- Example
>>> from cca_zoo.models import CCA_ALS
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,3))
>>> X2 = rng.random((10,3))
>>> model = CCA_ALS(random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.858906])
Constructor for CCA_ALS
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (str) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
stochastic – use stochastic regression optimisers for subproblems
positive (Union[Iterable[bool], bool, None]) – constrain model weights to be positive
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
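As a rough illustration of CCA by alternating least squares, the sketch below regresses each view onto the other view's current score and rescales the weights so that w_i^T X_i^T X_i w_i = n. It uses plain NumPy, handles two views and a single latent dimension, and omits deflation; cca_als_sketch is a hypothetical name, and this is not how CCA_ALS is implemented internally.

```python
import numpy as np


def cca_als_sketch(X1, X2, n_iter=200, seed=0):
    """Alternate least-squares regressions between two centred views."""
    n = X1.shape[0]
    rng = np.random.default_rng(seed)
    w2 = rng.standard_normal(X2.shape[1])  # random initialisation
    for _ in range(n_iter):
        # Regress the view-2 score on view 1, then rescale so that
        # w1^T X1^T X1 w1 = n.
        w1 = np.linalg.lstsq(X1, X2 @ w2, rcond=None)[0]
        w1 *= np.sqrt(n) / np.linalg.norm(X1 @ w1)
        # Same update with the roles of the views swapped.
        w2 = np.linalg.lstsq(X2, X1 @ w1, rcond=None)[0]
        w2 *= np.sqrt(n) / np.linalg.norm(X2 @ w2)
    return w1, w2


rng = np.random.default_rng(0)
z = rng.standard_normal((200, 1))  # shared latent signal (synthetic data)
X1 = z + 0.1 * rng.standard_normal((200, 3))
X2 = z + 0.1 * rng.standard_normal((200, 3))
X1 -= X1.mean(axis=0)
X2 -= X2.mean(axis=0)

w1, w2 = cca_als_sketch(X1, X2)
corr = float((X1 @ w1) @ (X2 @ w2)) / X1.shape[0]

# Reference value: largest singular value of Q1^T Q2, where Q1 and Q2 are
# orthonormal bases of the two views.
q1, _ = np.linalg.qr(X1)
q2, _ = np.linalg.qr(X2)
rho = np.linalg.svd(q1.T @ q2, compute_uv=False)[0]
```

On this synthetic data the alternating updates should agree with the SVD-based canonical correlation, since the iteration behaves like a power method on the product of the two projection matrices.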
PLS by Alternating Least Squares
- class cca_zoo.models.PLS_ALS(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, max_iter=100, initialization='random', tol=1e-09)[source]
A class used to fit a PLS model
Fits a partial least squares model with CCA deflation by NIPALS algorithm
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2\}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\end{aligned}\end{align} \]
- Example
>>> from cca_zoo.models import PLS_ALS
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = PLS_ALS(random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796854])
Constructor for PLS_ALS
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (Union[str, callable]) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
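For two views, the alternating updates behind a NIPALS-style PLS fit can be sketched as a power iteration: each weight vector is set to the cross-covariance applied to the other view's score and then normalised to unit length (w_i^T w_i = 1). The sketch below is NumPy-only, covers one latent dimension without deflation, and uses the hypothetical name pls_power_iteration; it is an illustration under those assumptions, not the PLS_ALS source.

```python
import numpy as np


def pls_power_iteration(X1, X2, n_iter=200, seed=0):
    """Alternate unit-norm weight updates for two centred views."""
    rng = np.random.default_rng(seed)
    w2 = rng.standard_normal(X2.shape[1])
    w2 /= np.linalg.norm(w2)
    for _ in range(n_iter):
        # w1 is proportional to X1^T (X2 w2), then normalised to unit length.
        w1 = X1.T @ (X2 @ w2)
        w1 /= np.linalg.norm(w1)
        # w2 is proportional to X2^T (X1 w1), then normalised to unit length.
        w2 = X2.T @ (X1 @ w1)
        w2 /= np.linalg.norm(w2)
    return w1, w2


rng = np.random.default_rng(0)
z = rng.standard_normal((200, 1))  # shared latent signal (synthetic data)
X1 = z + 0.1 * rng.standard_normal((200, 3))
X2 = z + 0.1 * rng.standard_normal((200, 3))
X1 -= X1.mean(axis=0)
X2 -= X2.mean(axis=0)

w1, w2 = pls_power_iteration(X1, X2)
covariance = float(w1 @ X1.T @ X2 @ w2)

# Reference value: the largest singular value of the cross-product matrix.
top_singular_value = np.linalg.svd(X1.T @ X2, compute_uv=False)[0]
```

The fixed point of these updates is the leading pair of singular vectors of X1^T X2, which is why the attained covariance can be compared against the top singular value.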
Sparsity Inducing Models
Penalized Matrix Decomposition (Sparse PLS)
- class cca_zoo.models.PMD(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, deflation='cca', c=None, max_iter=100, initialization='pls', tol=1e-09, positive=None)[source]
Fits a Sparse CCA (Penalized Matrix Decomposition) model.
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\\\|w_i\|_1\leq c_i\end{aligned}\end{align} \]
- Citation
Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. “A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis.” Biostatistics 10.3 (2009): 515-534.
- Example
>>> from cca_zoo.models import PMD
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = PMD(c=[1,1],random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796873])
Constructor for PMD
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – l1 regularisation parameter between 1 and sqrt(number of features) for each view
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (Union[str, callable]) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
positive (Union[Iterable[bool], bool, None]) – constrain model weights to be positive
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
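The PMD constraints (unit l2 norm and an l1 bound c between 1 and sqrt(number of features)) are commonly handled by soft-thresholding with a threshold found by binary search, in the spirit of Witten et al. The sketch below shows that single step in NumPy. The function names are hypothetical, and the code assumes c >= 1 and a unique largest absolute entry; it is not taken from the PMD class.

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink every entry of a towards zero by delta."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def l1_constrained_unit_vector(a, c, n_steps=60):
    """Return w = S(a, delta) / ||S(a, delta)||_2 with ||w||_1 <= c.

    Assumes c >= 1 and that the largest |a_j| is unique.
    """
    w = a / np.linalg.norm(a)
    if np.abs(w).sum() <= c:
        return w  # l1 constraint already satisfied, no thresholding needed
    lo, hi = 0.0, np.abs(a).max()
    for _ in range(n_steps):
        mid = 0.5 * (lo + hi)
        s = soft_threshold(a, mid)
        if np.abs(s).sum() > c * np.linalg.norm(s):
            lo = mid  # still too dense: threshold harder
        else:
            hi = mid  # feasible: try a smaller threshold
    s = soft_threshold(a, hi)
    return s / np.linalg.norm(s)


a = np.array([3.0, 1.0, 0.5, -2.0])
w_sparse = l1_constrained_unit_vector(a, c=1.2)
w_dense = l1_constrained_unit_vector(a, c=2.0)
```

With c close to 1 most entries are driven to zero, while c = sqrt(number of features) leaves the normalised vector untouched, which matches the stated range of the c parameter.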
Sparse CCA by iterative lasso regression
- class cca_zoo.models.SCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, deflation='cca', c=None, max_iter=100, maxvar=False, initialization='pls', tol=1e-09, stochastic=False, positive=None)[source]
Fits a sparse CCA model by iterative rescaled lasso regression. Implemented by ElasticCCA with l1 ratio=1
For default maxvar=False, the optimisation is given by:
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]
- Citation
Mai, Qing, and Xin Zhang. “An iterative penalized least squares approach to sparse canonical correlation analysis.” Biometrics 75.3 (2019): 734-744.
For maxvar=True, the optimisation is given by the ElasticCCA problem with no l2 regularisation:
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}, t_{opt}=\underset{w,t}{\mathrm{argmin}}\{\sum_i \|X_iw_i-t\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\t^Tt=n\end{aligned}\end{align} \]
- Citation
Fu, Xiao, et al. “Scalable and flexible multiview MAX-VAR canonical correlation analysis.” IEEE Transactions on Signal Processing 65.16 (2017): 4150-4165.
- Example
>>> from cca_zoo.models import SCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA(c=[0.001,0.001], random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.99998761])
Constructor for SCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
maxvar (bool) – use auxiliary variable “maxvar” form
initialization (Union[str, callable]) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
c (Union[Iterable[float], float, None]) – lasso alpha
stochastic – use stochastic regression optimisers for subproblems
positive (Union[Iterable[bool], bool, None]) – constrain model weights to be positive
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
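One view update of an iterative rescaled lasso scheme, as described for SCCA, can be sketched as a lasso regression of the other view's score onto the current view, followed by rescaling so that w^T X^T X w = n. The sketch below solves the lasso with a simple proximal-gradient (ISTA) loop in NumPy rather than any particular library solver. The names lasso_ista and rescaled_lasso_update are hypothetical, and the exact objective scaling used inside SCCA is an assumption here.

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink every entry of a towards zero by delta."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def lasso_ista(X, y, alpha, n_iter=500):
    """Minimise (1/2n)||y - Xw||^2 + alpha*||w||_1 by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n bounds the gradient's Lipschitz constant.
    step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * alpha)
    return w


def rescaled_lasso_update(X, target, alpha):
    """Lasso-regress the target score on X, then rescale to w^T X^T X w = n.

    The rescaling is skipped if every weight was shrunk to zero.
    """
    w = lasso_ista(X, target, alpha)
    norm = np.linalg.norm(X @ w)
    if norm > 0:
        w = w * np.sqrt(X.shape[0]) / norm
    return w


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
target = 2.0 * X[:, 0] + 0.01 * rng.standard_normal(200)  # stand-in for the other view's score
w = rescaled_lasso_update(X, target, alpha=0.5)
```

A larger alpha zeroes more weights before the rescaling step, which is the mechanism by which the c parameter (documented as the lasso alpha) controls sparsity.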
Elastic CCA by MAXVAR
- class cca_zoo.models.ElasticCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, deflation='cca', max_iter=100, initialization='pls', tol=1e-09, c=None, l1_ratio=None, maxvar=True, stochastic=False, positive=None)[source]
Fits an elastic CCA by iterating elastic net regressions.
By default, ElasticCCA uses CCA with an auxiliary variable target i.e. MAXVAR configuration
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}, t_{opt}=\underset{w,t}{\mathrm{argmin}}\{\sum_i \|X_iw_i-t\|^2 + c\|w_i\|^2_2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\t^Tt=n\end{aligned}\end{align} \]
- Citation
Fu, Xiao, et al. “Scalable and flexible multiview MAX-VAR canonical correlation analysis.” IEEE Transactions on Signal Processing 65.16 (2017): 4150-4165.
Setting maxvar=False instead makes it attempt the SUMCOR form, which approximates a solution to the problem:
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + c\|w_i\|^2_2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]
- Example
>>> from cca_zoo.models import ElasticCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = ElasticCCA(c=[1e-1,1e-1],l1_ratio=[0.5,0.5], random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.9316638])
Constructor for ElasticCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
deflation – the type of deflation.
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (Union[str, callable]) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
c (Union[Iterable[float], float, None]) – lasso alpha
l1_ratio (Union[Iterable[float], float, None]) – l1 ratio in lasso subproblems
maxvar (bool) – use auxiliary variable “maxvar” formulation
stochastic – use stochastic regression optimisers for subproblems
positive (Union[Iterable[bool], bool, None]) – constrain model weights to be positive
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
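The MAXVAR configuration used by ElasticCCA introduces a shared auxiliary target t with t^T t = n. A minimal sketch of the unregularised version of that idea is given below: each view is regressed onto t by least squares, then t is reset to the rescaled sum of the view scores. The elastic net penalties are deliberately left out, the function name maxvar_sketch is hypothetical, and the update order is an assumption for illustration rather than the library's implementation.

```python
import numpy as np


def maxvar_sketch(views, n_iter=300, seed=0):
    """Alternate between per-view regressions and the shared target t."""
    n = views[0].shape[0]
    rng = np.random.default_rng(seed)
    t = rng.standard_normal(n)
    t *= np.sqrt(n) / np.linalg.norm(t)
    for _ in range(n_iter):
        # w_i update: least-squares regression of the shared target t on X_i.
        weights = [np.linalg.lstsq(X, t, rcond=None)[0] for X in views]
        # t update: sum the view scores and rescale so that t^T t = n.
        t = sum(X @ w for X, w in zip(views, weights))
        t *= np.sqrt(n) / np.linalg.norm(t)
    return weights, t


rng = np.random.default_rng(0)
z = rng.standard_normal((200, 1))  # shared latent signal (synthetic data)
X1 = z + 0.1 * rng.standard_normal((200, 3))
X2 = z + 0.1 * rng.standard_normal((200, 3))
X1 -= X1.mean(axis=0)
X2 -= X2.mean(axis=0)

(w1, w2), t = maxvar_sketch([X1, X2])
s1, s2 = X1 @ w1, X2 @ w2
corr = float(s1 @ s2 / (np.linalg.norm(s1) * np.linalg.norm(s2)))

# Reference value for two views: the top canonical correlation via QR and SVD.
q1, _ = np.linalg.qr(X1)
q2, _ = np.linalg.qr(X2)
rho = np.linalg.svd(q1.T @ q2, compute_uv=False)[0]
```

For two views without regularisation, the correlation of the resulting scores should coincide with ordinary CCA; adding penalties to the per-view regressions is what distinguishes the elastic variant.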
Span CCA
- class cca_zoo.models.SpanCCA(latent_dims=1, scale=True, centre=True, copy_data=True, max_iter=100, initialization='uniform', tol=1e-09, regularisation='l0', c=None, rank=1, positive=None, random_state=None, deflation='cca')[source]
Fits a Sparse CCA model using SpanCCA.
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=1\end{aligned}\end{align} \]
- Citation
Asteris, Megasthenis, et al. “A simple and provable algorithm for sparse diagonal CCA.” International Conference on Machine Learning. PMLR, 2016.
- Example
>>> from cca_zoo.models import SpanCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SpanCCA(regularisation="l0", c=[2, 2])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.84556666])
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (str) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
regularisation –
c (Union[Iterable[Union[float, int]], float, int, None]) – regularisation parameter
rank – rank of the approximation
positive (Union[Iterable[bool], bool, None]) – constrain weights to be positive
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
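The (k,k,self.latent_dims) array returned by pairwise_correlations and the averaged value returned by score can be pictured with the NumPy sketch below. It takes already-transformed scores as input, so it does not call cca_zoo at all; both function names are hypothetical, and averaging over the off-diagonal entries is an assumption based on the description "averages over all pairs for multiview".

```python
import numpy as np


def pairwise_correlations_sketch(scores):
    """scores: list of k arrays, each of shape (n_samples, latent_dims)."""
    k = len(scores)
    latent_dims = scores[0].shape[1]
    all_corrs = np.empty((k, k, latent_dims))
    for i in range(k):
        for j in range(k):
            for d in range(latent_dims):
                # Pearson correlation between view i and view j in dimension d.
                all_corrs[i, j, d] = np.corrcoef(scores[i][:, d], scores[j][:, d])[0, 1]
    return all_corrs


def average_pairwise_score(all_corrs):
    """Average over all pairs of distinct views, one value per dimension."""
    k = all_corrs.shape[0]
    off_diagonal = ~np.eye(k, dtype=bool)
    return all_corrs[off_diagonal].mean(axis=0)


rng = np.random.default_rng(0)
scores = [rng.standard_normal((20, 2)) for _ in range(3)]  # stand-in for transformed views
all_corrs = pairwise_correlations_sketch(scores)
score = average_pairwise_score(all_corrs)
print(all_corrs.shape, score.shape)  # (3, 3, 2) (2,)
```

The diagonal entries are always 1 because each view is perfectly correlated with itself, which is why they are excluded from the average.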
Parkhomenko (penalized) CCA
- class cca_zoo.models.ParkhomenkoCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, deflation='cca', c=None, max_iter=100, initialization='pls', tol=1e-09)[source]
Fits a sparse CCA (penalized CCA) model
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 - \sum_i c_i\|w_i\|_1 \}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\end{aligned}\end{align} \]
- Citation
Parkhomenko, Elena, David Tritchler, and Joseph Beyene. “Sparse canonical correlation analysis with application to genomic data integration.” Statistical applications in genetics and molecular biology 8.1 (2009).
- Example
>>> from cca_zoo.models import ParkhomenkoCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = ParkhomenkoCCA(c=[0.001,0.001],random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81803527])
Constructor for ParkhomenkoCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – l1 regularisation parameter
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (Union[str, callable]) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
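The normalize option of get_loadings is described as scaling loadings so that they represent correlations between features and scores. A small NumPy sketch of that normalised quantity follows. It covers only the normalised case, works on a single view with precomputed scores, and uses the hypothetical name feature_score_correlations; what the library returns when normalize=False is not assumed here.

```python
import numpy as np


def feature_score_correlations(X, scores):
    """Correlation between every feature column of X and every score column.

    X has shape (n_samples, n_features), scores has shape
    (n_samples, latent_dims); the result is (n_features, latent_dims).
    """
    Xc = X - X.mean(axis=0)
    Sc = scores - scores.mean(axis=0)
    cross = Xc.T @ Sc
    scale = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Sc, axis=0))
    return cross / scale


rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
scores = 3.0 * X[:, :1] + 1.0  # toy score: an affine function of feature 0
loadings = feature_score_correlations(X, scores)
print(loadings.shape)  # (4, 1)
```

In the toy case the first feature has a loading of 1 because the score is an increasing affine function of it, while the other features show only chance-level correlation.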
Sparse CCA by ADMM
- class cca_zoo.models.SCCA_ADMM(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, deflation='cca', c=None, mu=None, lam=None, eta=None, max_iter=100, initialization='pls', tol=1e-09)[source]
Fits a sparse CCA model by alternating ADMM
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=1\end{aligned}\end{align} \]
- Citation
Suo, Xiaotong, et al. “Sparse canonical correlation analysis.” arXiv preprint arXiv:1705.10865 (2017).
- Example
>>> from cca_zoo.models import SCCA_ADMM
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_ADMM(random_state=0,c=[1e-1,1e-1])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.84348183])
Constructor for SCCA_ADMM
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – l1 regularisation parameter
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (Union[str, callable]) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
mu (Union[Iterable[float], float, None]) –
lam (Union[Iterable[float], float, None]) –
eta –
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (
Iterable
[ndarray
]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (
Iterable
[ndarray
]) – list/tuple of numpy arrays or array likes with the same number of rows (samples)kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
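To make the `normalize` option concrete, the following sketch computes loadings as correlations between each feature of a view and each column of its scores. It is a minimal illustration under the assumption that "normalized loadings" means plain Pearson correlations; `correlation_loadings` and the random weights are hypothetical.

```python
import numpy as np


def correlation_loadings(X, scores):
    """Pearson correlation between every column of X and every column of scores."""
    Xc = X - X.mean(axis=0)
    Sc = scores - scores.mean(axis=0)
    cov = Xc.T @ Sc
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Sc, axis=0))
    return cov / norms


rng = np.random.RandomState(0)
X = rng.random((10, 5))
W = rng.random((5, 2))     # stand-in weights for two latent dimensions
scores = X @ W
loadings = correlation_loadings(X, scores)   # shape (n_features, latent_dims)
```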
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
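The shape of the returned array can be illustrated with a small sketch that builds a (k, k, latent_dims) array from already-transformed views. The helper below is a hypothetical stand-in written with `np.corrcoef`, not the library's implementation, and the random "scores" are placeholders for transformed data.

```python
import numpy as np


def pairwise_correlations(scores):
    """scores: list of k arrays, each of shape (n_samples, latent_dims)."""
    k = len(scores)
    latent_dims = scores[0].shape[1]
    all_corrs = np.empty((k, k, latent_dims))
    for i in range(k):
        for j in range(k):
            for d in range(latent_dims):
                # correlation between view i and view j in latent dimension d
                all_corrs[i, j, d] = np.corrcoef(scores[i][:, d], scores[j][:, d])[0, 1]
    return all_corrs


rng = np.random.RandomState(0)
scores = [rng.random((10, 2)) for _ in range(3)]   # three views, two dimensions
all_corrs = pairwise_correlations(scores)
```

The array is symmetric in its first two axes and has ones on the diagonal, since each view correlates perfectly with itself.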
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
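Averaging over all pairs can be sketched from a (k, k, latent_dims) correlation array by dropping the diagonal (each view with itself) and dividing by the number of ordered off-diagonal pairs. The function name and the toy input are assumptions for illustration only.

```python
import numpy as np


def average_pairwise_correlation(all_corrs):
    """Mean off-diagonal correlation per latent dimension."""
    k = all_corrs.shape[0]
    total = all_corrs.sum(axis=(0, 1)) - k   # remove the k diagonal ones
    return total / (k * k - k)               # number of ordered pairs i != j


# Two views, one latent dimension, correlation 0.8 between the views.
all_corrs = np.ones((2, 2, 1))
all_corrs[0, 1, 0] = all_corrs[1, 0, 0] = 0.8
dim_corrs = average_pairwise_correlation(all_corrs)
```

With two views this reduces to the single between-view correlation in each dimension.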
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Miscellaneous
Nonparametric CCA
- class cca_zoo.models.NCCA(latent_dims=1, scale=True, centre=True, copy_data=True, accept_sparse=False, random_state=None, nearest_neighbors=None, gamma=None)[source]
A class used to fit a nonparametric CCA (NCCA) model.
- Citation
Michaeli, Tomer, Weiran Wang, and Karen Livescu. “Nonparametric canonical correlation analysis.” International conference on machine learning. PMLR, 2016.
- Example
>>> from cca_zoo.models import NCCA
>>> import numpy as np
>>> X1 = np.random.rand(10,5)
>>> X2 = np.random.rand(10,5)
>>> model = NCCA()
>>> model.fit((X1,X2)).score((X1,X2))
array([1.])
Constructor for NCCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
accept_sparse – Whether model can take sparse data as input
random_state (Union[int, RandomState, None]) – Pass for reproducible output across multiple function calls
nearest_neighbors – Number of nearest neighbors (l2 distance) to consider when constructing affinity
gamma (Optional[Iterable[float]]) – Bandwidth parameter for rbf kernel
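How `nearest_neighbors` and `gamma` could interact is sketched below: an RBF kernel with bandwidth `gamma` is evaluated on squared l2 distances, and each row keeps only its nearest neighbours. This is an assumed construction for illustration; `knn_rbf_affinity` is hypothetical and the library may build its affinity matrix differently.

```python
import numpy as np


def knn_rbf_affinity(X, nearest_neighbors, gamma):
    """RBF affinities, zeroed outside each sample's nearest neighbours."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    affinity = np.exp(-gamma * sq_dists)
    # indices of the closest samples per row (the sample itself is included)
    order = np.argsort(sq_dists, axis=1)[:, :nearest_neighbors]
    mask = np.zeros_like(affinity, dtype=bool)
    rows = np.arange(X.shape[0])[:, None]
    mask[rows, order] = True
    return np.where(mask, affinity, 0.0)


rng = np.random.RandomState(0)
X = rng.random((10, 5))
A = knn_rbf_affinity(X, nearest_neighbors=3, gamma=1.0)
```

A larger `gamma` makes affinities decay faster with distance, while a smaller `nearest_neighbors` makes the matrix sparser.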
- fit(views, y=None, **kwargs)[source]
Fits a given model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)[source]
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
Partial CCA
- class cca_zoo.models.PartialCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, c=None, eps=0.001)[source]
A class used to fit a partial CCA model. The key difference between this and vanilla CCA or MCCA is that the canonical score vectors must be orthogonal to the supplied confounding variables.
- Citation
Rao, B. Raja. “Partial canonical correlations.” Trabajos de estadistica y de investigación operativa 20.2-3 (1969): 211-219.
- Example
>>> from cca_zoo.models import PartialCCA
>>> import numpy as np
>>> X1 = np.random.rand(10,5)
>>> X2 = np.random.rand(10,5)
>>> partials = np.random.rand(10,3)
>>> model = PartialCCA()
>>> model.fit((X1,X2),partials=partials).score((X1,X2),partials=partials)
array([0.99993046])
Constructor for Partial CCA
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
c (Union[Iterable[float], float, None]) – Iterable of regularisation parameters for each view (between 0:CCA and 1:PLS)
eps – epsilon for stability
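One standard way to obtain scores that are orthogonal to confounding variables is to regress the confounds out of each view before fitting. The sketch below shows only that residualisation step with ordinary least squares; `residualise` is a hypothetical helper, and whether PartialCCA proceeds exactly this way is an assumption.

```python
import numpy as np


def residualise(X, partials):
    """Subtract the least-squares projection of X onto the confounds."""
    coef, *_ = np.linalg.lstsq(partials, X, rcond=None)
    return X - partials @ coef


rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
partials = rng.random((10, 3))     # confounding variables
X1_res = residualise(X1, partials)

w = rng.random(5)                  # any weight vector, fitted or not
score = X1_res @ w                 # resulting score vector
```

Because every column of `X1_res` is orthogonal to the columns of `partials`, any linear combination of those columns, including `score`, is orthogonal to them as well.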
- transform(views, partials=None, **kwargs)[source]
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
- fit(views, y=None, **kwargs)
Fits a regularised CCA (canonical ridge) model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
Sparse Weighted CCA
- class cca_zoo.models.SWCCA(latent_dims=1, scale=True, centre=True, copy_data=True, random_state=None, max_iter=500, initialization='uniform', tol=1e-09, regularisation='l0', c=None, sample_support=None, positive=False)[source]
A class used to fit a sparse weighted CCA (SWCCA) model
- Citation
- Example
>>> from cca_zoo.models import SWCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SWCCA(regularisation='l0',c=[2, 2], sample_support=5, random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.61620969])
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale (bool) – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
random_state – Pass for reproducible output across multiple function calls
max_iter (int) – the maximum number of iterations to perform in the inner optimization loop
initialization (str) – either string from “pls”, “cca”, “random”, “uniform” or callable to initialize the score variables for iterative methods
tol (float) – tolerance value used for early stopping
regularisation – the type of regularisation on the weights, either ‘l0’ or ‘l1’
c (Union[Iterable[Union[float, int]], float, int, None]) – regularisation parameter
sample_support – the l0 norm of the sample weights
positive – constrain weights to be positive
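An l0 constraint such as `sample_support` fixes how many entries of a vector may be nonzero. A minimal sketch of enforcing it is hard thresholding, which keeps the largest-magnitude entries and zeroes the rest. The helper name and the example weights are made up for illustration and are not taken from the SWCCA implementation.

```python
import numpy as np


def hard_threshold(v, support):
    """Keep the `support` largest-magnitude entries of v (support >= 1)."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-support:]   # indices of the largest entries
    out[keep] = v[keep]
    return out


# Illustrative sample weights (one per row of the data).
sample_weights = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.3])
sparse_weights = hard_threshold(sample_weights, support=3)
print(sparse_weights)
```

The result has exactly `support` nonzero entries, so its l0 norm equals the requested support.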
- fit(views, y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
- fit_transform(views, **kwargs)
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views
- score(views, y=None, **kwargs)
Returns average correlation in each dimension (averages over all pairs for multiview)
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
Base Class
- class cca_zoo.models._cca_base._CCA_Base(latent_dims=1, scale=True, centre=True, copy_data=True, accept_sparse=False, random_state=None)[source]
A class used as the base for methods in the package. Allows methods to inherit fit_transform, predict_corr, and gridsearch_fit when only fit (and transform where it is different to the default) is provided.
- weights
- Type
list of weights for each view
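The inheritance pattern described for the base class can be sketched with Python's `abc` module: a subclass supplies `fit`, and `transform` and `fit_transform` come for free. `SketchBase` and `FirstColumn` are hypothetical toy classes written for this illustration; they are not the `_CCA_Base` source code.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchBase(ABC):
    """Toy base class: subclasses only need to implement fit."""

    @abstractmethod
    def fit(self, views, y=None, **kwargs):
        ...

    def transform(self, views, **kwargs):
        # default behaviour: project each view onto its weights
        return [view @ w for view, w in zip(views, self.weights)]

    def fit_transform(self, views, **kwargs):
        return self.fit(views, **kwargs).transform(views, **kwargs)


class FirstColumn(SketchBase):
    """Toy model whose 'fit' selects the first feature of each view."""

    def fit(self, views, y=None, **kwargs):
        self.weights = [np.eye(v.shape[1])[:, :1] for v in views]
        return self


rng = np.random.RandomState(0)
X1, X2 = rng.random((10, 5)), rng.random((10, 4))
z1, z2 = FirstColumn().fit_transform((X1, X2))
```

Here `weights` holds one array per view, matching the attribute described above.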
Constructor for _CCA_Base
- Parameters
latent_dims (int) – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – If True, X will be copied; else, it may be overwritten
accept_sparse – Whether model can take sparse data as input
random_state (Union[int, RandomState, None]) – Pass for reproducible output across multiple function calls
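The effect of `centre` and `scale` on out-of-sample data can be sketched as follows: column means and standard deviations are estimated on the training view and then reused for new data. The use of `ddof=1` and the variable names are assumptions for illustration; the library's exact preprocessing may differ.

```python
import numpy as np

rng = np.random.RandomState(0)
X_train = rng.random((10, 5))
X_test = rng.random((4, 5))

# Statistics are estimated on the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0, ddof=1)

X_train_std = (X_train - mean) / std
# Out-of-sample data reuses the training mean and standard deviation.
X_test_std = (X_test - mean) / std
```

Reusing the training statistics keeps the fitted weights on the same scale when they are applied to unseen samples.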
- abstract fit(views, y=None, **kwargs)[source]
Fits a given model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
y – unused but needed to integrate with scikit-learn
- transform(views, **kwargs)[source]
Transforms data given a fit model
- Parameters
views (Iterable[ndarray]) – numpy arrays with the same number of rows (samples) separated by commas
kwargs – any additional keyword arguments required by the given model
- fit_transform(views, **kwargs)[source]
Fits and then transforms the training data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- get_loadings(views, normalize=False, **kwargs)[source]
Returns the model loadings for each view for the given data
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
normalize – scales loadings to ensure that they represent correlations between features and scores
- pairwise_correlations(views, **kwargs)[source]
Predicts the correlations between each view for each dimension for the given data using the fit model
- Parameters
views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- Returns
all_corrs: an array of the pairwise correlations (k,k,self.latent_dims) where k is the number of views