Models
- class cca_zoo.models.GCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, view_weights: Optional[Iterable[float]] = None, eps=1e-09)[source]
Bases:
rCCA
A class used to fit a GCCA model. For more than 2 views, GCCA optimizes the sum of correlations between each view's projection and a shared auxiliary vector T:
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ \sum_iw_i^TX_i^TT \}\\\end{split}\\\text{subject to:}\\T^TT=1\end{aligned}\end{align} \]
References
Tenenhaus, Arthur, and Michel Tenenhaus. “Regularized generalized canonical correlation analysis.” Psychometrika 76.2 (2011): 257.
Examples
>>> from cca_zoo.models import GCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = GCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.97229856])
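The objective above can be illustrated with a minimal NumPy sketch. It assumes one common way of solving this kind of problem: take the top eigenvectors of the summed projection matrices of the views as the shared vector T, then regress T on each view. The function name `gcca_sketch` and the use of `eps` as a small ridge term are assumptions for illustration and are not taken from the cca_zoo implementation.

```python
import numpy as np

def gcca_sketch(views, latent_dims=1, eps=1e-9):
    """Illustrative GCCA-style solver (hypothetical helper, not cca_zoo code)."""
    # Centre each view column-wise.
    views = [X - X.mean(axis=0) for X in views]
    n = views[0].shape[0]
    # Sum the ridge-stabilised projection matrices onto each view's column space.
    Q = np.zeros((n, n))
    for X in views:
        gram = X.T @ X + eps * np.eye(X.shape[1])
        Q += X @ np.linalg.solve(gram, X.T)
    Q = (Q + Q.T) / 2  # enforce exact symmetry before eigh
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(Q)
    T = eigvecs[:, ::-1][:, :latent_dims]  # orthonormal columns: T^T T = I
    # Regress T on each view to obtain per-view weights.
    weights = [
        np.linalg.solve(X.T @ X + eps * np.eye(X.shape[1]), X.T @ T)
        for X in views
    ]
    return T, weights

rng = np.random.RandomState(0)
views = [rng.random((10, 5)) for _ in range(3)]
T, weights = gcca_sketch(views)
```

Here `T` has one column per latent dimension and each entry of `weights` maps a view's features onto that shared vector.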
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
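As a rough illustration of what factor loadings can look like, the sketch below assumes that normalized loadings are the correlations between each original feature and each latent score, and that unnormalized loadings are the plain cross-products. Both the helper name `factor_loadings_sketch` and this definition are assumptions; the library's exact computation is not stated here.

```python
import numpy as np

def factor_loadings_sketch(views, transformed_views, normalize=True):
    """Hypothetical loadings computation for illustration only."""
    loadings = []
    for X, Z in zip(views, transformed_views):
        p = X.shape[1]
        if normalize:
            # Correlation between every original feature and every latent score.
            corr = np.corrcoef(X, Z, rowvar=False)
            loadings.append(corr[:p, p:])
        else:
            # Unnormalised cross-product between features and scores.
            loadings.append(X.T @ Z)
    return loadings

rng = np.random.RandomState(0)
views = [rng.random((10, 5)) for _ in range(2)]
# Hypothetical weights standing in for a fitted model's projection vectors.
weights = [rng.random((5, 1)) for _ in range(2)]
transformed = [X @ W for X, W in zip(views, weights)]
loadings = factor_loadings_sketch(views, transformed)
```

Each returned array has shape (n_features, latent_dims), one array per view.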
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
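The (n_views, n_views, latent_dims) layout can be sketched with plain NumPy. The helper names below are hypothetical, and the averaging rule (mean over off-diagonal view pairs, as one plausible reading of "average pairwise correlation") is an assumption rather than a statement of the library's behaviour.

```python
import numpy as np

def pairwise_correlations_sketch(transformed_views):
    """Correlation between every pair of views, per latent dimension."""
    n_views = len(transformed_views)
    latent_dims = transformed_views[0].shape[1]
    out = np.zeros((n_views, n_views, latent_dims))
    for i, zi in enumerate(transformed_views):
        for j, zj in enumerate(transformed_views):
            for k in range(latent_dims):
                out[i, j, k] = np.corrcoef(zi[:, k], zj[:, k])[0, 1]
    return out

def average_pairwise_correlation(pair_corrs):
    """Mean over off-diagonal pairs (assumed averaging rule)."""
    n_views = pair_corrs.shape[0]
    # Subtract the diagonal, where each view correlates perfectly with itself.
    return (pair_corrs.sum(axis=(0, 1)) - n_views) / (n_views**2 - n_views)

rng = np.random.RandomState(0)
transformed = [rng.random((20, 2)) for _ in range(3)]
pair_corrs = pairwise_correlations_sketch(transformed)
score = average_pairwise_correlation(pair_corrs)
```

The result is symmetric in its first two axes, and `score` holds one value per latent dimension.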
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
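The <component>__<parameter> convention can be illustrated with a toy class. `ToyEstimator` below is a hypothetical stand-in written for this example; it mimics the routing rule only and is not the scikit-learn or cca_zoo implementation.

```python
class ToyEstimator:
    """Toy object with scikit-learn-style nested parameter routing (hypothetical)."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # "<component>__<parameter>": delegate to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                if not hasattr(self, name):
                    raise ValueError(f"Invalid parameter {name!r}")
                setattr(self, name, value)
        return self

inner = ToyEstimator(latent_dims=1, c=0.1)
outer = ToyEstimator(model=inner, verbose=0)
# Update a top-level parameter and a nested one in a single call.
outer.set_params(verbose=1, model__latent_dims=2)
```

Returning `self` is what allows calls such as `estimator.set_params(...).fit(...)` to be chained.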
- class cca_zoo.models.KGCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001, kernel: Optional[Iterable[Union[float, callable]]] = None, gamma: Optional[Iterable[float]] = None, degree: Optional[Iterable[float]] = None, coef0: Optional[Iterable[float]] = None, kernel_params: Optional[Iterable[dict]] = None)[source]
Bases:
GCCA
A class used to fit a KGCCA model. For more than 2 views, KGCCA optimizes the sum of correlations between each view's kernel projection and a shared auxiliary vector T:
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ \sum_i\alpha_i^TK_i^TT \}\\\end{split}\\\text{subject to:}\\T^TT=1\end{aligned}\end{align} \]
References
Tenenhaus, Arthur, Cathy Philippe, and Vincent Frouin. “Kernel generalized canonical correlation analysis.” Computational Statistics & Data Analysis 90 (2015): 114-131.
Examples
>>> from cca_zoo.models import KGCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = KGCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.97019284])
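In the kernel formulation each view enters through a kernel matrix K_i rather than the raw data X_i. The sketch below shows how such a matrix might be built with an RBF kernel and then centred in feature space. The helper names, the choice of RBF, and the default `gamma = 1 / n_features` are assumptions for illustration; the class accepts other kernels through its `kernel` argument.

```python
import numpy as np

def rbf_kernel_sketch(X, Y=None, gamma=None):
    """RBF kernel matrix exp(-gamma * ||x - y||^2) (illustrative helper)."""
    Y = X if Y is None else Y
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default, a common convention
    sq_dists = (
        (X**2).sum(axis=1)[:, None]
        + (Y**2).sum(axis=1)[None, :]
        - 2 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def centre_kernel(K):
    """Double-centre a square kernel matrix (centring in feature space)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

rng = np.random.RandomState(0)
X = rng.random((10, 5))
K = rbf_kernel_sketch(X)
K_centred = centre_kernel(K)
```

The kernel matrix has one row and one column per sample, so its size grows with the number of samples rather than the number of features.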
- transform(views: ndarray, y=None, **kwargs)[source]
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.PLS_ALS(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, max_iter: int = 100, initialization: Union[str, callable] = 'random', tol: float = 1e-09, verbose=0)[source]
Bases:
_BaseIterative
A class used to fit a PLS model
Fits a partial least squares model with CCA deflation by the NIPALS algorithm
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2\}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\end{aligned}\end{align} \]
- Example
>>> from cca_zoo.models import PLS_ALS
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = PLS_ALS(random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796854])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
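A minimal sketch of a NIPALS-style alternating update for two views is given below, assuming the usual PLS reading in which unit-norm weights are updated in turn to increase the covariance between the two projections. The function name `pls_als_sketch` and the stopping rule are assumptions for illustration, not the library's inner loop.

```python
import numpy as np

def pls_als_sketch(X1, X2, max_iter=100, tol=1e-9, random_state=0):
    """Alternating unit-norm weight updates for two views (illustrative)."""
    rng = np.random.RandomState(random_state)
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    # Random unit-norm start for the second view's weights.
    w2 = rng.standard_normal(X2.shape[1])
    w2 /= np.linalg.norm(w2)
    w1 = np.zeros(X1.shape[1])
    for _ in range(max_iter):
        w1_old = w1
        # Update each weight vector against the other view's current score.
        w1 = X1.T @ (X2 @ w2)
        w1 /= np.linalg.norm(w1)
        w2 = X2.T @ (X1 @ w1)
        w2 /= np.linalg.norm(w2)
        if np.linalg.norm(w1 - w1_old) < tol:
            break
    return w1, w2

rng = np.random.RandomState(0)
X1, X2 = rng.random((10, 5)), rng.random((10, 5))
w1, w2 = pls_als_sketch(X1, X2, max_iter=1000)
# Compare against the leading singular value of the cross-product matrix.
C = (X1 - X1.mean(axis=0)).T @ (X2 - X2.mean(axis=0))
top_singular_value = np.linalg.svd(C, compute_uv=False)[0]
covariance = w1 @ C @ w2
```

With enough iterations the attained covariance approaches the leading singular value of the cross-product matrix, which is the quantity the alternating scheme is climbing towards.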
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
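The deflation step mentioned above removes the component just found before the next latent dimension is fitted. The sketch below assumes two textbook variants: "cca" deflation projects the data onto the orthogonal complement of the score vector, and "pls" deflation subtracts the rank-one term score times weight. The helper `deflate` and these exact formulas are assumptions, not a transcription of the library's code.

```python
import numpy as np

def deflate(residual, score, loading=None, deflation="cca"):
    """Remove one fitted component from a view (illustrative variants)."""
    if deflation == "cca":
        # Project onto the orthogonal complement of the score vector.
        return residual - np.outer(score, score) @ residual / (score @ score)
    if deflation == "pls":
        # Subtract the rank-one reconstruction score * loading^T.
        return residual - np.outer(score, loading)
    raise ValueError(f"Unknown deflation {deflation!r}")

rng = np.random.RandomState(0)
X = rng.random((10, 5))
X = X - X.mean(axis=0)
w = rng.random(5)
w /= np.linalg.norm(w)  # hypothetical unit-norm weight vector
score = X @ w
X_cca = deflate(X, score, deflation="cca")
X_pls = deflate(X, score, loading=w, deflation="pls")
```

After "cca" deflation the residual is orthogonal to the score; after "pls" deflation with a unit-norm weight, projecting the residual onto that weight gives zero.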
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.SCCA_PMD(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, positive: Optional[Union[Iterable[bool], bool]] = None, verbose=0)[source]
Bases:
_BaseIterative
Fits a Sparse CCA (Penalized Matrix Decomposition) model.
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\\\|w_i\|_1\leq c_i\end{aligned}\end{align} \]
References
Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. “A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis.” Biostatistics 10.3 (2009): 515-534.
Examples
>>> from cca_zoo.models import SCCA_PMD
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_PMD(c=[1,1],random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796873])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
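The norm constraints in the penalized matrix decomposition are commonly handled by soft-thresholding a vector and searching for the threshold at which the rescaled result meets the L1 bound. The sketch below assumes that approach; `soft_threshold` and `pmd_update` are hypothetical helpers, and the binary search is one possible way to find the threshold.

```python
import numpy as np

def soft_threshold(a, delta):
    """Shrink every entry of `a` towards zero by `delta`."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def pmd_update(a, c, n_steps=50):
    """Unit-L2 vector from soft-thresholded `a` with L1 norm at most `c`."""
    if c < 1:
        raise ValueError("c must be >= 1: a unit-L2 vector has L1 norm >= 1")
    w = a / np.linalg.norm(a)
    if np.abs(w).sum() <= c:
        return w  # constraint already satisfied, no shrinkage needed
    # Always-feasible fallback: keep only the largest-magnitude entry.
    best = np.zeros_like(a, dtype=float)
    k = np.argmax(np.abs(a))
    best[k] = np.sign(a[k])
    lo, hi = 0.0, np.abs(a).max()
    for _ in range(n_steps):
        mid = (lo + hi) / 2
        w = soft_threshold(a, mid)
        w = w / np.linalg.norm(w)
        if np.abs(w).sum() > c:
            lo = mid  # still too dense: shrink harder
        else:
            best, hi = w, mid  # feasible: try a smaller threshold
    return best

a = np.array([3.0, -2.0, 1.0, 0.5])
w_sparse = pmd_update(a, c=1.2)
w_dense = pmd_update(a, c=10.0)
```

A small bound `c` forces some entries to exactly zero, while a large bound leaves the vector as a plain rescaling of `a`.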
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.ElasticCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, c: Optional[Union[Iterable[float], float]] = None, l1_ratio: Optional[Union[Iterable[float], float]] = None, maxvar: bool = True, stochastic=False, positive: Optional[Union[Iterable[bool], bool]] = None, verbose=0)[source]
Bases:
_BaseIterative
Fits an elastic CCA by iterating elastic net regressions.
By default, ElasticCCA uses CCA with an auxiliary variable target, i.e. the MAXVAR configuration:
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}, t_{opt}=\underset{w,t}{\mathrm{argmin}}\{\sum_i \|X_iw_i-t\|^2 + c\|w_i\|^2_2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\t^Tt=n\end{aligned}\end{align} \]
But we can force it to attempt to use the SUMCOR form, which will approximate a solution to the problem:
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + c\|w_i\|^2_2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]
Examples
>>> from cca_zoo.models import ElasticCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = ElasticCCA(c=[1e-1,1e-1],l1_ratio=[0.5,0.5], random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.9316638])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
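The auxiliary-target formulation lends itself to block coordinate updates: with the target t fixed, each view's weights solve a penalized regression of t on that view; with the weights fixed, t is the rescaled sum of the projections. The sketch below covers only the pure ridge special case (no L1 term), where each regression has a closed form. The function name `maxvar_ridge_sketch` and the initialisation are assumptions; the elastic net case would need an iterative regression solver instead.

```python
import numpy as np

def maxvar_ridge_sketch(views, c=0.1, max_iter=100, tol=1e-9):
    """Alternate ridge regressions and target updates (ridge-only sketch)."""
    views = [X - X.mean(axis=0) for X in views]
    n = views[0].shape[0]

    def ridge_weights(t):
        # Closed-form ridge regression of the target on each view.
        return [
            np.linalg.solve(X.T @ X + c * np.eye(X.shape[1]), X.T @ t)
            for X in views
        ]

    # Assumed initialisation: first column of the first view, with t^T t = n.
    t = views[0][:, 0].copy()
    t *= np.sqrt(n) / np.linalg.norm(t)
    for _ in range(max_iter):
        weights = ridge_weights(t)
        # Best target for fixed weights: rescaled sum of the projections.
        t_new = sum(X @ w for X, w in zip(views, weights))
        t_new *= np.sqrt(n) / np.linalg.norm(t_new)
        converged = np.linalg.norm(t_new - t) < tol
        t = t_new
        if converged:
            break
    return ridge_weights(t), t

rng = np.random.RandomState(0)
n_samples = 10
views = [rng.random((n_samples, 5)) for _ in range(2)]
weights, t = maxvar_ridge_sketch(views)
```

The rescaling after each target update keeps the constraint t^T t = n satisfied throughout.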
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.SCCA_Parkhomenko(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, verbose=0)[source]
Bases:
_BaseIterative
Fits a sparse CCA (penalized CCA) model
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \} + c_i\|w_i\|\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\end{aligned}\end{align} \]
References
Parkhomenko, Elena, David Tritchler, and Joseph Beyene. “Sparse canonical correlation analysis with application to genomic data integration.” Statistical applications in genetics and molecular biology 8.1 (2009).
Examples
>>> from cca_zoo.models import SCCA_Parkhomenko
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_Parkhomenko(c=[0.001,0.001],random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81803527])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.SCCA_IPLS(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, maxvar: bool = False, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, stochastic=False, positive: Optional[Union[Iterable[bool], bool]] = None, verbose=0)[source]
Bases:
ElasticCCA
Fits a sparse CCA model by iterative rescaled lasso regression. Implemented by ElasticCCA with l1_ratio=1
For default maxvar=False, the optimisation is given by:
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]
- Citation
Mai, Qing, and Xin Zhang. “An iterative penalized least squares approach to sparse canonical correlation analysis.” Biometrics 75.3 (2019): 734-744.
For maxvar=True, the optimisation is given by the ElasticCCA problem with no l2 regularisation:
- Maths
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}, t_{opt}=\underset{w,t}{\mathrm{argmin}}\{\sum_i \|X_iw_i-t\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\t^Tt=n\end{aligned}\end{align} \]
- Citation
Fu, Xiao, et al. “Scalable and flexible multiview MAX-VAR canonical correlation analysis.” IEEE Transactions on Signal Processing 65.16 (2017): 4150-4165.
- Example
>>> from cca_zoo.models import SCCA_IPLS
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_IPLS(c=[0.001,0.001], random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.99998761])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.SCCA_ADMM(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, mu: Optional[Union[Iterable[float], float]] = None, lam: Optional[Union[Iterable[float], float]] = None, eta: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, verbose=0)[source]
Bases:
_BaseIterative
Fits a sparse CCA model by alternating ADMM
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=1\end{aligned}\end{align} \]
References
Suo, Xiaotong, et al. “Sparse canonical correlation analysis.” arXiv preprint arXiv:1705.10865 (2017).
Examples
>>> from cca_zoo.models import SCCA_ADMM
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_ADMM(random_state=0,c=[1e-1,1e-1])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.84348183])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.SCCA_Span(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, max_iter: int = 100, initialization: str = 'uniform', tol: float = 1e-09, regularisation='l0', c: Optional[Union[Iterable[Union[float, int]], float, int]] = None, rank=1, positive: Optional[Union[Iterable[bool], bool]] = None, random_state=None, deflation='cca', verbose=0)[source]
Bases: _BaseIterative
Fits a Sparse CCA model using SpanCCA.
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=1\end{aligned}\end{align} \]
References
Asteris, Megasthenis, et al. “A simple and provable algorithm for sparse diagonal CCA.” International Conference on Machine Learning. PMLR, 2016.
Examples
>>> from cca_zoo.models import SCCA_Span
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_Span(regularisation="l0", c=[2, 2])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.84556666])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable used to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
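With regularisation="l0" and an integer c, as in the example above, a natural reading is that each weight vector keeps at most c nonzero entries. That reading is an assumption, not something stated in this reference. Under it, the basic building block is hard thresholding, sketched here with a hypothetical helper:

```python
import numpy as np


def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w (k >= 1) and zero the rest."""
    out = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-k:]
    out[keep] = w[keep]
    return out


w = np.array([0.1, -0.9, 0.4, 0.05, -0.3])
w_sparse = hard_threshold(w, 2)
```

Only the two entries with the largest absolute value survive; signs are preserved.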
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.SWCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, max_iter: int = 500, initialization: str = 'random', tol: float = 1e-09, regularisation='l0', c: Optional[Union[Iterable[Union[float, int]], float, int]] = None, sample_support=None, positive=False, verbose=0)[source]
Bases: _BaseIterative
A class used to fit an SWCCA model.
References
Examples
>>> from cca_zoo.models import SWCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SWCCA(regularisation='l0', c=[2, 2], sample_support=5, random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.61620969])
Constructor for _BaseIterative
- Parameters
latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; otherwise they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable used to initialize the score variables for iterative methods
tol – tolerance value used for early stopping
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.MCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=1e-09)[source]
Bases: rCCA
A class used to fit an MCCA model. For more than 2 views, MCCA optimizes the sum of pairwise correlations.
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} w_i^TX_i^TX_jw_j \}\\\end{split}\\\text{subject to:}\\(1-c_i)w_i^TX_i^TX_iw_i+c_iw_i^Tw_i=1\end{aligned}\end{align} \]
References
Kettenring, Jon R. “Canonical analysis of several sets of variables.” Biometrika 58.3 (1971): 433-451.
Examples
>>> from cca_zoo.models import MCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = MCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.97200847])
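The MCCA objective above is often solved as a generalized eigenproblem A w = λ D w, where A holds the between-view blocks X_i^T X_j and D is block diagonal with blocks (1 - c_i) X_i^T X_i + c_i I. The NumPy sketch below assumes that formulation, in which the per-view constraints are relaxed to the single constraint w^T D w = 1. It is illustrative only, is not claimed to match the library's internals, and uses a hypothetical shared ridge value c.

```python
import numpy as np

rng = np.random.RandomState(0)
views = [rng.random((50, 4)) for _ in range(3)]
views = [v - v.mean(axis=0) for v in views]
c = 0.1  # hypothetical ridge parameter shared by all views

X = np.hstack(views)
A = X.T @ X  # all within-view and between-view cross-products
D = np.zeros_like(A)
start = 0
for v in views:
    p = v.shape[1]
    block = slice(start, start + p)
    A[block, block] = 0.0  # the objective only sums over pairs i != j
    D[block, block] = (1 - c) * v.T @ v + c * np.eye(p)
    start += p

# Whiten with D^{-1/2} to turn A w = lambda D w into a symmetric eigenproblem
evals, evecs = np.linalg.eigh(D)
D_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
lam, U = np.linalg.eigh(D_inv_sqrt @ A @ D_inv_sqrt)
w = D_inv_sqrt @ U[:, -1]  # stacked weights for the largest eigenvalue
weights = np.split(w, np.cumsum([v.shape[1] for v in views])[:-1])
```

`weights` holds one weight vector per view; further latent dimensions correspond to the next eigenvectors.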
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.KCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001, kernel: Optional[Iterable[Union[float, callable]]] = None, gamma: Optional[Iterable[float]] = None, degree: Optional[Iterable[float]] = None, coef0: Optional[Iterable[float]] = None, kernel_params: Optional[Iterable[dict]] = None)[source]
Bases: MCCA
A class used to fit a KCCA model.
\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\c_i\alpha_i^TK_i\alpha_i + (1-c_i)\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
Examples
>>> from cca_zoo.models import KCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = KCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.96893666])
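In the objective above, K_i is the kernel (Gram) matrix of view i. The sketch below builds an RBF Gram matrix with NumPy and evaluates the left-hand side of the constraint for a given α. Both helpers are hypothetical, and the default gamma = 1 / n_features is an assumption chosen for illustration; it is not necessarily the default that KCCA uses.

```python
import numpy as np


def rbf_kernel(X, Y=None, gamma=None):
    """Gram matrix with entries exp(-gamma * ||x - y||^2)."""
    Y = X if Y is None else Y
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default, for illustration only
    sq_dists = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def constraint_value(alpha, K, c):
    """Left-hand side of the KCCA constraint for one view."""
    return c * alpha @ K @ alpha + (1 - c) * alpha @ K.T @ K @ alpha


rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
K1 = rbf_kernel(X1)
alpha = rng.random(10)
# Rescale alpha so that the constraint holds with equality
alpha = alpha / np.sqrt(constraint_value(alpha, K1, c=0.5))
```

The Gram matrix is n_samples by n_samples, so the dual weights α_i have one entry per training sample rather than one per feature.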
- transform(views: ndarray, **kwargs)[source]
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.NCCA(latent_dims: int = 1, scale=True, centre=True, copy_data=True, accept_sparse=False, random_state: Optional[Union[int, RandomState]] = None, nearest_neighbors=None, gamma: Optional[Iterable[float]] = None)[source]
Bases: _BaseCCA
A class used to fit a nonparametric CCA (NCCA) model.
References
Michaeli, Tomer, Weiran Wang, and Karen Livescu. “Nonparametric canonical correlation analysis.” International conference on machine learning. PMLR, 2016.
Example
>>> from cca_zoo.models import NCCA
>>> import numpy as np
>>> X1 = np.random.rand(10,5)
>>> X2 = np.random.rand(10,5)
>>> model = NCCA()
>>> model.fit((X1,X2)).score((X1,X2))
array([1.])
- fit(views: Iterable[ndarray], y=None, **kwargs)[source]
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- transform(views: Iterable[ndarray], **kwargs)[source]
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.PartialCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001)[source]
Bases: MCCA
A class used to fit a partial CCA model. The key difference between this and a vanilla CCA or MCCA is that the canonical score vectors must be orthogonal to the supplied confounding variables.
References
Rao, B. Raja. “Partial canonical correlations.” Trabajos de estadistica y de investigación operativa 20.2-3 (1969): 211-219.
Example
>>> from cca_zoo.models import PartialCCA
>>> import numpy as np
>>> X1 = np.random.rand(10,5)
>>> X2 = np.random.rand(10,5)
>>> partials = np.random.rand(10,3)
>>> model = PartialCCA()
>>> model.fit((X1,X2),partials=partials).score((X1,X2))
array([0.99993046])
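One standard way to make canonical scores orthogonal to confounding variables is to regress the confounds out of each view before fitting. The NumPy sketch below shows that residualisation step; `residualise` is a hypothetical helper, and whether PartialCCA performs exactly this computation internally is an assumption.

```python
import numpy as np


def residualise(view, partials):
    """Remove the part of each column of `view` that `partials` explains linearly."""
    beta, *_ = np.linalg.lstsq(partials, view, rcond=None)
    return view - partials @ beta


rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
partials = rng.random((10, 3))
X1c = X1 - X1.mean(axis=0)
Pc = partials - partials.mean(axis=0)

R1 = residualise(X1c, Pc)
# Every column of R1, and hence every linear score R1 @ w, is orthogonal to the confounds
max_abs = np.abs(Pc.T @ R1).max()
```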
- fit(views: Iterable[ndarray], y=None, partials=None, **kwargs)[source]
Fits the model to the given data
- Parameters
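views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
partials (numpy array or array like with the same number of rows (samples) as the views, optional) – confounding variables to partial out
kwargs (any additional keyword arguments required by the given model) –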
- Returns
self
- Return type
- transform(views: Iterable[ndarray], partials=None, **kwargs)[source]
- Parameters
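views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
partials (numpy array or array like with the same number of rows (samples) as the views, optional) – confounding variables to partial out
kwargs (any additional keyword arguments required by the given model) –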
- Returns
transformed_views
- Return type
list of numpy arrays
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.rCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001, accept_sparse=None)[source]
Bases: _BaseCCA
A class used to fit a Regularised CCA (canonical ridge) model. Uses PCA to perform the optimization efficiently for high dimensional data.
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\(1-c_1)w_1^TX_1^TX_1w_1+c_1w_1^Tw_1=n\\(1-c_2)w_2^TX_2^TX_2w_2+c_2w_2^Tw_2=n\end{aligned}\end{align} \]
- Parameters
latent_dims (int, optional) – Number of latent dimensions to use, by default 1
scale (bool, optional) – Whether to scale the data, by default True
centre (bool, optional) – Whether to centre the data, by default True
copy_data (bool, optional) – Whether to copy the data, by default True
random_state (int, optional) – Random state for reproducibility, by default None
c (Union[Iterable[float], float], optional) – Regularisation parameter, by default None
eps (float, optional) – Tolerance for convergence, by default 1e-3
accept_sparse (Union[Iterable[str], str], optional) – Whether to accept sparse matrices, by default None
References
Vinod, Hrishikesh D. “Canonical ridge and econometrics of joint production.” Journal of econometrics 4.2 (1976): 147-166.
Example
>>> from cca_zoo.models import rCCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = rCCA(c=[0.1,0.1])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.95222128])
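For two views, the regularised objective above can be solved directly by whitening with R_i = (1 - c_i) X_i^T X_i + c_i I and taking an SVD. The sketch below does this in NumPy and rescales the weights so that the constraint equals n. It is a direct textbook-style solution with hypothetical helper names, not the PCA-based route the class description mentions.

```python
import numpy as np


def inv_sqrt(M):
    """Inverse square root of a symmetric positive definite matrix."""
    evals, evecs = np.linalg.eigh(M)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T


def ridge_cca_weights(X1, X2, c1, c2):
    n = X1.shape[0]
    R1 = (1 - c1) * X1.T @ X1 + c1 * np.eye(X1.shape[1])
    R2 = (1 - c2) * X2.T @ X2 + c2 * np.eye(X2.shape[1])
    R1_is, R2_is = inv_sqrt(R1), inv_sqrt(R2)
    # Leading singular vectors of the whitened cross-product matrix
    U, s, Vt = np.linalg.svd(R1_is @ X1.T @ X2 @ R2_is)
    w1 = np.sqrt(n) * R1_is @ U[:, 0]
    w2 = np.sqrt(n) * R2_is @ Vt[0]
    return w1, w2, R1, R2


rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
X2 = rng.random((10, 5))
X1 = X1 - X1.mean(axis=0)
X2 = X2 - X2.mean(axis=0)
w1, w2, R1, R2 = ridge_cca_weights(X1, X2, 0.1, 0.1)
constraint_1 = w1 @ R1 @ w1  # equals n by construction
constraint_2 = w2 @ R2 @ w2
```

Setting c1 = c2 = 0 recovers the plain CCA constraint, and c1 = c2 = 1 gives norm-constrained weights as in PLS (up to the scaling by n).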
- fit(views: Iterable[ndarray], y=None, **kwargs)[source]
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.CCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None)[source]
Bases: rCCA
A class used to fit a simple CCA model
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_1^TX_1^TX_1w_1=n\\w_2^TX_2^TX_2w_2=n\end{aligned}\end{align} \]
- Parameters
latent_dims (int, optional) – The number of latent dimensions to use, by default 1
scale (bool, optional) – Whether to scale the data, by default True
centre (bool, optional) – Whether to centre the data, by default True
copy_data (bool, optional) – Whether to copy the data, by default True
random_state (int, optional) – The random state to use, by default None
References
Hotelling, Harold. “Relations between two sets of variates.” Breakthroughs in statistics. Springer, New York, NY, 1992. 162-190.
Example
>>> from cca_zoo.models import CCA
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = CCA()
>>> model.fit((X1,X2)).score((X1,X2))
array([1.])
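The score of 1 in the example above follows from the data dimensions rather than from any real association: after centring, each 10 x 5 view spans a 5-dimensional subspace of a 9-dimensional space, so the two subspaces must intersect. The NumPy sketch below illustrates this using the classical QR-plus-SVD formulation of canonical correlations; it is not the library's code, and `canonical_correlations` is a hypothetical helper that assumes full column rank.

```python
import numpy as np


def canonical_correlations(X1, X2):
    """Canonical correlations as singular values of Q1^T Q2."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    Q1, _ = np.linalg.qr(X1)  # orthonormal basis of the column space of X1
    Q2, _ = np.linalg.qr(X2)
    return np.linalg.svd(Q1.T @ Q2, compute_uv=False)


rng = np.random.RandomState(0)
small = canonical_correlations(rng.random((10, 5)), rng.random((10, 5)))
large = canonical_correlations(rng.random((100, 5)), rng.random((100, 5)))
```

With 10 samples the leading correlation is 1 up to rounding; with 100 samples of independent noise it drops well below 1. This is one reason to evaluate on held-out data or to use a regularised model when samples are scarce.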
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.PLS(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None)[source]
Bases: rCCA
A class used to fit a simple PLS model
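Implements PLS by inheriting from regularised CCA with maximal regularisation (c_i = 1), so that the rCCA constraint reduces to a constraint on the norm of each weight vector.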
\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_1^Tw_1=1\\w_2^Tw_2=1\end{aligned}\end{align} \]
Example
>>> from cca_zoo.models import PLS
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = PLS()
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796873])
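Under the unit-norm constraints above, the first pair of PLS weights is given by the leading singular vectors of the cross-product matrix X_1^T X_2. A minimal NumPy sketch of that fact, shown for illustration and not taken from the library's implementation:

```python
import numpy as np

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
X2 = rng.random((10, 5))
X1 = X1 - X1.mean(axis=0)
X2 = X2 - X2.mean(axis=0)

# Leading singular vectors maximise w1^T X1^T X2 w2 over unit-norm w1, w2
U, s, Vt = np.linalg.svd(X1.T @ X2)
w1, w2 = U[:, 0], Vt[0]
covariance = w1 @ X1.T @ X2 @ w2  # attains the largest singular value s[0]
```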
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.TCCA(latent_dims: int = 1, scale=True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None)[source]
Bases: _BaseCCA
Fits a Tensor CCA model. Tensor CCA maximises higher order correlations between the views.
\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
References
Kim, Tae-Kyun, Shu-Fai Wong, and Roberto Cipolla. “Tensor canonical correlation analysis for action classification.” 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2007. https://github.com/rciszek/mdr_tcca
Examples
>>> from cca_zoo.models import TCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = TCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([1.14595755])
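To give a feel for what a higher order correlation between three views involves, the following NumPy sketch builds a third-order covariance tensor and contracts it with one weight vector per view. The view sizes, the random weights and the variable names are assumptions made for illustration; the sketch does not reproduce the library's fitting procedure or its normalisation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Three views with the same number of samples but different feature counts.
views = [rng.normal(size=(10, d)) for d in (5, 4, 3)]
centred = [v - v.mean(axis=0) for v in views]
n_samples = centred[0].shape[0]

# Third-order covariance tensor: entry [a, b, c] averages the product of
# feature a of view 1, feature b of view 2 and feature c of view 3.
cov_tensor = np.einsum("na,nb,nc->abc", *centred) / n_samples

# Hypothetical weight vectors, one per view.
weights = [rng.normal(size=v.shape[1]) for v in views]

# Contracting the tensor with the weights ...
projected = np.einsum("abc,a,b,c->", cov_tensor, *weights)

# ... equals the mean triple product of the projected views.
direct = np.mean(
    (centred[0] @ weights[0]) * (centred[1] @ weights[1]) * (centred[2] @ weights[2])
)
```

The two quantities `projected` and `direct` agree up to floating point error, which is why a tensor method can search over the weights using the tensor alone.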
- fit(views: Iterable[ndarray], y=None, **kwargs)[source]
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- correlations(views: Iterable[ndarray], **kwargs)[source]
Predicts the correlation for the given data using the fit model
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- score(views: Iterable[ndarray], **kwargs)[source]
Returns the higher order correlations in each dimension
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.KTCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, eps=0.001, c: Optional[Union[Iterable[float], float]] = None, kernel: Optional[Iterable[Union[float, callable]]] = None, gamma: Optional[Iterable[float]] = None, degree: Optional[Iterable[float]] = None, coef0: Optional[Iterable[float]] = None, kernel_params: Optional[Iterable[dict]] = None)[source]
Bases:
TCCA
Fits a Kernel Tensor CCA model. Kernel Tensor CCA maximises higher order correlations between the views, with each view represented by its kernel matrix K_i.
\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
References
Kim, Tae-Kyun, Shu-Fai Wong, and Roberto Cipolla. “Tensor canonical correlation analysis for action classification.” 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2007
Examples
>>> from cca_zoo.models import KTCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = KTCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([1.69896269])
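The kernel matrices K_i in the objective can be pictured with a minimal NumPy sketch of one common kernel choice, the RBF kernel. The helper `rbf_kernel` and its default of gamma = 1 / n_features are assumptions made for this example; they are not taken from the library, which exposes `kernel`, `gamma`, `degree` and `coef0` as per-view settings.

```python
import numpy as np


def rbf_kernel(X, Y=None, gamma=None):
    """RBF kernel matrix with entries exp(-gamma * ||x - y||^2)."""
    Y = X if Y is None else Y
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default for this sketch
    # Squared Euclidean distances via the expansion ||x||^2 + ||y||^2 - 2 x.y
    sq_dists = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2 * X @ Y.T
    # Clip tiny negative values caused by floating point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


rng = np.random.default_rng(0)
views = [rng.random((10, 5)) for _ in range(3)]
kernels = [rbf_kernel(v) for v in views]  # one (10, 10) matrix per view
```

Each kernel matrix is square in the number of samples, so the coefficient vectors in the objective have one entry per training sample rather than one per feature.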
- transform(views: ndarray, **kwargs)[source]
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- correlations(views: Iterable[ndarray], **kwargs)
Predicts the correlation for the given data using the fit model
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- fit(views: Iterable[ndarray], y=None, **kwargs)
Fits the model to the given data
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
self
- Return type
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], **kwargs)
Returns the higher order correlations in each dimension
- Parameters
views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.PRCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, eps=0.001, c=0)[source]
Bases:
MCCA
Partially Regularized Canonical Correlation Analysis
References
Tuzhilina, Elena, Leonardo Tozzi, and Trevor Hastie. “Canonical correlation analysis in high dimensions with structured regularization.” Statistical Modelling (2021): 1471082X211041033.
- fit(views: Iterable[ndarray], y=None, idxs=None, **kwargs)[source]
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
idxs (list/tuple of integers indicating which features from each view are the partially regularised features) –
kwargs (any additional keyword arguments required by the given model) –
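One way to picture partial regularisation is a ridge penalty that is added only to the features listed in idxs, leaving the remaining features unpenalised. The sketch below does this for the covariance matrix of a single view. The helper name, the interpretation of `idxs` as the penalised features, and the use of `c` as the ridge strength are assumptions made for illustration rather than a description of the library's internals.

```python
import numpy as np


def partially_regularised_cov(X, idxs, c):
    """Sample covariance of X with a ridge term on the features in idxs only."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    penalty = np.zeros(X.shape[1])
    penalty[list(idxs)] = c  # penalise only the selected features
    return cov + np.diag(penalty)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
idxs = [0, 1, 2]  # hypothetical: the first three features are regularised
plain = partially_regularised_cov(X, idxs, c=0.0)
regularised = partially_regularised_cov(X, idxs, c=0.5)
```

The difference `regularised - plain` is a diagonal matrix with 0.5 on the first three diagonal entries and zero elsewhere, so the unlisted features are left unpenalised.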
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
- class cca_zoo.models.GRCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, eps=0.001, c: float = 0, mu: float = 0)[source]
Bases:
MCCA
Grouped Regularized Canonical Correlation Analysis
References
Tuzhilina, Elena, Leonardo Tozzi, and Trevor Hastie. “Canonical correlation analysis in high dimensions with structured regularization.” Statistical Modelling (2021): 1471082X211041033.
- fit(views: Iterable[ndarray], y=None, feature_groups=None, **kwargs)[source]
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
feature_groups (list/tuple of integer numpy arrays or array likes with dimensions (,view shape)) –
kwargs (any additional keyword arguments required by the given model) –
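The sketch below shows one plausible layout for feature_groups, an integer group label per feature of a view, and a grouped penalty of the kind described by Tuzhilina et al.: within-group variation of the weights plus a term on the group means. The helper `grouped_penalty`, the size-weighted form of the group-mean term, and the mapping of `c` to the within-group term and `mu` to the group-mean term are all assumptions made for illustration.

```python
import numpy as np


def grouped_penalty(weights, feature_groups, c, mu):
    """Within-group variation (scaled by c) plus size-weighted group means (scaled by mu)."""
    weights = np.asarray(weights, dtype=float)
    feature_groups = np.asarray(feature_groups)
    counts = np.bincount(feature_groups)                 # features per group
    sums = np.bincount(feature_groups, weights=weights)  # weight total per group
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    within = ((weights - means[feature_groups]) ** 2).sum()
    between = (counts * means**2).sum()
    return c * within + mu * between


# Five features: the first two belong to group 0, the last three to group 1.
feature_groups = np.array([0, 0, 1, 1, 1])
weights = np.array([1.0, 3.0, 2.0, 2.0, 2.0])
penalty = grouped_penalty(weights, feature_groups, c=1.0, mu=0.5)
```

In this toy case both group means equal 2, the within-group term is 2 (from group 0 only) and the group-mean term is 20, so the penalty is 1.0 * 2 + 0.5 * 20 = 12.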
- fit_transform(views: Iterable[ndarray], **kwargs)
Fits the model to the given data and returns the transformed views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
transformed_views
- Return type
list of numpy arrays
- get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)
Returns the factor loadings for each view
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –
- Returns
factor_loadings
- Return type
list of numpy arrays
- get_params(deep=True)
Get parameters for this estimator.
- pairwise_correlations(views: Iterable[ndarray], **kwargs)
Returns the pairwise correlations between the views in each dimension
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
pairwise_correlations
- Return type
numpy array of shape (n_views, n_views, latent_dims)
- score(views: Iterable[ndarray], y=None, **kwargs)
Returns the average pairwise correlation between the views
- Parameters
views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –
- Returns
score
- Return type
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance