Models

class cca_zoo.models.GCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, view_weights: Optional[Iterable[float]] = None, eps=1e-09)[source]

Bases: rCCA

Fits a generalized CCA (GCCA) model. For more than two views, GCCA optimizes the sum of correlations with a shared auxiliary vector T:

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ \sum_iw_i^TX_i^TT \}\\\end{split}\\\text{subject to:}\\T^TT=1\end{aligned}\end{align} \]

References

Tenenhaus, Arthur, and Michel Tenenhaus. “Regularized generalized canonical correlation analysis.” Psychometrika 76.2 (2011): 257.

Examples

>>> from cca_zoo.models import GCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = GCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.97229856])
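
The objective above can be illustrated with a short NumPy sketch. It is not cca_zoo's implementation; it assumes one latent dimension, centred views, and a small ridge term (the hypothetical `eps` argument) so that the Gram matrices are invertible. Under those assumptions the auxiliary vector T is the leading eigenvector of the sum of the per-view projection matrices, and each weight vector is obtained by regressing T on its view.

```python
import numpy as np

def gcca_maxvar_sketch(views, eps=1e-9):
    """Illustrative one-dimensional GCCA (assumed formulation, not library code)."""
    centred = [X - X.mean(axis=0) for X in views]
    n = centred[0].shape[0]
    # Sum of ridge-stabilised projection matrices onto each view's column space.
    M = np.zeros((n, n))
    for X in centred:
        gram = X.T @ X + eps * np.eye(X.shape[1])
        M += X @ np.linalg.solve(gram, X.T)
    # eigh returns eigenvalues in ascending order, so the last column is the
    # leading eigenvector; it has unit norm, i.e. T^T T = 1.
    _, eigvecs = np.linalg.eigh(M)
    T = eigvecs[:, -1]
    # Each view's weights come from regressing T on that view.
    weights = [
        np.linalg.solve(X.T @ X + eps * np.eye(X.shape[1]), X.T @ T)
        for X in centred
    ]
    return T, weights

rng = np.random.RandomState(0)
views = [rng.random((10, 5)) for _ in range(3)]
T, weights = gcca_maxvar_sketch(views)
```

The sketch returns one weight vector per view, each with as many entries as that view has columns.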

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays
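
As a rough illustration of what a factor loading is, the sketch below relates each column of a view to the projected scores. How the library computes loadings is not stated here, so this is an assumption: with `normalize=True` the loading is taken to be the correlation between a feature and a score, otherwise their inner product after centring. The function name `factor_loadings_sketch` is hypothetical.

```python
import numpy as np

def factor_loadings_sketch(view, scores, normalize=True):
    """Relate each feature of a view to each latent score (assumed definition)."""
    Xc = view - view.mean(axis=0)
    Zc = scores - scores.mean(axis=0)
    if normalize:
        # Unit-norm centred columns turn inner products into correlations.
        Xc = Xc / np.linalg.norm(Xc, axis=0)
        Zc = Zc / np.linalg.norm(Zc, axis=0)
    return Xc.T @ Zc

rng = np.random.RandomState(0)
X = rng.random((10, 5))
w = rng.random((5, 1))          # stand-in for fitted weights
loadings = factor_loadings_sketch(X, X @ w)
print(loadings.shape)           # (5, 1): one loading per feature and latent dimension
```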

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)
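
The return shape can be made concrete with a small NumPy sketch that builds such an array from already transformed views. It is only an illustration of the (n_views, n_views, latent_dims) layout; the helper name is hypothetical and the library may compute the values differently.

```python
import numpy as np

def pairwise_correlations_sketch(transformed_views):
    """Correlation between every pair of views, separately for each latent dimension."""
    n_views = len(transformed_views)
    latent_dims = transformed_views[0].shape[1]
    corrs = np.empty((n_views, n_views, latent_dims))
    for i, zi in enumerate(transformed_views):
        for j, zj in enumerate(transformed_views):
            for k in range(latent_dims):
                corrs[i, j, k] = np.corrcoef(zi[:, k], zj[:, k])[0, 1]
    return corrs

rng = np.random.RandomState(0)
scores = [rng.random((10, 2)) for _ in range(3)]  # stand-ins for transformed views
corrs = pairwise_correlations_sketch(scores)
print(corrs.shape)  # (3, 3, 2)
```

Entry `corrs[i, j, k]` is the correlation between views i and j in latent dimension k, so the diagonal over the first two axes is all ones.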

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float
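
One plausible reading of "average pairwise correlation" is the mean of the off-diagonal entries of the pairwise correlation array, taken separately for each latent dimension. The sketch below assumes exactly that; whether the library excludes the diagonal in this way is an assumption, and the function name is hypothetical.

```python
import numpy as np

def average_pairwise_correlation(pair_corrs):
    """Mean off-diagonal correlation per latent dimension (assumed definition)."""
    n_views = pair_corrs.shape[0]
    total = pair_corrs.sum(axis=(0, 1))
    diagonal = np.trace(pair_corrs, axis1=0, axis2=1)  # self-correlations
    return (total - diagonal) / (n_views * (n_views - 1))

# Two views, one latent dimension, correlation 0.8 between them.
pair_corrs = np.array([[[1.0], [0.8]],
                       [[0.8], [1.0]]])
avg = average_pairwise_correlation(pair_corrs)
```

Under this assumption the result has one entry per latent dimension.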

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance
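
The `<component>__<parameter>` convention can be illustrated without any estimator at all. The pure-Python sketch below only shows how such keys split into an estimator's own parameters and those routed to a nested component; the helper `split_nested_params` and the component name `model` are hypothetical and not part of any library.

```python
def split_nested_params(params):
    """Separate plain parameters from '<component>__<parameter>' ones."""
    own, nested = {}, {}
    for key, value in params.items():
        name, sep, sub_key = key.partition("__")
        if sep:
            # Everything after the first double underscore goes to the component.
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested

own, nested = split_nested_params({"latent_dims": 2, "model__c": 0.1})
print(own)     # {'latent_dims': 2}
print(nested)  # {'model': {'c': 0.1}}
```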

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.KGCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001, kernel: Optional[Iterable[Union[float, callable]]] = None, gamma: Optional[Iterable[float]] = None, degree: Optional[Iterable[float]] = None, coef0: Optional[Iterable[float]] = None, kernel_params: Optional[Iterable[dict]] = None)[source]

Bases: GCCA

Fits a kernel GCCA (KGCCA) model. For more than two views, KGCCA optimizes the sum of correlations with a shared auxiliary vector T:

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ \sum_i\alpha_i^TK_i^TT \}\\\end{split}\\\text{subject to:}\\T^TT=1\end{aligned}\end{align} \]

References

Tenenhaus, Arthur, Cathy Philippe, and Vincent Frouin. “Kernel generalized canonical correlation analysis.” Computational Statistics & Data Analysis 90 (2015): 114-131.

Examples

>>> from cca_zoo.models import KGCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = KGCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.97019284])
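
In the kernel objective each view X_i is replaced by a kernel matrix K_i. The sketch below shows how such a matrix could be built for one view with an RBF kernel and then centred. The choice of RBF and the helper names are assumptions for illustration; which kernels the `kernel` argument accepts is not stated in this section.

```python
import numpy as np

def rbf_gram(X, gamma):
    """Gram matrix K[a, b] = exp(-gamma * ||x_a - x_b||^2)."""
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def centre_gram(K):
    """Centre a Gram matrix, the kernel analogue of demeaning the features."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
K1 = rbf_gram(X1, gamma=0.5)
K1_centred = centre_gram(K1)
```

The kernel matrix has one row and one column per sample, so the dual weights alpha_i in the objective have one entry per sample rather than one per feature.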

transform(views: ndarray, y=None, **kwargs)[source]

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

class cca_zoo.models.PLS_ALS(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, max_iter: int = 100, initialization: Union[str, callable] = 'random', tol: float = 1e-09, verbose=0)[source]

Bases: _BaseIterative

A class used to fit a PLS model

Fits a partial least squares model with CCA deflation using the NIPALS algorithm

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} w_i^TX_i^TX_jw_j\}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\end{aligned}\end{align} \]

Example

>>> from cca_zoo.models import PLS_ALS
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = PLS_ALS(random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796854])
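
For two views, a NIPALS-style fit can be sketched as an alternating power iteration that seeks unit-norm weights with maximal covariance between the projections. This is a minimal illustration for a single latent dimension, not the library's code; the function name is hypothetical, and `max_iter` and `tol` merely mirror the constructor arguments of the same name.

```python
import numpy as np

def pls_nipals_sketch(X1, X2, max_iter=100, tol=1e-9, random_state=0):
    """Alternating updates for one pair of unit-norm PLS weight vectors."""
    rng = np.random.RandomState(random_state)
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    w2 = rng.standard_normal(X2.shape[1])   # random initialisation
    w2 /= np.linalg.norm(w2)
    w1 = np.zeros(X1.shape[1])
    for _ in range(max_iter):
        # Update each weight vector from the other view's current scores.
        w1_new = X1.T @ (X2 @ w2)
        w1_new /= np.linalg.norm(w1_new)
        w2 = X2.T @ (X1 @ w1_new)
        w2 /= np.linalg.norm(w2)
        converged = np.linalg.norm(w1_new - w1) < tol
        w1 = w1_new
        if converged:                        # early stopping
            break
    return w1, w2

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
X2 = rng.random((10, 5))
w1, w2 = pls_nipals_sketch(X1, X2)
```

The covariance reached by the iteration is bounded above by the largest singular value of the centred cross-product matrix, which the iteration approaches as it converges.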

Constructor for _BaseIterative

Parameters

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
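
Deflation removes the component that has just been found before the next one is fitted. The sketch below shows one common variant, projection deflation, in which the part of a view explained by the current score vector is projected out. Exactly what the library does for its CCA and PLS deflation options is not specified in this section, so treat the function, which is hypothetical, as an assumption.

```python
import numpy as np

def deflate_by_projection(residual, score):
    """Remove from every column of residual its component along score."""
    return residual - np.outer(score, score) @ residual / (score @ score)

rng = np.random.RandomState(0)
X = rng.random((10, 5))
X = X - X.mean(axis=0)
w = rng.random(5)                 # stand-in for weights from the inner loop
score = X @ w
X_deflated = deflate_by_projection(X, score)
```

After this step the deflated view is orthogonal to the score vector, so a later component cannot reuse the same direction in sample space.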

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.SCCA_PMD(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, positive: Optional[Union[Iterable[bool], bool]] = None, verbose=0)[source]

Bases: _BaseIterative

Fits a Sparse CCA (Penalized Matrix Decomposition) model.

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\\\|w_i\|_1\leq c_i\end{aligned}\end{align} \]

References

Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. “A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis.” Biostatistics 10.3 (2009): 515-534.

Examples

>>> from cca_zoo.models import SCCA_PMD
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_PMD(c=[1,1],random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796873])
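
The two constraints in the objective, unit l2 norm and a bound on the sparsity-inducing norm, are typically handled by soft thresholding followed by renormalisation. The sketch below assumes the bound is on the l1 norm and that c is at least 1, and searches for the threshold by bisection. It illustrates the idea only; the function names are hypothetical and this is not the library's routine.

```python
import numpy as np

def soft_threshold(a, delta):
    """Shrink every entry of a towards zero by delta."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def l1_bounded_unit_vector(a, c, n_steps=50):
    """Unit-l2-norm vector derived from a with l1 norm at most c (assumes c >= 1)."""
    w = a / np.linalg.norm(a)
    if np.abs(w).sum() <= c:
        return w                      # constraint inactive, nothing to threshold
    # Fallback that always satisfies the bound: keep only the largest entry.
    best = np.zeros_like(a, dtype=float)
    k = np.argmax(np.abs(a))
    best[k] = np.sign(a[k])
    lo, hi = 0.0, np.abs(a).max()
    for _ in range(n_steps):          # bisection on the threshold
        delta = (lo + hi) / 2.0
        s = soft_threshold(a, delta)
        norm = np.linalg.norm(s)
        if norm == 0.0:
            hi = delta
            continue
        w = s / norm
        if np.abs(w).sum() > c:
            lo = delta                # still too dense, threshold harder
        else:
            hi, best = delta, w       # feasible, try a smaller threshold
    return best

a = np.array([3.0, -1.0, 0.5])
w = l1_bounded_unit_vector(a, c=1.2)
```

A smaller c forces more entries of the weight vector to exactly zero, which is the source of sparsity in this family of methods.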

Constructor for _BaseIterative

Parameters

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.ElasticCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, c: Optional[Union[Iterable[float], float]] = None, l1_ratio: Optional[Union[Iterable[float], float]] = None, maxvar: bool = True, stochastic=False, positive: Optional[Union[Iterable[bool], bool]] = None, verbose=0)[source]

Bases: _BaseIterative

Fits an elastic CCA by iterating elastic net regressions.

By default, ElasticCCA uses CCA with an auxiliary variable target, i.e. the MAXVAR configuration:

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}, t_{opt}=\underset{w,t}{\mathrm{argmin}}\{\sum_i \|X_iw_i-t\|^2 + c\|w_i\|^2_2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\t^Tt=n\end{aligned}\end{align} \]

But we can force it to use the SUMCOR form, which approximates a solution to the problem:

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + c\|w_i\|^2_2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]

Examples

>>> from cca_zoo.models import ElasticCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = ElasticCCA(c=[1e-1,1e-1],l1_ratio=[0.5,0.5], random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.9316638])
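
The idea of iterating regressions towards a shared target t can be sketched in NumPy for the special case without the l1 term, where each regression is a ridge regression with a closed-form solution. This is a simplification for illustration, not the library's elastic net solver; the function name and the fixed iteration count are assumptions.

```python
import numpy as np

def maxvar_ridge_sketch(views, c=0.1, n_iter=50, random_state=0):
    """Alternate ridge regressions onto a shared target t with t^T t = n."""
    centred = [X - X.mean(axis=0) for X in views]
    n = centred[0].shape[0]
    rng = np.random.RandomState(random_state)
    t = rng.standard_normal(n)
    t -= t.mean()
    t *= np.sqrt(n) / np.linalg.norm(t)
    for _ in range(n_iter):
        # Regress the current target on each view (l2 penalty c, no l1 penalty).
        weights = [
            np.linalg.solve(X.T @ X + c * np.eye(X.shape[1]), X.T @ t)
            for X in centred
        ]
        # The best target for fixed weights is proportional to the summed
        # projections; rescale it so that t^T t = n.
        t = sum(X @ w for X, w in zip(centred, weights))
        t *= np.sqrt(n) / np.linalg.norm(t)
    return weights, t

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
X2 = rng.random((10, 5))
weights, t = maxvar_ridge_sketch([X1, X2])
```

Replacing the ridge step with an elastic net regression would add the l1 penalty and give sparse weights.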

Constructor for _BaseIterative

Parameters

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.SCCA_Parkhomenko(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, verbose=0)[source]

Bases: _BaseIterative

Fits a sparse CCA (penalized CCA) model

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \} + c_i\|w_i\|\\\end{split}\\\text{subject to:}\\w_i^Tw_i=1\end{aligned}\end{align} \]

References

Parkhomenko, Elena, David Tritchler, and Joseph Beyene. “Sparse canonical correlation analysis with application to genomic data integration.” Statistical applications in genetics and molecular biology 8.1 (2009).

Examples

>>> from cca_zoo.models import SCCA_Parkhomenko
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_Parkhomenko(c=[0.001,0.001],random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81803527])

Constructor for _BaseIterative

Parameters

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.SCCA_IPLS(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, maxvar: bool = False, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, stochastic=False, positive: Optional[Union[Iterable[bool], bool]] = None, verbose=0)[source]

Bases: ElasticCCA

Fits a sparse CCA model by iterative rescaled lasso regression. Implemented by ElasticCCA with l1_ratio=1

For default maxvar=False, the optimisation is given by:

Maths

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=n\end{aligned}\end{align} \]

Citation

Mai, Qing, and Xin Zhang. “An iterative penalized least squares approach to sparse canonical correlation analysis.” Biometrics 75.3 (2019): 734-744.

For maxvar=True, the optimisation is given by the ElasticCCA problem with no l2 regularisation:

Maths

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}, t_{opt}=\underset{w,t}{\mathrm{argmin}}\{\sum_i \|X_iw_i-t\|^2 + \text{l1_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\t^Tt=n\end{aligned}\end{align} \]

Citation

Fu, Xiao, et al. “Scalable and flexible multiview MAX-VAR canonical correlation analysis.” IEEE Transactions on Signal Processing 65.16 (2017): 4150-4165.

Example

>>> from cca_zoo.models import SCCA_IPLS
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_IPLS(c=[0.001,0.001], random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.99998761])

Constructor for _BaseIterative

Parameters

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.SCCA_ADMM(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, deflation='cca', c: Optional[Union[Iterable[float], float]] = None, mu: Optional[Union[Iterable[float], float]] = None, lam: Optional[Union[Iterable[float], float]] = None, eta: Optional[Union[Iterable[float], float]] = None, max_iter: int = 100, initialization: Union[str, callable] = 'pls', tol: float = 1e-09, verbose=0)[source]

Bases: _BaseIterative

Fits a sparse CCA model by alternating ADMM

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=1\end{aligned}\end{align} \]

References

Suo, Xiaotong, et al. “Sparse canonical correlation analysis.” arXiv preprint arXiv:1705.10865 (2017).
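
The L1 term in the objective is what drives weights to exactly zero. ADMM-style solvers typically handle such a term through its proximal operator, soft thresholding. The sketch below shows only that operator; the threshold value is illustrative, and how the library derives it from c, mu, lam and eta is not covered here.

```python
import numpy as np

def soft_threshold(w, threshold):
    """Proximal operator of threshold * ||w||_1: shrink towards zero."""
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)

w = np.array([0.9, -0.05, 0.3, -0.6])
sparse_w = soft_threshold(w, 0.1)
# Entries smaller than the threshold in magnitude become zero;
# the others move towards zero by the threshold.
```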

Examples

>>> from cca_zoo.models import SCCA_ADMM
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_ADMM(random_state=0,c=[1e-1,1e-1])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.84348183])

Constructor parameters inherited from _BaseIterative:

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
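
Deflation removes the component just found so that the next run of the inner loop finds a new one. The sketch below shows one common form, projection deflation, in which the score vector is projected out of the view. Whether this corresponds to the library's CCA or PLS option is an assumption not confirmed by this page, and deflate is a hypothetical helper.

```python
import numpy as np

def deflate(X, w):
    """Remove from X everything explained by the score vector z = X w."""
    z = X @ w
    # Subtract the projection of each column of X onto z.
    return X - np.outer(z, z @ X) / (z @ z)

rng = np.random.RandomState(0)
X = rng.random((10, 5))
w = rng.random(5)
X_deflated = deflate(X, w)
# The old score vector is now orthogonal to every column of the view.
residual = (X @ w) @ X_deflated
```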

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.SCCA_Span(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, max_iter: int = 100, initialization: str = 'uniform', tol: float = 1e-09, regularisation='l0', c: Optional[Union[Iterable[Union[float, int]], float, int]] = None, rank=1, positive: Optional[Union[Iterable[bool], bool]] = None, random_state=None, deflation='cca', verbose=0)[source]

Bases: _BaseIterative

Fits a Sparse CCA model using SpanCCA.

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmin}}\{\sum_i\sum_{j\neq i} \|X_iw_i-X_jw_j\|^2 + \text{l1\_ratio}\|w_i\|_1\}\\\end{split}\\\text{subject to:}\\w_i^TX_i^TX_iw_i=1\end{aligned}\end{align} \]

References

Asteris, Megasthenis, et al. “A simple and provable algorithm for sparse diagonal CCA.” International Conference on Machine Learning. PMLR, 2016.
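
With regularisation='l0' the sparsity acts as a hard limit rather than a penalty. Assuming each entry of c is read as the number of non-zero weights allowed in that view (an interpretation, not confirmed by this page), the corresponding projection keeps the largest entries by magnitude and zeroes the rest. The helper name is hypothetical.

```python
import numpy as np

def keep_top_k(w, k):
    """Keep the k largest-magnitude entries of w and zero the others (k >= 1)."""
    out = np.zeros_like(w)
    top = np.argsort(np.abs(w))[-k:]  # indices of the k largest magnitudes
    out[top] = w[top]
    return out

w = np.array([0.1, -0.7, 0.4, 0.05, -0.2])
w_sparse = keep_top_k(w, 2)
```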

Examples

>>> from cca_zoo.models import SCCA_Span
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SCCA_Span(regularisation="l0", c=[2, 2])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.84556666])

Constructor parameters inherited from _BaseIterative:

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.SWCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, max_iter: int = 500, initialization: str = 'random', tol: float = 1e-09, regularisation='l0', c: Optional[Union[Iterable[Union[float, int]], float, int]] = None, sample_support=None, positive=False, verbose=0)[source]

Bases: _BaseIterative

A class used to fit a sparse weighted CCA (SWCCA) model

References

Examples

>>> from cca_zoo.models import SWCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = SWCCA(regularisation='l0',c=[2, 2], sample_support=5, random_state=0)
>>> model.fit((X1,X2)).score((X1,X2))
array([0.61620969])

Constructor parameters inherited from _BaseIterative:

latent_dims – number of latent dimensions to fit
scale – normalize variance in each column before fitting
centre – demean data by column before fitting (and before transforming out of sample)
copy_data – if True, views will be copied; else, they may be overwritten
random_state – pass for reproducible output across multiple function calls
deflation – the type of deflation
max_iter – the maximum number of iterations to perform in the inner optimization loop
initialization – either a string from “pls”, “cca”, “random”, “uniform” or a callable to initialize the score variables for iterative methods
tol – tolerance value used for early stopping

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model by running an inner loop to convergence and then using either CCA or PLS deflation

Parameters: views – list/tuple of numpy arrays or array likes with the same number of rows (samples)

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.MCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=1e-09)[source]

Bases: rCCA

A class used to fit an MCCA model. For more than 2 views, MCCA optimizes the sum of pairwise correlations.

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} w_i^TX_i^TX_jw_j \}\\\end{split}\\\text{subject to:}\\(1-c_i)w_i^TX_i^TX_iw_i+c_iw_i^Tw_i=1\end{aligned}\end{align} \]

References

Kettenring, Jon R. “Canonical analysis of several sets of variables.” Biometrika 58.3 (1971): 433-451.
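
An objective of this form is commonly solved as a generalized eigenvalue problem A w = λ B w on the stacked weight vector, where A holds the cross-view blocks X_i^T X_j and B the regularised within-view blocks. The sketch below takes that route with NumPy only. It relaxes the per-view constraints to a single constraint on the stacked vector, treats c as one scalar shared by all views, and is not claimed to match the library's solver; mcca_top_component is a hypothetical name.

```python
import numpy as np

def mcca_top_component(views, c=0.0):
    """Top solution of A w = lam * B w for the stacked weight vector w."""
    views = [X - X.mean(axis=0) for X in views]
    dims = [X.shape[1] for X in views]
    stacked = np.hstack(views)
    total = stacked.T @ stacked  # every block X_i^T X_j

    # D keeps only the within-view blocks X_i^T X_i.
    D = np.zeros_like(total)
    start = 0
    for X, p in zip(views, dims):
        D[start:start + p, start:start + p] = X.T @ X
        start += p

    A = total - D                                  # cross-view blocks only
    B = (1 - c) * D + c * np.eye(total.shape[0])   # regularised constraint

    # Reduce to an ordinary symmetric eigenproblem using B = L L^T.
    L = np.linalg.cholesky(B)
    L_inv = np.linalg.inv(L)
    eigvals, eigvecs = np.linalg.eigh(L_inv @ A @ L_inv.T)  # ascending
    w = L_inv.T @ eigvecs[:, -1]
    weights = np.split(w, np.cumsum(dims)[:-1])    # one weight vector per view
    return eigvals[-1], weights

rng = np.random.RandomState(0)
X1, X2, X3 = (rng.random((10, 5)) for _ in range(3))
lam, weights = mcca_top_component((X1, X2, X3))
# With only two views the top eigenvalue is the first canonical correlation.
lam_two_views, _ = mcca_top_component((X1, X2))
```

The Cholesky step requires B to be positive definite, which holds when every view has full column rank or when c > 0.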

Examples

>>> from cca_zoo.models import MCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = MCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.97200847])

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.KCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001, kernel: Optional[Iterable[Union[float, callable]]] = None, gamma: Optional[Iterable[float]] = None, degree: Optional[Iterable[float]] = None, coef0: Optional[Iterable[float]] = None, kernel_params: Optional[Iterable[dict]] = None)[source]

Bases: MCCA

A class used to fit a kernel CCA (KCCA) model.

\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\c_i\alpha_i^TK_i\alpha_i + (1-c_i)\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]
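
In this formulation each K_i is an n × n kernel matrix computed from view i, and α_i has one coefficient per training sample. The sketch below builds such a matrix with an RBF kernel as an illustrative choice; which kernel the class uses by default is not stated on this page, and rbf_kernel is a hypothetical helper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """K[a, b] = exp(-gamma * ||X[a] - Y[b]||^2)."""
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
K1 = rbf_kernel(X1, X1, gamma=0.5)           # training kernel, shape (10, 10)
X1_new = rng.random((3, 5))
K1_new = rbf_kernel(X1_new, X1, gamma=0.5)   # new samples against training samples
```

Projecting new samples in a kernel method requires the kernel between the new and the training samples, which is why the second call keeps X1 as its second argument.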

Examples

>>> from cca_zoo.models import KCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = KCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([0.96893666])

transform(views: ndarray, **kwargs)[source]

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

class cca_zoo.models.NCCA(latent_dims: int = 1, scale=True, centre=True, copy_data=True, accept_sparse=False, random_state: Optional[Union[int, RandomState]] = None, nearest_neighbors=None, gamma: Optional[Iterable[float]] = None)[source]

Bases: _BaseCCA

A class used to fit a nonparametric CCA (NCCA) model.

References

Michaeli, Tomer, Weiran Wang, and Karen Livescu. “Nonparametric canonical correlation analysis.” International conference on machine learning. PMLR, 2016.

Example

>>> from cca_zoo.models import NCCA
>>> import numpy as np
>>> X1 = np.random.rand(10,5)
>>> X2 = np.random.rand(10,5)
>>> model = NCCA()
>>> model.fit((X1,X2)).score((X1,X2))
array([1.])

fit(views: Iterable[ndarray], y=None, **kwargs)[source]

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

transform(views: Iterable[ndarray], **kwargs)[source]

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

class cca_zoo.models.PartialCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001)[source]

Bases: MCCA

A class used to fit a partial CCA model. The key difference between this and a vanilla CCA or MCCA is that the canonical score vectors must be orthogonal to the supplied confounding variables.

References

Rao, B. Raja. “Partial canonical correlations.” Trabajos de estadistica y de investigación operativa 20.2-3 (1969): 211-219.
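
One standard way to obtain score vectors that are orthogonal to the confounds is to regress the confounds out of every view before fitting. The sketch below shows that residualisation step with least squares; it illustrates the idea and is not claimed to be how this class is implemented. residualise is a hypothetical helper.

```python
import numpy as np

def residualise(X, partials):
    """Return the part of X that is not linearly explained by the confounds."""
    X = X - X.mean(axis=0)
    Z = partials - partials.mean(axis=0)
    coef, *_ = np.linalg.lstsq(Z, X, rcond=None)  # least-squares fit of X on Z
    return X - Z @ coef

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
partials = rng.random((10, 3))
R1 = residualise(X1, partials)
# Every column of R1, and hence any score vector R1 @ w, is orthogonal
# to the centred confounds.
overlap = (partials - partials.mean(axis=0)).T @ R1
```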

Example

>>> from cca_zoo.models import PartialCCA
>>> import numpy as np
>>> X1 = np.random.rand(10,5)
>>> X2 = np.random.rand(10,5)
>>> partials = np.random.rand(10,3)
>>> model = PartialCCA()
>>> model.fit((X1,X2),partials=partials).score((X1,X2))
array([0.99993046])

fit(views: Iterable[ndarray], y=None, partials=None, **kwargs)[source]

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
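y (None) –
partials (numpy array of confounding variables with the same number of rows (samples) as the views) –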
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

transform(views: Iterable[ndarray], partials=None, **kwargs)[source]

Parameters

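views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
partials (numpy array of confounding variables with the same number of rows (samples) as the views) –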
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

class cca_zoo.models.rCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None, eps=0.001, accept_sparse=None)[source]

Bases: _BaseCCA

A class used to fit a regularised CCA (canonical ridge) model. Uses PCA to perform the optimization efficiently for high dimensional data.

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\(1-c_1)w_1^TX_1^TX_1w_1+c_1w_1^Tw_1=n\\(1-c_2)w_2^TX_2^TX_2w_2+c_2w_2^Tw_2=n\end{aligned}\end{align} \]

Parameters

latent_dims (int, optional) – Number of latent dimensions to use, by default 1
scale (bool, optional) – Whether to scale the data, by default True
centre (bool, optional) – Whether to centre the data, by default True
copy_data (bool, optional) – Whether to copy the data, by default True
random_state (int, optional) – Random state for reproducibility, by default None
c (Union[Iterable[float], float], optional) – Regularisation parameter, by default None
eps (float, optional) – Tolerance for convergence, by default 1e-3
accept_sparse (Union[Iterable[str], str], optional) – Whether to accept sparse matrices, by default None

References

Vinod, Hrishikesh D. “Canonical ridge and econometrics of joint production.” Journal of econometrics 4.2 (1976): 147-166.
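
The constrained problem above can be solved directly for two views by whitening each view with its regularised matrix R_i = (1 - c_i) X_i^T X_i + c_i I and taking the leading singular vectors of the whitened cross-product. The sketch below does this without the PCA speed-up mentioned in the class description, so it is an illustration of the objective rather than the library's algorithm; the function names are hypothetical.

```python
import numpy as np

def inv_sqrt(R):
    """Inverse square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(R)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def rcca_first_weights(X1, X2, c1, c2):
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = X1.shape[0]
    R1 = (1 - c1) * X1.T @ X1 + c1 * np.eye(X1.shape[1])
    R2 = (1 - c2) * X2.T @ X2 + c2 * np.eye(X2.shape[1])
    R1_is, R2_is = inv_sqrt(R1), inv_sqrt(R2)
    U, s, Vt = np.linalg.svd(R1_is @ X1.T @ X2 @ R2_is)
    # Rescale so that w_i^T R_i w_i = n, matching the constraints above.
    w1 = np.sqrt(n) * R1_is @ U[:, 0]
    w2 = np.sqrt(n) * R2_is @ Vt[0]
    return w1, w2, R1, R2

rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
X2 = rng.random((10, 5))
w1, w2, R1, R2 = rcca_first_weights(X1, X2, 0.1, 0.1)
```

With c_i = 0 the constraint reduces to that of plain CCA, and with c_i = 1 it becomes a unit-norm constraint on the weights up to the factor n.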

Example

>>> from cca_zoo.models import rCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = rCCA(c=[0.1,0.1])
>>> model.fit((X1,X2)).score((X1,X2))
array([0.95222128])

fit(views: Iterable[ndarray], y=None, **kwargs)[source]

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.CCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None)[source]

Bases: rCCA

A class used to fit a simple CCA model

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_1^TX_1^TX_1w_1=n\\w_2^TX_2^TX_2w_2=n\end{aligned}\end{align} \]

Parameters

latent_dims (int, optional) – The number of latent dimensions to use, by default 1
scale (bool, optional) – Whether to scale the data, by default True
centre (bool, optional) – Whether to centre the data, by default True
copy_data (bool, optional) – Whether to copy the data, by default True
random_state (int, optional) – The random state to use, by default None

References

Hotelling, Harold. “Relations between two sets of variates.” Breakthroughs in statistics. Springer, New York, NY, 1992. 162-190.

Example

>>> from cca_zoo.models import CCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = CCA()
>>> model.fit((X1,X2)).score((X1,X2))
array([1.])
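
The optimisation problem above can be sketched for the first canonical pair with NumPy alone. This is a minimal sketch that assumes centred, full-column-rank views and uses a QR-plus-SVD route; the function name cca_first_pair is hypothetical and this is not necessarily how cca_zoo solves the problem.

```python
import numpy as np


def cca_first_pair(X1, X2):
    """Hypothetical helper: first canonical pair via thin QR and SVD."""
    n = X1.shape[0]
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    # Thin QR: X = Q R, with orthonormal columns in Q.
    Q1, R1 = np.linalg.qr(X1)
    Q2, R2 = np.linalg.qr(X2)
    # Singular values of Q1^T Q2 are the canonical correlations.
    U, s, Vt = np.linalg.svd(Q1.T @ Q2)
    # w = sqrt(n) R^{-1} u satisfies w^T X^T X w = n.
    w1 = np.sqrt(n) * np.linalg.solve(R1, U[:, 0])
    w2 = np.sqrt(n) * np.linalg.solve(R2, Vt[0])
    return w1, w2, s[0]


rng = np.random.RandomState(0)
X1 = rng.random((50, 3))
X2 = rng.random((50, 3))
w1, w2, rho = cca_first_pair(X1, X2)
z1 = (X1 - X1.mean(axis=0)) @ w1
z2 = (X2 - X2.mean(axis=0)) @ w2
print(np.isclose(z1 @ z1, 50.0), np.isclose(np.corrcoef(z1, z2)[0, 1], rho))
```

With 10 samples and 5 features per view, as in the example above, the two centred 5-dimensional column spaces lie inside a 9-dimensional space and therefore generically intersect, which is consistent with the score of 1 shown there.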

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float
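
One way to read "average pairwise correlation" is sketched below: average the off-diagonal entries of a pairwise-correlation array of shape (n_views, n_views, latent_dims). Excluding each view's correlation with itself, and returning one value per latent dimension (in line with the array printed in the class example), are assumptions; the helper name is hypothetical.

```python
import numpy as np


def average_pairwise_correlation(pair_corrs):
    """Hypothetical helper: mean off-diagonal correlation per latent dimension."""
    pair_corrs = np.asarray(pair_corrs, dtype=float)
    n_views = pair_corrs.shape[0]
    # Sum over both view axes, then drop the self-correlations on the diagonal.
    total = pair_corrs.sum(axis=(0, 1))
    diagonal = np.trace(pair_corrs, axis1=0, axis2=1)
    return (total - diagonal) / (n_views * (n_views - 1))


# Two views, one latent dimension, correlation 0.8 between them.
pair_corrs = np.array([[[1.0], [0.8]],
                       [[0.8], [1.0]]])
avg = average_pairwise_correlation(pair_corrs)
print(avg.shape)  # (1,)
```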

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Transforms the given views using the fitted model

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.PLS(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None)[source]

Bases: rCCA

A class used to fit a simple PLS model

Implements PLS by inheriting regularised CCA with maximal regularisation

\[ \begin{align}\begin{aligned}\begin{split}w_{opt}=\underset{w}{\mathrm{argmax}}\{ w_1^TX_1^TX_2w_2 \}\\\end{split}\\\text{subject to:}\\w_1^Tw_1=1\\w_2^Tw_2=1\end{aligned}\end{align} \]

Example

>>> from cca_zoo.models import PLS
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> model = PLS()
>>> model.fit((X1,X2)).score((X1,X2))
array([0.81796873])
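
For the first pair of weights, the PLS objective above with unit-norm constraints is solved by the leading singular vectors of the cross-product matrix. The NumPy sketch below illustrates this under the assumption that the views are centred first; pls_first_pair is a hypothetical name and not the library's implementation.

```python
import numpy as np


def pls_first_pair(X1, X2):
    """Hypothetical helper: leading singular vectors of X1^T X2."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    U, s, Vt = np.linalg.svd(X1.T @ X2)
    # Singular vectors have unit norm, so w^T w = 1 holds for both views.
    return U[:, 0], Vt[0], s[0]


rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
X2 = rng.random((10, 5))
w1, w2, sigma = pls_first_pair(X1, X2)
C = (X1 - X1.mean(axis=0)).T @ (X2 - X2.mean(axis=0))
print(np.isclose(w1 @ C @ w2, sigma))
```

The value sigma is the largest attainable value of the objective under the two unit-norm constraints.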

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Transforms the given views using the fitted model

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.TCCA(latent_dims: int = 1, scale=True, centre=True, copy_data=True, random_state=None, c: Optional[Union[Iterable[float], float]] = None)[source]

Bases: _BaseCCA

Fits a Tensor CCA model. Tensor CCA maximises higher order correlations between the views.

\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]

References

Kim, Tae-Kyun, Shu-Fai Wong, and Roberto Cipolla. “Tensor canonical correlation analysis for action classification.” 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2007.

https://github.com/rciszek/mdr_tcca

Examples

>>> from cca_zoo.models import TCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = TCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([1.14595755])

fit(views: Iterable[ndarray], y=None, **kwargs)[source]

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

correlations(views: Iterable[ndarray], **kwargs)[source]

Predicts the correlation for the given data using the fitted model

Parameters

views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model

score(views: Iterable[ndarray], **kwargs)[source]

Returns the higher order correlations in each dimension

Parameters

views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model
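
To make "higher order correlation" concrete, the sketch below uses one common definition: multiply the centred projections of all views elementwise, sum over samples, and divide by the product of the per-view norms. This is an assumption for illustration; the library's own normalisation may differ, and higher_order_correlation is a hypothetical name.

```python
import numpy as np


def higher_order_correlation(transformed_views):
    """Hypothetical helper: one value per latent dimension across all views."""
    centred = [np.asarray(z, dtype=float) - np.mean(z, axis=0)
               for z in transformed_views]
    # Elementwise product across views, summed over samples.
    numerator = np.prod(np.stack(centred, axis=0), axis=0).sum(axis=0)
    # Product of the column norms of every view.
    norms = np.prod(
        np.stack([np.linalg.norm(z, axis=0) for z in centred], axis=0), axis=0
    )
    return numerator / norms


rng = np.random.RandomState(0)
Z1, Z2, Z3 = (rng.random((10, 2)) for _ in range(3))
hoc = higher_order_correlation((Z1, Z2, Z3))
print(hoc.shape)  # (2,)
```

With exactly two views this quantity reduces to the ordinary Pearson correlation in each dimension.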

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Transforms the given views using the fitted model

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.KTCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, eps=0.001, c: Optional[Union[Iterable[float], float]] = None, kernel: Optional[Iterable[Union[float, callable]]] = None, gamma: Optional[Iterable[float]] = None, degree: Optional[Iterable[float]] = None, coef0: Optional[Iterable[float]] = None, kernel_params: Optional[Iterable[dict]] = None)[source]

Bases: TCCA

Fits a Kernel Tensor CCA model. Tensor CCA maximises higher order correlations between the views.

\[ \begin{align}\begin{aligned}\begin{split}\alpha_{opt}=\underset{\alpha}{\mathrm{argmax}}\{\sum_i\sum_{j\neq i} \alpha_i^TK_i^TK_j\alpha_j \}\\\end{split}\\\text{subject to:}\\\alpha_i^TK_i^TK_i\alpha_i=1\end{aligned}\end{align} \]

References

Kim, Tae-Kyun, Shu-Fai Wong, and Roberto Cipolla. “Tensor canonical correlation analysis for action classification.” 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2007

Examples

>>> from cca_zoo.models import KTCCA
>>> import numpy as np
>>> rng=np.random.RandomState(0)
>>> X1 = rng.random((10,5))
>>> X2 = rng.random((10,5))
>>> X3 = rng.random((10,5))
>>> model = KTCCA()
>>> model.fit((X1,X2,X3)).score((X1,X2,X3))
array([1.69896269])
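
The matrices K_i in the objective above are kernel (Gram) matrices computed from each view. As a minimal sketch, the NumPy code below builds an RBF Gram matrix for one view. The RBF kernel is chosen purely for illustration, the fallback gamma of 1 / n_features is an assumption borrowed from scikit-learn's convention, and rbf_gram is a hypothetical helper rather than a cca_zoo function.

```python
import numpy as np


def rbf_gram(X, gamma=None):
    """Hypothetical helper: K[a, b] = exp(-gamma * ||x_a - x_b||^2)."""
    X = np.asarray(X, dtype=float)
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default, not taken from cca_zoo
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of samples.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


rng = np.random.RandomState(0)
X1 = rng.random((10, 5))
K1 = rbf_gram(X1)
print(K1.shape)  # (10, 10)
```

Each view yields an n_samples by n_samples matrix, so the dual weights alpha_i have one entry per sample rather than one per feature.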

transform(views: ndarray, **kwargs)[source]

Transforms the given views using the fitted model

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

correlations(views: Iterable[ndarray], **kwargs)

Predicts the correlation for the given data using the fitted model

Parameters

views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model

fit(views: Iterable[ndarray], y=None, **kwargs)

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

self

Return type

object

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], **kwargs)

Returns the higher order correlations in each dimension

Parameters

views – list/tuple of numpy arrays or array likes with the same number of rows (samples)
kwargs – any additional keyword arguments required by the given model

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

class cca_zoo.models.PRCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, eps=0.001, c=0)[source]

Bases: MCCA

Partially Regularized Canonical Correlation Analysis

References

Tuzhilina, Elena, Leonardo Tozzi, and Trevor Hastie. “Canonical correlation analysis in high dimensions with structured regularization.” Statistical Modelling (2021): 1471082X211041033.

fit(views: Iterable[ndarray], y=None, idxs=None, **kwargs)[source]

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
idxs (list/tuple of integers indicating which features from each view are the partially regularised features) –
kwargs (any additional keyword arguments required by the given model) –

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Transforms the given views using the fitted model

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

class cca_zoo.models.GRCCA(latent_dims: int = 1, scale: bool = True, centre=True, copy_data=True, random_state=None, eps=0.001, c: float = 0, mu: float = 0)[source]

Bases: MCCA

Grouped Regularized Canonical Correlation Analysis

References

Tuzhilina, Elena, Leonardo Tozzi, and Trevor Hastie. “Canonical correlation analysis in high dimensions with structured regularization.” Statistical Modelling (2021): 1471082X211041033.

fit(views: Iterable[ndarray], y=None, feature_groups=None, **kwargs)[source]

Fits the model to the given data

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
feature_groups (list/tuple of integer numpy arrays or array likes, one per view, each with one entry per feature of that view) –
kwargs (any additional keyword arguments required by the given model) –

fit_transform(views: Iterable[ndarray], **kwargs)

Fits the model to the given data and returns the transformed views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays

get_factor_loadings(views: Iterable[ndarray], normalize=True, **kwargs)

Returns the factor loadings for each view

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
normalize (bool, optional) – Whether to normalize the factor loadings. Default is True.
kwargs (any additional keyword arguments required by the given model) –

Returns

factor_loadings

Return type

list of numpy arrays

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

pairwise_correlations(views: Iterable[ndarray], **kwargs)

Returns the pairwise correlations between the views in each dimension

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

pairwise_correlations

Return type

numpy array of shape (n_views, n_views, latent_dims)

score(views: Iterable[ndarray], y=None, **kwargs)

Returns the average pairwise correlation between the views

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
y (None) –
kwargs (any additional keyword arguments required by the given model) –

Returns

score

Return type

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

transform(views: Iterable[ndarray], **kwargs)

Transforms the given views using the fitted model

Parameters

views (list/tuple of numpy arrays or array likes with the same number of rows (samples)) –
kwargs (any additional keyword arguments required by the given model) –

Returns

transformed_views

Return type

list of numpy arrays