Probabilistic Models

Variational CCA

class cca_zoo.probabilisticmodels.probabilisticcca.ProbabilisticCCA(latent_dims=1, copy_data=True, random_state=0, num_samples=100, num_warmup=100)

Bases: cca_zoo.models._cca_base._CCA_Base

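A class used to fit a probabilistic CCA model. The result is not quite the same as standard probabilistic CCA because inference uses variational inference (VI) methods rather than expectation-maximization (EM).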


References:
Bach, Francis R., and Michael I. Jordan. “A probabilistic interpretation of canonical correlation analysis.” (2005).
Wang, Chong. “Variational Bayesian approach to canonical correlation analysis.” IEEE Transactions on Neural Networks 18.3 (2007): 905-910.

fit(views, y=None, **kwargs)

Infer the parameters (mu: mean, psi: within-view variance) and the latent variables (z) of the generative CCA model.

Parameters: views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
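
As a rough illustration of the generative model whose parameters fit infers, the sketch below draws two synthetic views from a linear Gaussian latent variable model in the spirit of Bach and Jordan. The loading matrices W, the diagonal form of psi, and all sizes are assumptions made for this example; they are not names or values taken from the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, latent_dims = 200, 1
n_features = [5, 4]  # feature count of each view (illustrative)

# Shared latent variables: z ~ N(0, I)
z = rng.standard_normal((n_samples, latent_dims))

views = []
for p in n_features:
    W = rng.standard_normal((p, latent_dims))  # hypothetical loading matrix
    mu = rng.standard_normal(p)                # mu: per-feature mean
    psi = np.full(p, 0.1)                      # psi: within-view variance (assumed diagonal)
    noise = rng.standard_normal((n_samples, p)) * np.sqrt(psi)
    # Each view is a noisy linear function of the same latent z
    views.append(z @ W.T + mu + noise)
```

The resulting list has the shape fit expects: one array per view, all with the same number of rows.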
transform(views, y=None, **kwargs)

Predict the latent variables that generate the data in views using the sampled model parameters

Parameters: views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
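
To make the idea of predicting latent variables from known parameters concrete, here is a minimal sketch that computes the posterior mean of z under a linear Gaussian model with diagonal noise. This is a textbook closed-form result, not necessarily what the library does internally; the helper posterior_mean_z and the parameter names Ws, mus, psis are hypothetical.

```python
import numpy as np

def posterior_mean_z(views, Ws, mus, psis):
    """Posterior mean of z for x_i = W_i z + mu_i + noise_i, with z ~ N(0, I)."""
    latent_dims = Ws[0].shape[1]
    precision = np.eye(latent_dims)  # prior precision of z
    rhs = np.zeros((latent_dims, views[0].shape[0]))
    for X, W, mu, psi in zip(views, Ws, mus, psis):
        W_scaled = W / psi[:, None]             # Psi^{-1} W for diagonal Psi
        precision += W.T @ W_scaled             # accumulate W^T Psi^{-1} W
        rhs += W_scaled.T @ (X - mu).T          # accumulate W^T Psi^{-1} (x - mu)
    return np.linalg.solve(precision, rhs).T    # shape (n_samples, latent_dims)

# Synthetic check: simulate data from known parameters, then recover z
rng = np.random.default_rng(0)
n, d = 300, 1
z_true = rng.standard_normal((n, d))
Ws = [rng.standard_normal((5, d)), rng.standard_normal((4, d))]
mus = [rng.standard_normal(5), rng.standard_normal(4)]
psis = [np.full(5, 0.05), np.full(4, 0.05)]
views = [
    z_true @ W.T + mu + rng.standard_normal((n, W.shape[0])) * np.sqrt(psi)
    for W, mu, psi in zip(Ws, mus, psis)
]
z_hat = posterior_mean_z(views, Ws, mus, psis)
```

With low noise, z_hat should track z_true closely, which is the sense in which the latent variables "generate" the views.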
correlations(views, y=None, **kwargs)

Predicts the pairwise correlations for the given data using the fitted model.

Parameters:
  • views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
  • kwargs – any additional keyword arguments required by the given model

Returns: all_corrs, an array of the pairwise correlations with shape (k, k, self.latent_dims), where k is the number of views
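
The layout of the returned array can be illustrated with plain NumPy. The sketch below assumes the views have already been transformed into latent scores; the helper pairwise_correlations is hypothetical and only mirrors the documented (k, k, latent_dims) shape, so the library may compute the values differently.

```python
import numpy as np

def pairwise_correlations(transformed_views):
    """Correlation between every pair of views, per latent dimension."""
    k = len(transformed_views)
    latent_dims = transformed_views[0].shape[1]
    all_corrs = np.empty((k, k, latent_dims))
    for i, zi in enumerate(transformed_views):
        for j, zj in enumerate(transformed_views):
            for d in range(latent_dims):
                # Pearson correlation of dimension d between view i and view j
                all_corrs[i, j, d] = np.corrcoef(zi[:, d], zj[:, d])[0, 1]
    return all_corrs

# Three views whose scores share a common signal plus a little noise
rng = np.random.default_rng(0)
shared = rng.standard_normal((100, 2))
scores = [shared + 0.1 * rng.standard_normal((100, 2)) for _ in range(3)]
all_corrs = pairwise_correlations(scores)
```

Entry all_corrs[i, j, d] is the correlation between views i and j in latent dimension d, so the array is symmetric in its first two axes and has ones on the diagonal.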


fit_transform(views, y=None, **kwargs)

Fits the model and then transforms the training data.

Parameters:
  • views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
  • kwargs – any additional keyword arguments required by the given model
get_loadings(views, y=None, normalize=False, **kwargs)

Returns the model loadings for each view for the given data

Parameters:
  • views (Iterable[ndarray]) – list/tuple of numpy arrays or array-likes with the same number of rows (samples)
  • kwargs – any additional keyword arguments required by the given model
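
In CCA, loadings commonly mean the association between each original feature and each latent score. The sketch below follows that common definition; the helper loadings_sketch is hypothetical, and treating normalize as a switch between covariance and correlation is an assumption, since the documentation above does not describe that argument.

```python
import numpy as np

def loadings_sketch(view, scores, normalize=False):
    """Feature-by-latent association for one view (illustrative only)."""
    Xc = view - view.mean(axis=0)
    Sc = scores - scores.mean(axis=0)
    # Sample covariance between each feature and each latent score
    result = Xc.T @ Sc / (len(view) - 1)        # shape (n_features, latent_dims)
    if normalize:
        # Assumed meaning of normalize: rescale covariances to correlations
        result = result / np.outer(Xc.std(axis=0, ddof=1), Sc.std(axis=0, ddof=1))
    return result

rng = np.random.default_rng(0)
scores = rng.standard_normal((100, 1))
# First feature equals the latent score; the other two are unrelated noise
view = np.hstack([scores, rng.standard_normal((100, 2))])
L = loadings_sketch(view, scores, normalize=True)
```

Under this definition the first feature has a normalized loading of 1, and every normalized loading lies between -1 and 1.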
score(views, y=None, **kwargs)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \(1 - \frac{u}{v}\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
  • y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns: score (float) – \(R^2\) of self.predict(X) with respect to y.



From scikit-learn version 0.23, the \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average', to stay consistent with the default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
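
The definition of \(R^2\) given for score can be checked by hand. The following is a minimal pure-Python sketch of that formula; the helper name r2 and the sample values are made up for illustration and are not part of the library.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - u / v."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - u / v

y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2(y_true, [1.0, 2.0, 3.0, 4.0])    # 1.0: predictions match exactly
constant = r2(y_true, [2.5, 2.5, 2.5, 2.5])   # 0.0: always predicts the mean of y
reversed_ = r2(y_true, [4.0, 3.0, 2.0, 1.0])  # -3.0: worse than the constant model
```

The three cases match the statements above: a perfect model scores 1.0, a constant mean predictor scores 0.0, and a worse model scores below zero.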