Data
Simulated Data
- cca_zoo.data.simulated.generate_covariance_data(n, view_features, latent_dims=1, view_sparsity=None, correlation=1, structure=None, sigma=None, decay=0.5, positive=None, random_state=None)
Function to generate CCA dataset with defined population correlations
- Parameters
n (int) – number of samples
view_sparsity (Optional[List[Union[int, float]]]) – level of sparsity in the features of each view, either as the number of active variables or as the percentage active
view_features (List[int]) – number of features in each view
latent_dims (int) – number of latent dimensions
correlation (Union[List[float], float]) – correlation, either as a list with one element for each latent dimension or as a float which is scaled by 'decay'
structure (Union[str, List[str], None]) – within-view covariance structure ('identity', 'gaussian', 'toeplitz', 'random')
sigma (Union[float, List[float], None]) – Gaussian sigma
decay (float) – ratio of second signal to first signal
- Returns
tuple of numpy arrays: view_1, view_2, true weights from view 1, true weights from view 2, overall covariance structure
- Example
>>> from cca_zoo.data import generate_covariance_data
>>> [train_view_1, train_view_2], [true_weights_1, true_weights_2] = generate_covariance_data(200, [10, 10], latent_dims=1, correlation=1)
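The `correlation`, `decay` and `view_sparsity` parameters each accept more than one form, so a small illustration may help. The sketch below is not taken from the library: `expand_correlation` and `active_count` are hypothetical helpers, and both the geometric scaling by `decay` and the rounding of a fractional sparsity are assumptions about how the documented behaviour could be read.

```python
import numpy as np


def expand_correlation(correlation, latent_dims, decay=0.5):
    """Hypothetical helper: return one correlation per latent dimension."""
    if isinstance(correlation, (list, tuple, np.ndarray)):
        # List form: one value per latent dimension, used as given.
        corr = np.asarray(correlation, dtype=float)
        if corr.shape != (latent_dims,):
            raise ValueError("need one correlation per latent dimension")
        return corr
    # Scalar form (assumption): each successive dimension is scaled by `decay`.
    return correlation * decay ** np.arange(latent_dims)


def active_count(sparsity, n_features):
    """Hypothetical helper: number of active variables in one view."""
    if sparsity is None:
        return n_features  # no sparsity requested: all variables active
    if isinstance(sparsity, float):
        # Float form (assumption): fraction of the features that are active.
        return int(round(sparsity * n_features))
    # Integer form: already a count of active variables.
    return int(sparsity)


per_dimension = expand_correlation(0.8, latent_dims=3, decay=0.5)
n_active = active_count(0.5, n_features=10)
```

Under these assumptions, a scalar correlation of 0.8 with `decay=0.5` would give a smaller correlation for each later latent dimension, and a sparsity of 0.5 on ten features would leave five variables active.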
- cca_zoo.data.simulated.generate_simple_data(n, view_features, view_sparsity=None, eps=0, transform=False, random_state=None)
Simple latent variable model to generate data with one latent factor
- Parameters
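n (int) – number of samples
view_features (List[int]) – number of features in each view
view_sparsity (Optional[List[Union[int, float]]]) – level of sparsity in the features of each view, either as the number of active variables or as the percentage active
eps (float) – standard deviation of the Gaussian noise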
- Returns
view1 matrix, view2 matrix, true weights view 1, true weights view 2
- Example
>>> from cca_zoo.data import generate_simple_data
>>> [train_view_1, train_view_2], [true_weights_1, true_weights_2] = generate_simple_data(200, [10, 10])
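To make the idea of a single-latent-factor model concrete, here is a minimal NumPy sketch. It is not the library's implementation: `simple_latent_sketch` is a hypothetical name, and the unit-norm random weights and additive noise are assumptions chosen for illustration. Each view is built from one shared latent variable times a weight vector, plus Gaussian noise with standard deviation `eps`.

```python
import numpy as np


def simple_latent_sketch(n, view_features, eps=0.0, random_state=None):
    """Hypothetical sketch: every view is driven by one shared latent factor."""
    rng = np.random.default_rng(random_state)
    z = rng.standard_normal(n)  # shared latent factor, one value per sample
    views, weights = [], []
    for p in view_features:
        w = rng.standard_normal(p)
        w /= np.linalg.norm(w)  # unit-norm weights (an assumption)
        noise = eps * rng.standard_normal((n, p))
        views.append(np.outer(z, w) + noise)  # (n, p) data matrix
        weights.append(w)
    return views, weights


views, weights = simple_latent_sketch(200, [10, 10], eps=0.1, random_state=0)
```

With `eps=0`, projecting each view onto its own weight vector recovers the latent factor exactly, so the two projections are perfectly correlated; raising `eps` lowers that correlation.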