`cca_zoo.model_selection`.cross_validate

class cca_zoo.model_selection.cross_validate(estimator, views, y=None, *, groups=None, scoring=None, cv=None, n_jobs=None, verbose=0, fit_params=None, pre_dispatch='2*n_jobs', return_train_score=False, return_estimator=False, error_score=nan)

Evaluate metric(s) by cross-validation and also record fit/score times.

Read more in the User Guide.

Parameters:

estimator (object) – Estimator object implementing ‘fit’. The object to use to fit the data.
views (list or tuple of array-like) – List or tuple of numpy arrays or array-likes with the same number of rows (samples).
y (array-like of shape (n_samples,) or (n_samples, n_outputs), optional, default=None) – The target variable to try to predict in the case of supervised learning.
groups (array-like of shape (n_samples,), optional, default=None) – Group labels for the samples used while splitting the dataset into train/test set. Only used in conjunction with a “Group” cv instance (e.g., GroupKFold).
scoring (str, callable, list, tuple, or dict, optional, default=None) – Strategy to evaluate the performance of the cross-validated model on the test set. See notes below for more detail.
cv (int, cross-validation generator or an iterable, optional, default=None) – Determines the cross-validation splitting strategy. See notes below for more detail.
n_jobs (int, optional, default=None) – Number of jobs to run in parallel.
verbose (int, default=0) – The verbosity level.
fit_params (dict, optional, default=None) – Parameters to pass to the fit method of the estimator.
pre_dispatch (int or str, default='2*n_jobs') – Controls the number of jobs that get dispatched during parallel execution. See notes below for more detail.

Notes

For scoring: If scoring represents a single score, one can use:

a single string (see The scoring parameter: defining model evaluation rules);

a callable (see Defining your scoring strategy from metric functions) that returns a single value.

If scoring represents multiple scores, one can use:

a list or tuple of unique strings;
a callable returning a dictionary where the keys are the metric names and the values are the metric scores;
a dictionary with metric names as keys and callables as values.

See Specifying multiple metrics for evaluation for an example.
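The callable and dictionary forms of scoring listed above can be sketched as follows. This is a minimal illustration, assuming the scikit-learn scorer convention of calling `scorer(estimator, X, y)`, with the list of views taking the place of `X`; that convention is an assumption here and is not confirmed by this page. `StubModel`, `single_score`, `multi_score`, and `scoring_dict` are hypothetical names used only for the example.

```python
class StubModel:
    """Hypothetical stand-in for a fitted estimator; not part of cca_zoo."""

    def score(self, views, y=None):
        # Fixed value so the example is deterministic.
        return 0.5


def single_score(estimator, views, y=None):
    # Single-metric callable: returns one value.
    return estimator.score(views, y)


def multi_score(estimator, views, y=None):
    # Multi-metric callable: returns {metric name: metric score}.
    base = estimator.score(views, y)
    return {"score": base, "neg_score": -base}


def negated_score(estimator, views, y=None):
    # Second single-metric callable, used in the dictionary form below.
    return -estimator.score(views, y)


# Dictionary form: metric names as keys, callables as values.
scoring_dict = {"score": single_score, "neg_score": negated_score}

model = StubModel()
views = [[[0.0], [1.0]], [[2.0], [3.0]]]  # two toy views, two samples each
print(single_score(model, views))
print(multi_score(model, views))
print({name: fn(model, views) for name, fn in scoring_dict.items()})
```

Any of `single_score`, `multi_score`, or `scoring_dict` would be passed as the `scoring` argument; a list or tuple of strings is the remaining multi-metric form.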

For cv: Possible inputs for cv are:

None, to use the default 5-fold cross validation,

int, to specify the number of folds in a (Stratified)KFold,

CV splitter,

An iterable yielding (train, test) splits as arrays of indices.

For int/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used. These splitters are instantiated with shuffle=False, so the splits will be the same across calls. Refer to the User Guide for the various cross-validation strategies that can be used here.
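The last cv option, an iterable yielding (train, test) splits as arrays of indices, can be sketched with NumPy alone. The helper name `contiguous_folds` is hypothetical; it produces unshuffled, contiguous folds, which is similar in spirit to an unshuffled K-fold split but is not the library's own splitter.

```python
import numpy as np


def contiguous_folds(n_samples, n_folds=5):
    """Yield (train, test) index arrays for unshuffled, contiguous folds."""
    indices = np.arange(n_samples)
    # array_split tolerates n_samples not being divisible by n_folds.
    for test in np.array_split(indices, n_folds):
        # Everything outside the test fold is used for training.
        train = np.setdiff1d(indices, test)
        yield train, test


splits = list(contiguous_folds(10, n_folds=5))
for train, test in splits:
    print("train:", train, "test:", test)
```

Because the folds are not shuffled, repeated calls give identical splits. A list such as `splits` (or the generator itself) is the kind of object the cv parameter describes as an iterable of index arrays.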

For pre_dispatch: This parameter can be:

None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs.

An int, giving the exact number of total jobs that are spawned

A str, giving an expression as a function of n_jobs, as in ‘2*n_jobs’
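To make the three accepted forms concrete, the sketch below shows how a pre_dispatch value could be turned into a job count for a given n_jobs. The function `resolve_pre_dispatch` is hypothetical and written only for illustration; the parallel backend performs its own resolution internally, and its exact logic is not described on this page.

```python
def resolve_pre_dispatch(pre_dispatch, n_jobs):
    """Illustrative only: map a pre_dispatch value to a number of jobs.

    Returns None when pre_dispatch is None, meaning no limit is applied
    and all jobs are created immediately.
    """
    if pre_dispatch is None:
        return None
    if isinstance(pre_dispatch, int):
        # An int is the exact number of total jobs to spawn.
        return pre_dispatch
    # A str is an arithmetic expression in terms of n_jobs, e.g. '2*n_jobs'.
    # Builtins are disabled so only the n_jobs name is available.
    return int(eval(pre_dispatch, {"__builtins__": {}}, {"n_jobs": n_jobs}))


print(resolve_pre_dispatch("2*n_jobs", n_jobs=4))
print(resolve_pre_dispatch(3, n_jobs=4))
print(resolve_pre_dispatch(None, n_jobs=4))
```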
