cca_zoo.model_selection.cross_validate
- class cca_zoo.model_selection.cross_validate(estimator, views, y=None, *, groups=None, scoring=None, cv=None, n_jobs=None, verbose=0, fit_params=None, pre_dispatch='2*n_jobs', return_train_score=False, return_estimator=False, error_score=nan)
Evaluate metric(s) by cross-validation and also record fit/score times.
Read more in the User Guide.
- Parameters:
estimator (object) – Estimator object implementing ‘fit’. The object to use to fit the data.
views (list or tuple of array-like) – List or tuple of numpy arrays or array-likes with the same number of rows (samples).
y (array-like of shape (n_samples,) or (n_samples, n_outputs), optional, default=None) – The target variable to try to predict in the case of supervised learning.
groups (array-like of shape (n_samples,), optional, default=None) – Group labels for the samples used while splitting the dataset into train/test set. Only used in conjunction with a “Group” cv instance (e.g., GroupKFold).
scoring (str, callable, list, tuple, or dict, optional, default=None) – Strategy to evaluate the performance of the cross-validated model on the test set. See notes below for more detail.
cv (int, cross-validation generator or an iterable, optional, default=None) – Determines the cross-validation splitting strategy. See notes below for more detail.
n_jobs (int, optional, default=None) – Number of jobs to run in parallel.
verbose (int, default=0) – The verbosity level.
fit_params (dict, optional, default=None) – Parameters to pass to the fit method of the estimator.
pre_dispatch (int or str, default='2*n_jobs') – Controls the number of jobs that get dispatched during parallel execution. See notes below for more detail.
Notes
For scoring: If scoring represents a single score, one can use:
a single string (see The scoring parameter: defining model evaluation rules);
a callable (see Defining your scoring strategy from metric functions) that returns a single value.
- If scoring represents multiple scores, one can use:
a list or tuple of unique strings;
a callable returning a dictionary where the keys are the metric names and the values are the metric scores;
a dictionary with metric names as keys and callables as values.
See Specifying multiple metrics for evaluation for an example.
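As a rough illustration of the "callable returning a dictionary" form, the sketch below uses only the standard library. It assumes the scikit-learn scorer convention `scorer(estimator, X, y)`; whether cca_zoo passes the views to the scorer in exactly this shape is an assumption here. `MeanPredictor` and `multi_metric_scorer` are hypothetical names introduced for illustration only.

```python
class MeanPredictor:
    """Hypothetical stand-in estimator: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def multi_metric_scorer(estimator, X, y):
    """Return several named scores at once (assumed signature: estimator, X, y)."""
    predictions = estimator.predict(X)
    errors = [p - t for p, t in zip(predictions, y)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    # Negate the errors so that a larger score means a better model.
    return {"neg_mae": -mae, "neg_mse": -mse}


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0, 4.0]
model = MeanPredictor().fit(X, y)
scores = multi_metric_scorer(model, X, y)
print(scores)
```

The dictionary form of `scoring` would instead map each metric name to its own single-value callable.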
For cv: Possible inputs for cv are:
None, to use the default 5-fold cross validation,
int, to specify the number of folds in a (Stratified)KFold,
An iterable yielding (train, test) splits as arrays of indices.
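The third option above, an iterable of (train, test) index splits, can be sketched in plain Python. The helper name `contiguous_splits` is hypothetical; it produces unshuffled, contiguous folds and uses lists where real code would typically use NumPy index arrays.

```python
def contiguous_splits(n_samples, n_splits=5):
    """Yield (train, test) index lists for unshuffled, contiguous folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("need 2 <= n_splits <= n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample each.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


splits = list(contiguous_splits(n_samples=10, n_splits=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Because nothing is shuffled, calling the helper again with the same arguments yields the same splits; a materialized list such as `splits` is the kind of object the `cv` parameter is described as accepting.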
For int/None inputs, if the estimator is a classifier and y is either binary or multiclass, StratifiedKFold is used. In all other cases, KFold is used. These splitters are instantiated with shuffle=False, so the splits will be the same across calls. Refer to the User Guide for the various cross-validation strategies that can be used here.
For pre_dispatch: This parameter can be:
None, in which case all the jobs are immediately created and spawned. Use this for lightweight and fast-running jobs, to avoid delays due to on-demand spawning of the jobs.
An int, giving the exact number of total jobs that are spawned.
A str, giving an expression as a function of n_jobs, as in '2*n_jobs'.
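To make the three accepted forms concrete, here is a minimal sketch of how a `pre_dispatch` value could be turned into a job count. It is not the library's actual implementation: `resolve_pre_dispatch` is a hypothetical helper, it assumes `n_jobs` is a positive integer, and it only supports simple integer arithmetic in the string form.

```python
import ast
import operator

# Arithmetic operators permitted in the string form (an assumption of this sketch).
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def _eval_node(node):
    """Evaluate a tiny integer arithmetic expression tree."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
        op = _ALLOWED_OPS[type(node.op)]
        return op(_eval_node(node.left), _eval_node(node.right))
    raise ValueError("unsupported pre_dispatch expression")


def resolve_pre_dispatch(pre_dispatch, n_jobs):
    """Return the number of jobs to dispatch up front, or None for 'all at once'."""
    if pre_dispatch is None or isinstance(pre_dispatch, int):
        return pre_dispatch
    # Substitute the numeric value of n_jobs, then evaluate the arithmetic.
    expr = pre_dispatch.replace("n_jobs", str(n_jobs))
    return _eval_node(ast.parse(expr, mode="eval").body)


print(resolve_pre_dispatch("2*n_jobs", n_jobs=4))
print(resolve_pre_dispatch(3, n_jobs=4))
print(resolve_pre_dispatch(None, n_jobs=4))
```

Parsing the expression with `ast` rather than calling `eval` keeps the sketch from executing arbitrary code when the string comes from configuration.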