Splits data into training and test sets in a cross-validation fashion based on a user-provided categorical vector. This vector can be passed during instantiation either via an arbitrary factor f with the same length as task$nrow, or via a single string col referring to a column in the task.

An alternative but equivalent approach using leave-one-out resampling is showcased in the examples of mlr_resamplings_loo.

Dictionary

This Resampling can be instantiated via the dictionary mlr_resamplings or with the associated sugar function rsmp():

mlr_resamplings$get("custom_cv")
rsmp("custom_cv")
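
As a rough sketch of how the retrieved object might be used end to end, the snippet below instantiates it with a grouping factor and passes it to resample(). The learner "classif.featureless" and the measure "classif.ce" are illustrative choices, not something this page prescribes.

```r
library(mlr3)

# Small task: the first 10 rows of the penguins data
task = tsk("penguins")
task$filter(1:10)

# Three groups of three rows; the last row has no group (NA)
f = factor(c(rep(letters[1:3], each = 3), NA))

custom_cv = rsmp("custom_cv")
custom_cv$instantiate(task, f = f)

# Illustrative learner and measure (assumptions for this sketch)
rr = resample(task, lrn("classif.featureless"), custom_cv)
scores = rr$score(msr("classif.ce"))
```

The resampling is instantiated before the call because custom_cv needs f or col, which resample() cannot supply on its own.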

Super class

mlr3::Resampling -> ResamplingCustomCV

Active bindings

iters

(integer(1))
Returns the number of resampling iterations, depending on the values stored in the param_set.

Methods


Method new()

Creates a new instance of this R6 class.

Usage

ResamplingCustomCV$new()


Method instantiate()

Instantiate this Resampling as cross-validation with custom splits.

Usage

ResamplingCustomCV$instantiate(task, f = NULL, col = NULL)

Arguments

task

Task
Used to extract row ids.

f

(factor() | character())
Vector of type factor or character with the same length as task$nrow. Row ids are split on this vector, and each distinct value results in one fold. Empty factor levels are dropped, and row ids corresponding to missing values are removed, cf. split().

col

(character(1))
Name of the task column to use for splitting. This is an alternative to passing the grouping vector via parameter f; the two arguments are mutually exclusive.
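
The col argument can be illustrated with a minimal sketch. It assumes the "island" column of the penguins task as the grouping column; that choice is purely for illustration.

```r
library(mlr3)

task = tsk("penguins")

# Use an existing task column as the grouping variable (illustrative choice)
custom_cv = rsmp("custom_cv")
custom_cv$instantiate(task, col = "island")

# One fold per distinct value of the column
n_folds = custom_cv$iters

# All rows of a given test set share the same value of the grouping column
test_ids = custom_cv$test_set(1)
islands_in_fold = unique(task$data(rows = test_ids, cols = "island")$island)
```

Passing both f and col in the same call is not supported, since the two arguments are mutually exclusive.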


Method clone()

The objects of this class are cloneable with this method.

Usage

ResamplingCustomCV$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

# Create a task with 10 observations
task = tsk("penguins")
task$filter(1:10)

# Instantiate Resampling:
custom_cv = rsmp("custom_cv")
f = factor(c(rep(letters[1:3], each = 3), NA))
custom_cv$instantiate(task, f = f)
custom_cv$iters # 3 folds
#> [1] 3

# Individual sets:
custom_cv$train_set(1)
#> [1] 4 5 6 7 8 9
custom_cv$test_set(1)
#> [1] 1 2 3

# Disjoint sets:
intersect(custom_cv$train_set(1), custom_cv$test_set(1))
#> integer(0)
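
The treatment of missing values and empty factor levels described for f follows base R split(). The sketch below reproduces the grouping from the example above using only base R; it illustrates the semantics and does not claim to show the internal implementation. The extra level "d" is added here only to demonstrate that empty levels are dropped.

```r
# Row ids 1..10, grouped like the example above
row_ids = 1:10

# "d" is an unused (empty) level; the last element is missing
f = factor(
  c(rep(c("a", "b", "c"), each = 3), NA),
  levels = c("a", "b", "c", "d")
)

# drop = TRUE discards empty levels; rows with NA in f are omitted by split()
folds = split(row_ids, f, drop = TRUE)

# Each list element plays the role of one test set
fold_names = names(folds)
covered = unlist(folds, use.names = FALSE)
```

Row 10 belongs to no group, so it appears in neither the training nor the test set of any fold.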