This is the abstract base class for resampling objects like ResamplingCV and ResamplingBootstrap.

The objects of this class define how a task is partitioned for resampling (e.g., in resample() or benchmark()), using a set of hyperparameters such as the number of folds in cross-validation.

Resampling objects can be instantiated on a Task, which applies the strategy to the task and results in a fixed partition of the Task's row_ids.
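To make the idea of a fixed partition of row ids concrete, here is a minimal base R sketch of a 3-fold cross-validation split. It is only an illustration and not the package's implementation: the names row_ids, folds, fold_of_row, train_set and test_set are hypothetical helpers, chosen to mirror the accessor names used in the examples on this page.

```r
# Hypothetical illustration in base R; this is NOT how the package implements it.
set.seed(1)
row_ids = 1:10   # row ids of an imaginary task
folds = 3        # hyperparameter of the strategy

# "Instantiation": assign every row id to exactly one fold, once.
# After this step the partition is fixed and can be queried repeatedly.
fold_of_row = sample(rep_len(seq_len(folds), length(row_ids)))

# Iteration i uses fold i as the test set and all other rows for training.
test_set = function(i) row_ids[fold_of_row == i]
train_set = function(i) row_ids[fold_of_row != i]

train_set(1)
test_set(1)
intersect(train_set(1), test_set(1))  # the two sets share no row ids
```

Because the fold assignment is computed once and then stored, repeated calls to train_set(1) return the same row ids, which is the property that instantiation provides.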

Predefined resamplings are stored in mlr_resamplings.

Format

R6::R6Class object.

Construction

Note: This object is typically constructed via a derived class, e.g. ResamplingCV or ResamplingHoldout.

    r = Resampling$new(id, param_set, param_vals)

  • id :: character(1)
    Identifier for the resampling strategy.

  • param_set :: paradox::ParamSet
    Set of hyperparameters.

  • param_vals :: named list()
    List of hyperparameter settings.

Fields

  • id :: character(1)
    Identifier of the resampling strategy.

  • param_set :: paradox::ParamSet
    Description of available hyperparameters and hyperparameter settings.

  • hash :: character(1)
    Hash (unique identifier) for this object.

  • instance :: any
    During instantiate(), the instance is stored in this slot. Types vary from resampling strategy to resampling strategy.

  • is_instantiated :: logical(1)
    Is TRUE if the resampling has been instantiated.

  • duplicated_ids :: logical(1)
    Is TRUE if this resampling strategy may have duplicated row ids in a single training set or test set. E.g., this is TRUE for bootstrap and FALSE for cross-validation.

  • iters :: integer(1)
    Number of resampling iterations, which depends on the values stored in the param_set.

  • task_hash :: character(1)
    The hash of the task which was passed to r$instantiate().
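
The fields above can be inspected before and after instantiation. The following is a minimal sketch, assuming the mlr3 package is installed and that the cross-validation strategy exposes its fold count under the parameter name folds (an assumption; this page only says "number of folds"). It uses only accessors that appear elsewhere on this page.

```r
library(mlr3)

# Retrieve a predefined strategy and set its hyperparameters.
r = mlr_resamplings$get("cv")
r$param_set$values = list(folds = 3L)  # "folds" is an assumed parameter name

r$id               # identifier of the strategy
r$iters            # number of iterations, derived from the parameter values
r$is_instantiated  # no task has been supplied at this point
r$duplicated_ids   # documented above as FALSE for cross-validation

# Instantiating on a task fixes the partition and records the task's hash.
task = mlr_tasks$get("iris")
r$instantiate(task)
r$is_instantiated
r$task_hash
```

Checking is_instantiated before calling train_set() or test_set() is a cheap way to guard against querying a strategy that has not yet been applied to a task.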

Methods

See also

Other Resampling: mlr_resamplings

Examples

    r = mlr_resamplings$get("subsampling")

    # Default parametrization
    r$param_set$values
    #> $repeats
    #> [1] 30
    #>
    #> $ratio
    #> [1] 0.6666667
    #>

    # Do only 3 repeats on 10% of the data
    r$param_set$values = list(ratio = 0.1, repeats = 3)
    r$param_set$values
    #> $ratio
    #> [1] 0.1
    #>
    #> $repeats
    #> [1] 3
    #>

    # Instantiate on iris task
    task = mlr_tasks$get("iris")
    r$instantiate(task)

    # Extract train/test sets
    train_set = r$train_set(1)
    print(train_set)
    #> [1] 14 21 149 69 88 82 74 118 8 32 119 147 40 90 112
    intersect(train_set, r$test_set(1))
    #> integer(0)

    # Another example: 10-fold CV
    r = mlr_resamplings$get("cv")$instantiate(task)
    r$train_set(1)
    #> [1] 1 11 17 22 60 66 69 79 88 100 108 114 117 124 143 15 19 25
    #> [19] 26 31 41 46 53 55 93 115 130 140 145 150 6 30 36 42 45 49
    #> [37] 51 67 73 76 90 97 112 120 142 2 14 52 57 62 81 94 105 107
    #> [55] 111 128 131 137 139 144 16 34 40 43 50 80 85 86 92 98 116 118
    #> [73] 126 146 147 8 18 23 44 64 70 75 77 102 109 132 135 136 141 149
    #> [91] 5 7 21 28 39 54 59 63 68 72 95 103 110 119 121 3 12 24
    #> [109] 56 58 78 89 91 99 104 106 127 133 134 148 4 9 10 27 32 33
    #> [127] 35 38 61 71 74 84 101 123 125

    # Stratification
    task = mlr_tasks$get("pima")
    prop.table(table(task$truth())) # moderately unbalanced
    #>
    #>       pos       neg
    #> 0.3489583 0.6510417

    r = mlr_resamplings$get("subsampling")
    r$instantiate(task)
    prop.table(table(task$truth(r$train_set(1)))) # roughly same proportion
    #>
    #>       pos       neg
    #> 0.3671875 0.6328125