Runs a benchmark on arbitrary combinations of tasks (Task), learners (Learner), and resampling strategies (Resampling), possibly in parallel.

benchmark(design, store_models = FALSE, store_backends = TRUE)

## Arguments

design (data.frame())
Data frame (or data.table::data.table()) with three columns: "task", "learner", and "resampling". Each row defines a resampling by providing a Task, a Learner, and an instantiated Resampling strategy. The helper function benchmark_grid() can assist in generating an exhaustive design (see examples) and in instantiating the Resamplings per Task.

store_models (logical(1))
Store the fitted models in the resulting BenchmarkResult? Set to TRUE if you want to further analyse the models or want to extract information like variable importance.

store_backends (logical(1))
Keep the DataBackend of the Task in the BenchmarkResult? Set to TRUE if your performance measures require a Task, or to analyse results more conveniently. Set to FALSE to reduce the file size and memory footprint after serialization. The current default is TRUE, but this will eventually be changed in a future release.

## Note

The fitted models are discarded after the predictions have been scored in order to reduce memory consumption. If you need access to the models for later analysis, set store_models to TRUE.
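A minimal sketch of keeping and inspecting the models, assuming mlr3 is installed; the task, learner, and resampling are illustrative choices:

```r
library(mlr3)

# Keep the fitted models by setting store_models = TRUE.
design = benchmark_grid(tsk("penguins"), lrn("classif.rpart"), rsmp("holdout"))
bmr = benchmark(design, store_models = TRUE)

# Retrieve the trained learner of the first (and only) iteration
# and inspect its underlying rpart model.
rr = bmr$resample_result(1)
rr$learners[[1]]$model
```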

## Parallelization

This function can be parallelized with the future package. One job is one resampling iteration, and all jobs are sent to an apply function from future.apply in a single batch. To select a parallel backend, use future::plan().
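As a sketch, assuming mlr3 and future are installed, a benchmark can be distributed over local R processes with the multisession backend; the design below is illustrative:

```r
library(mlr3)

# Select a parallel backend: two local R worker processes.
future::plan("multisession", workers = 2)

design = benchmark_grid(
  tasks = tsk("penguins"),
  learners = lrns(c("classif.featureless", "classif.rpart")),
  resamplings = rsmp("cv", folds = 3)
)
bmr = benchmark(design)  # the 6 resampling iterations are distributed over the workers

future::plan("sequential")  # revert to sequential execution
```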

## Progress Bars

This function supports progress bars via the progressr package. Simply wrap the function call in progressr::with_progress() to enable them. We recommend the progress package as backend; enable it with progressr::handlers("progress").
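A short sketch, assuming the progressr and progress packages are installed; the design is an illustrative placeholder:

```r
library(mlr3)

# Any benchmark() call can be wrapped the same way.
design = benchmark_grid(tsk("penguins"), lrn("classif.rpart"), rsmp("cv", folds = 3))

progressr::handlers("progress")  # select the progress package as backend
bmr = progressr::with_progress(benchmark(design))
```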

## Logging

mlr3 uses the lgr package for logging. lgr supports multiple log levels, which can be queried with getOption("lgr.log_levels").

To suppress output and reduce verbosity, you can lower the log level from the default "info" to "warn":

lgr::get_logger("mlr3")$set_threshold("warn")

To get additional log output for debugging, increase the log level to "debug" or "trace":

lgr::get_logger("mlr3")$set_threshold("debug")


To log to a file or a database, see the documentation of lgr::lgr-package.
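A sketch of logging to a file with lgr's file appender, assuming the lgr package is installed; "mlr3.log" and the appender name "file" are illustrative choices:

```r
# Attach a file appender to the mlr3 logger.
logger = lgr::get_logger("mlr3")
logger$add_appender(lgr::AppenderFile$new("mlr3.log"), name = "file")

# ... run benchmark() ...

logger$remove_appender("file")  # detach the file appender again
```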

## Examples

# benchmarking with benchmark_grid()
tasks = list(tsk("penguins"), tsk("sonar"))
learners = lapply(c("classif.featureless", "classif.rpart"), lrn)
resamplings = rsmp("cv", folds = 3)

design = benchmark_grid(tasks, learners, resamplings)
print(design)

set.seed(123)
bmr = benchmark(design)

## Data of all resamplings
as.data.table(bmr)
#>                            learner         resampling iteration
#> 1: <LearnerClassifFeatureless[34]> <ResamplingCV[19]>         1
#> 2: <LearnerClassifFeatureless[34]> <ResamplingCV[19]>         2
#> 3: <LearnerClassifFeatureless[34]> <ResamplingCV[19]>         3
#> 4:       <LearnerClassifRpart[34]> <ResamplingCV[19]>         1
#> 5:       <LearnerClassifRpart[34]> <ResamplingCV[19]>         2
#> 6:       <LearnerClassifRpart[34]> <ResamplingCV[19]>         3
#>                 prediction
#> 1: <PredictionClassif[19]>
#> 2: <PredictionClassif[19]>
#> 3: <PredictionClassif[19]>
#> 4: <PredictionClassif[19]>
#> 5: <PredictionClassif[19]>
#> 6: <PredictionClassif[19]>
## Aggregated performance values
aggr = bmr$aggregate()
print(aggr)
#>    nr      resample_result  task_id          learner_id resampling_id iters
#> 1:  1 <ResampleResult[21]> penguins classif.featureless            cv     3
#> 2:  2 <ResampleResult[21]> penguins       classif.rpart            cv     3
#> 3:  3 <ResampleResult[21]>    sonar classif.featureless            cv     3
#> 4:  4 <ResampleResult[21]>    sonar       classif.rpart            cv     3
#>    classif.ce
#> 1: 0.55807272
#> 2: 0.07561658
#> 3: 0.46652864
#> 4: 0.28854382

## Extract predictions of first resampling result
rr = aggr$resample_result[[1]]
as.data.table(rr$prediction())
#>      row_ids     truth response
#>   1:       4    Adelie   Adelie
#>   2:       8    Adelie   Adelie
#>   3:      12    Adelie   Adelie
#>   4:      14    Adelie   Adelie
#>   5:      15    Adelie   Adelie
#>  ---
#> 340:     329 Chinstrap   Adelie
#> 341:     330 Chinstrap   Adelie
#> 342:     339 Chinstrap   Adelie
#> 343:     340 Chinstrap   Adelie
#> 344:     343 Chinstrap   Adelie

# Benchmarking with a custom design:
# - fit classif.featureless on penguins with a 3-fold CV
# - fit classif.rpart on sonar using a holdout
tasks = list(tsk("penguins"), tsk("sonar"))
learners = list(lrn("classif.featureless"), lrn("classif.rpart"))
resamplings = list(rsmp("cv", folds = 3), rsmp("holdout"))

design = data.table::data.table(
  task = tasks,
  learner = learners,
  resampling = resamplings
)

## Instantiate resamplings
design$resampling = Map(
  function(task, resampling) resampling$clone()$instantiate(task),
  task = design$task, resampling = design$resampling
)

## Run benchmark
bmr = benchmark(design)
print(bmr)
#> <BenchmarkResult> of 4 rows with 2 resampling runs
#>  nr  task_id          learner_id resampling_id iters warnings errors
#>   1 penguins classif.featureless            cv     3        0      0
#>   2    sonar       classif.rpart       holdout     1        0      0
## Get the training set of the 2nd iteration of the featureless learner on penguins
rr = bmr$aggregate()[learner_id == "classif.featureless"]$resample_result[[1]]
rr$resampling$train_set(2)
#>   [1]   5   7   8   9  12  13  17  19  22  25  28  35  36  40  46  48  49  50
#>  [19]  52  53  54  60  61  62  63  67  69  72  73  74  75  76  78  81  84  85
#>  [37]  88  92  97 101 103 104 109 110 114 119 122 127 129 130 131 136 145 147
#>  [55] 156 160 162 163 166 170 172 173 177 179 180 184 185 188 190 191 193 194
#>  [73] 205 212 213 217 218 221 228 229 233 234 238 239 245 250 252 255 256 258
#>  [91] 259 263 264 267 271 278 281 282 287 291 295 296 297 300 302 307 309 317
#> [109] 319 321 325 327 331 337 341   3   4  10  11  15  23  24  31  32  38  39
#> [127]  41  42  43  45  56  57  58  64  68  77  79  87  93  99 100 102 111 112
#> [145] 113 115 118 120 124 126 128 132 133 134 135 137 142 143 146 148 149 151
#> [163] 153 154 155 159 161 164 169 171 174 182 186 189 195 196 197 199 200 203
#> [181] 204 207 210 215 219 220 223 224 226 227 230 240 242 243 246 247 251 253
#> [199] 261 266 269 273 274 275 276 277 279 284 286 288 289 290 292 293 301 306
#> [217] 308 313 314 315 316 318 323 330 335 336 338 343 344