This is the result container object returned by benchmark(). A BenchmarkResult stores the row-bound data of multiple ResampleResults, each of which can easily be reconstructed.

Note that all stored objects are accessed by reference. Do not modify any object without cloning it first.
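
The note above can be illustrated with a minimal sketch. It assumes the mlr3 package is installed; the iris task, the rpart learner, the holdout resampling and the maxdepth value are arbitrary choices made for illustration and are not taken from this page.

```r
library(mlr3)

# Small benchmark, only to obtain a BenchmarkResult (illustrative choices).
design = benchmark_grid(
  tasks = tsk("iris"),
  learners = lrn("classif.rpart"),
  resamplings = rsmp("holdout")
)
bmr = benchmark(design)

# Per the note above, treat this as a reference to a stored object.
stored = bmr$learners$learner[[1]]

# Work on a deep clone so that the stored object stays untouched.
copy = stored$clone(deep = TRUE)
copy$param_set$values$maxdepth = 2
```

After this, only copy carries the changed hyperparameter; the learner kept inside bmr is left as it was.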

Format

R6::R6Class object.

Construction

bmr = BenchmarkResult$new(data = data.table())

Fields

  • data :: data.table::data.table()
    Internal data storage with one row per resampling iteration. Can be joined with $rr_data by joining on column "hash". We discourage users from working directly with this table.

    Package developers, on the other hand, may opt to add additional columns here. These columns are preserved in all mutators.

  • rr_data :: data.table::data.table()
    Internal data storage with one row per ResampleResult. Can be joined with $data by joining on column "hash". Not used in mlr3 directly, but can be exploited by add-on packages.

    Package developers may opt to add additional columns here. These columns are preserved in all mutators.

  • task_type :: character(1)
    Task type of objects in the BenchmarkResult. All stored objects (Task, Learner, Prediction) in a single BenchmarkResult are required to have the same task type, e.g., "classif" or "regr".

  • tasks :: data.table::data.table()
    Table of used tasks with three columns: "task_hash" (character(1)), "task_id" (character(1)) and "task" (Task).

  • learners :: data.table::data.table()
    Table of used learners with three columns: "learner_hash" (character(1)), "learner_id" (character(1)) and "learner" (Learner). Note that it is not feasible to access learnt models via this getter, as the training task would be ambiguous. Instead, select a row from the table returned by $score().

  • resamplings :: data.table::data.table()
    Table of used resamplings with three columns: "resampling_hash" (character(1)), "resampling_id" (character(1)) and "resampling" (Resampling).

  • n_resample_results :: integer(1)
    Returns the number of stored ResampleResults.

  • uhashes :: character()
    Vector of unique hashes of all included ResampleResults.
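
A few of the fields listed above can be inspected as in the following sketch, which also shows the route to fitted models via $score() mentioned under learners. It assumes an installed mlr3 whose benchmark() accepts a store_models argument (this may differ between versions); the task, learners and resampling are illustrative choices.

```r
library(mlr3)

design = benchmark_grid(
  tasks = tsk("iris"),
  learners = list(lrn("classif.featureless"), lrn("classif.rpart")),
  resamplings = rsmp("cv", folds = 3)
)
# store_models = TRUE keeps the fitted models (assumed to exist in your version).
bmr = benchmark(design, store_models = TRUE)

bmr$task_type            # task type shared by all stored objects
bmr$n_resample_results   # one ResampleResult per task/learner/resampling combination
length(bmr$uhashes)      # one unique hash per ResampleResult

# Models: pick a row of $score(); each row belongs to exactly one
# task, learner and resampling iteration, so the model is unambiguous.
scores = bmr$score()
row = scores[learner_id == "classif.rpart" & iteration == 1]
model = row$learner[[1]]$model
class(model)
```

Going through $score() rather than $learners avoids the ambiguity described above, because every row of the score table is tied to a single training set.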

Methods

S3 Methods

Examples

set.seed(123)
learners = list(
  lrn("classif.featureless", predict_type = "prob"),
  lrn("classif.rpart", predict_type = "prob")
)

design = benchmark_grid(
  tasks = list(tsk("sonar"), tsk("spam")),
  learners = learners,
  resamplings = rsmp("cv", folds = 3)
)
print(design)
#>             task                     learner     resampling
#> 1: <TaskClassif> <LearnerClassifFeatureless> <ResamplingCV>
#> 2: <TaskClassif>       <LearnerClassifRpart> <ResamplingCV>
#> 3: <TaskClassif> <LearnerClassifFeatureless> <ResamplingCV>
#> 4: <TaskClassif>       <LearnerClassifRpart> <ResamplingCV>

bmr = benchmark(design)
print(bmr)
#> <BenchmarkResult> of 12 rows with 4 resampling runs
#>  nr task_id          learner_id resampling_id iters warnings errors
#>   1   sonar classif.featureless            cv     3        0      0
#>   2   sonar       classif.rpart            cv     3        0      0
#>   3    spam classif.featureless            cv     3        0      0
#>   4    spam       classif.rpart            cv     3        0      0

bmr$tasks
#>           task_hash task_id          task
#> 1: dea3e1fd99a2120d   sonar <TaskClassif>
#> 2: 7cf4341432d6352e    spam <TaskClassif>

bmr$learners
#>        learner_hash          learner_id                     learner
#> 1: 3bbabd1058707305 classif.featureless <LearnerClassifFeatureless>
#> 2: fc402e71eadd46bb       classif.rpart       <LearnerClassifRpart>

# first 5 individual resamplings
head(as.data.table(bmr, measures = c("classif.acc", "classif.auc")), 5)
#>                                   uhash          task
#> 1: 6345bd59-e485-4a04-8f8c-c4ca540a71c3 <TaskClassif>
#> 2: 6345bd59-e485-4a04-8f8c-c4ca540a71c3 <TaskClassif>
#> 3: 6345bd59-e485-4a04-8f8c-c4ca540a71c3 <TaskClassif>
#> 4: 3957a766-7326-4c46-a257-0652584f7aef <TaskClassif>
#> 5: 3957a766-7326-4c46-a257-0652584f7aef <TaskClassif>
#>                        learner     resampling iteration prediction
#> 1: <LearnerClassifFeatureless> <ResamplingCV>         1     <list>
#> 2: <LearnerClassifFeatureless> <ResamplingCV>         2     <list>
#> 3: <LearnerClassifFeatureless> <ResamplingCV>         3     <list>
#> 4:       <LearnerClassifRpart> <ResamplingCV>         1     <list>
#> 5:       <LearnerClassifRpart> <ResamplingCV>         2     <list>

# aggregate results
bmr$aggregate()
#>    nr  resample_result task_id          learner_id resampling_id iters
#> 1:  1 <ResampleResult>   sonar classif.featureless            cv     3
#> 2:  2 <ResampleResult>   sonar       classif.rpart            cv     3
#> 3:  3 <ResampleResult>    spam classif.featureless            cv     3
#> 4:  4 <ResampleResult>    spam       classif.rpart            cv     3
#>    classif.ce
#> 1:  0.4660455
#> 2:  0.2739130
#> 3:  0.3940399
#> 4:  0.1086721

# aggregate results with hyperparameters as separate columns
mlr3misc::unnest(bmr$aggregate(params = TRUE), "params")
#>    nr  resample_result task_id          learner_id resampling_id iters
#> 1:  1 <ResampleResult>   sonar classif.featureless            cv     3
#> 2:  2 <ResampleResult>   sonar       classif.rpart            cv     3
#> 3:  3 <ResampleResult>    spam classif.featureless            cv     3
#> 4:  4 <ResampleResult>    spam       classif.rpart            cv     3
#>    classif.ce method xval
#> 1:  0.4660455   mode   NA
#> 2:  0.2739130   <NA>    0
#> 3:  0.3940399   mode   NA
#> 4:  0.1086721   <NA>    0

# extract resample result for classif.rpart
rr = bmr$aggregate()[learner_id == "classif.rpart", resample_result][[1]]
print(rr)
#> <ResampleResult> of 3 iterations
#> * Task: sonar
#> * Learner: classif.rpart
#> * Warnings: 0 in 0 iterations
#> * Errors: 0 in 0 iterations

# access the confusion matrix of the first resampling iteration
rr$predictions()[[1]]$confusion
#>         truth
#> response  M  R
#>        M 30 18
#>        R  3 19