This is the abstract base class for measures like MeasureClassif and MeasureRegr.

Measures are classes tailored around two functions:

  1. A function $score() which quantifies the performance by comparing true and predicted response.

  2. A function $aggregator() which combines multiple performance scores returned by $score() to a single numeric value.

In addition to these two functions, meta-information about the performance measure is stored.

Predefined measures are stored in the dictionary mlr_measures, e.g. classif.auc or time_train. Many of the measures in mlr3 are implemented in mlr3measures as ordinary functions.
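
As a brief illustration of the dictionary lookup described above, the following sketch retrieves a predefined measure by key and inspects some of its meta-information. It assumes that mlr3 is installed and uses the msr() shortcut, which is not mentioned in this page but is, to our knowledge, the usual sugar function for mlr_measures$get().

```r
library(mlr3)

# Show the keys of a few predefined measures.
head(mlr_measures$keys())

# Retrieve a measure by key, either from the dictionary or via msr().
m1 = mlr_measures$get("classif.auc")
m2 = msr("classif.auc")

# Inspect some of the stored meta-information.
m1$id
m1$minimize
m1$range
m1$predict_type
```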

A guide on how to extend mlr3 with custom measures can be found in the mlr3book.

Public fields

id

(character(1))
Identifier of the object. Used in tables, plot and text output.

task_type

(character(1))
Task type, e.g. "classif" or "regr". For a complete list of possible task types (depending on the loaded packages), see mlr_reflections$task_types$type.

predict_type

(character(1))
Required predict type of the Learner.

predict_sets

(character())
During resample()/benchmark(), a Learner can predict on multiple sets. By default, a learner only predicts observations in the test set (predict_sets == "test"). To change this behaviour, set predict_sets to a non-empty subset of {"train", "test"}. Each set yields a separate Prediction object. These can be combined via getters in ResampleResult/BenchmarkResult, or Measures can be altered to operate on specific subsets of the calculated prediction sets.
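
The interplay between the predict sets of a Learner and those of a Measure might look as follows. This is a sketch under assumptions: it relies on msr() and lrn() accepting field values such as predict_sets and id as extra arguments, and it uses the featureless classifier and the "sonar" task purely as placeholders.

```r
library(mlr3)

task = tsk("sonar")

# Ask the learner to predict on both the training and the test set.
learner = lrn("classif.featureless", predict_sets = c("train", "test"))
rr = resample(task, learner, rsmp("holdout"))

# Two copies of the same measure, each operating on a different predict set.
# The ids are changed so that the two results can be told apart.
measures = list(
  msr("classif.ce", predict_sets = "train", id = "ce_train"),
  msr("classif.ce", predict_sets = "test", id = "ce_test")
)
perf = rr$aggregate(measures)
perf
```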

average

(character(1))
Method for aggregation:

  • "micro": All predictions from multiple resampling iterations are first combined into a single Prediction object. Next, the scoring function of the measure is applied on this combined object, yielding a single numeric score.

  • "macro": The scoring function is applied on the Prediction object of each resampling iteration, each yielding a single numeric score. Next, the scores are combined with the aggregator function to a single numerical score.
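
The difference between the two averaging methods can be sketched in base R, without any mlr3 objects. The fold data and the accuracy() helper below are hypothetical and only serve to show that pooling predictions first ("micro") and averaging per-fold scores ("macro") can give different results when the folds differ in size.

```r
# Hypothetical truth/response pairs for two resampling iterations.
folds = list(
  list(truth = c("a", "a", "b", "b"), response = c("a", "b", "b", "b")),
  list(truth = c("a", "b"), response = c("b", "b"))
)

# Hypothetical scoring function: proportion of correct predictions.
accuracy = function(truth, response) mean(truth == response)

# "macro": score each iteration separately, then aggregate the scores.
per_fold = vapply(folds, function(f) accuracy(f$truth, f$response), numeric(1))
macro = mean(per_fold)

# "micro": pool all predictions first, then score once.
truth = unlist(lapply(folds, `[[`, "truth"))
response = unlist(lapply(folds, `[[`, "response"))
micro = accuracy(truth, response)

c(macro = macro, micro = micro)
```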

aggregator

(function())
Function to aggregate scores computed on different resampling iterations.

task_properties

(character())
Required properties of the Task.

range

(numeric(2))
Lower and upper bound of possible performance scores.

properties

(character())
Properties of this measure.

minimize

(logical(1))
If TRUE, good predictions correspond to small values of performance scores.

packages

(character())
Set of required packages. These packages are loaded, but not attached.

man

(character(1))
String in the format [pkg]::[topic] pointing to a manual page for this object. Defaults to NA, but can be set by child classes.

Active bindings

hash

(character(1))
Hash (unique identifier) for this object.

Methods

Public methods


Method new()

Creates a new instance of this R6 class.

Note that this object is typically constructed via a derived class, e.g. MeasureClassif or MeasureRegr.

Usage

Measure$new(
  id,
  task_type = NA,
  range = c(-Inf, Inf),
  minimize = NA,
  average = "macro",
  aggregator = NULL,
  properties = character(),
  predict_type = "response",
  predict_sets = "test",
  task_properties = character(),
  packages = character(),
  man = NA_character_
)

Arguments

id

(character(1))
Identifier for the new instance.

task_type

(character(1))
Type of task, e.g. "regr" or "classif". Must be an element of mlr_reflections$task_types$type.

range

(numeric(2))
Feasible range for this measure as c(lower_bound, upper_bound). Both bounds may be infinite.

minimize

(logical(1))
Set to TRUE if good predictions correspond to small values, and to FALSE if good predictions correspond to large values. If set to NA (default), tuning this measure is not possible.

average

(character(1))
How to average multiple Predictions from a ResampleResult. The default, "macro", calculates the individual performance scores for each Prediction and then uses the function defined in $aggregator to average them to a single number. If set to "micro", the individual Prediction objects are first combined into a single new Prediction object which is then used to assess the performance. The function in $aggregator is not used in this case.

aggregator

(function(x))
Function to aggregate individual performance scores x where x is a numeric vector. If NULL, defaults to mean().
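
A minimal base R sketch of what an aggregator looks like and how the documented NULL default could be resolved. The helper aggregate_scores() and the score values are hypothetical and are not part of mlr3.

```r
# Hypothetical per-iteration scores, one of them an outlier.
scores = c(0.80, 0.82, 0.79, 0.30)

# An aggregator takes a numeric vector and returns a single number.
agg_median = function(x) median(x)

# Hypothetical helper mimicking the documented default:
# fall back to mean() if no aggregator is supplied.
aggregate_scores = function(x, aggregator = NULL) {
  if (is.null(aggregator)) aggregator = mean
  aggregator(x)
}

res_mean = aggregate_scores(scores)
res_median = aggregate_scores(scores, agg_median)
```

A robust aggregator such as the median is less affected by a single poor iteration than the default mean.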

properties

(character())
Properties of the measure. Must be a subset of mlr_reflections$measure_properties. Supported by mlr3:

  • "requires_task" (requires the complete Task),

  • "requires_learner" (requires the trained Learner),

  • "requires_train_set" (requires the training indices from the Resampling), and

  • "na_score" (the measure is expected to occasionally return NA or NaN).

predict_type

(character(1))
Required predict type of the Learner. Possible values are stored in mlr_reflections$learner_predict_types.

predict_sets

(character())
Prediction sets to operate on, used in aggregate() to extract the matching predict_sets from the ResampleResult. Multiple predict sets are calculated by the respective Learner during resample()/benchmark(). Must be a non-empty subset of {"train", "test"}. If multiple sets are provided, these are first combined to a single prediction object. Default is "test".

task_properties

(character())
Required task properties, see Task.

packages

(character())
Set of required packages. The constructor signals a warning if at least one of the packages is not installed. The packages are loaded (not attached) later on demand via requireNamespace().

man

(character(1))
String in the format [pkg]::[topic] pointing to a manual page for this object. The referenced help page can be opened via method $help().
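
The constructor arguments above are usually supplied from within a derived class. The following sketch defines a custom classification measure; it assumes the pattern described in the mlr3book, where the scoring logic lives in a private .score() method, and the class name MeasureMyAcc and the id "classif.my_acc" are made up for illustration.

```r
library(mlr3)
library(R6)

MeasureMyAcc = R6Class("MeasureMyAcc",
  inherit = MeasureClassif,
  public = list(
    initialize = function() {
      super$initialize(
        id = "classif.my_acc",      # hypothetical identifier
        range = c(0, 1),            # accuracy lies between 0 and 1
        minimize = FALSE,           # larger values are better
        predict_type = "response"   # operates on predicted labels
      )
    }
  ),
  private = list(
    # Assumed hook called by $score(): proportion of correct predictions.
    .score = function(prediction, ...) {
      mean(prediction$truth == prediction$response)
    }
  )
)

m = MeasureMyAcc$new()
m$id
m$task_type
```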


Method format()

Helper for print outputs.

Usage

Measure$format()


Method print()

Printer.

Usage

Measure$print()

Arguments

...

(ignored).


Method help()

Opens the corresponding help page referenced by field $man.

Usage

Measure$help()


Method score()

Takes a Prediction (or a list of Prediction objects named with valid predict_sets) and calculates a numeric score. If the measure is flagged with the properties "requires_task", "requires_learner" or "requires_train_set", you must additionally pass the respective Task, the trained Learner or the training set indices. This is handled internally during resample()/benchmark().

Usage

Measure$score(prediction, task = NULL, learner = NULL, train_set = NULL)

Arguments

prediction

(Prediction | named list of Prediction).

task

(Task).

learner

(Learner).

train_set

(integer()).

Returns

numeric(1).
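
A short usage sketch for $score() outside of resample()/benchmark(). The task, learner and measure are arbitrary choices for illustration; the example assumes mlr3 is installed and predicts on the training data only to keep the code short.

```r
library(mlr3)

task = tsk("sonar")
learner = lrn("classif.featureless")

# Train and predict manually to obtain a Prediction object.
learner$train(task)
prediction = learner$predict(task)

# Score the prediction with a measure that needs neither task nor learner.
measure = msr("classif.acc")
acc = measure$score(prediction)
acc
```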


Method aggregate()

Aggregates multiple performance scores into a single score using the aggregator function of the measure. Operates on the Predictions of ResampleResult with matching predict_sets.

Usage

Measure$aggregate(rr)

Arguments

rr

(ResampleResult).

Returns

numeric(1).
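
A usage sketch for $aggregate() on a ResampleResult. It assumes mlr3 is installed and that "classif.acc" uses the default "macro" averaging, in which case the aggregated value should coincide with the mean of the per-iteration scores.

```r
library(mlr3)

task = tsk("sonar")
learner = lrn("classif.featureless")
rr = resample(task, learner, rsmp("cv", folds = 3))

measure = msr("classif.acc")

# Single aggregated score over all resampling iterations.
agg = measure$aggregate(rr)

# Per-iteration scores, for comparison with the aggregated value.
per_fold = rr$score(measure)
mean(per_fold$classif.acc)
agg
```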