This is the abstract base class for learner objects like LearnerClassif and LearnerRegr.

Learners consist of the following parts:

  • Methods train() and predict() which call train_internal() and predict_internal().

  • The fitted model, after calling train().

  • A paradox::ParamSet which stores meta-information about available hyperparameters, and also stores hyperparameter settings.

  • Meta-information about the requirements and capabilities of the learner.

Predefined learners are stored in the mlr3misc::Dictionary mlr_learners, e.g. classif.rpart or regr.rpart. A guide on how to extend mlr3 with custom learners can be found in the mlr3book.
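
As a minimal sketch of the dictionary lookup described above, the following retrieves a predefined learner by its key. It assumes the packages mlr3 and rpart are installed; the variable names learner and learner2 are purely illustrative.

```r
library(mlr3)

# Look up a predefined learner by its key in the dictionary.
learner = mlr_learners$get("classif.rpart")

# lrn() is a short form of the same lookup.
learner2 = lrn("classif.rpart")

# Both calls construct a learner with the same id.
print(learner$id)
print(learner2$id)

# List the keys of all learners currently registered in the dictionary.
print(mlr_learners$keys())
```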

Format

R6::R6Class object.

Construction

Note: This object is typically constructed via a derived class, e.g. LearnerClassif or LearnerRegr.

l = Learner$new(id, task_type, param_set = ParamSet$new(), predict_types = character(),
     feature_types = character(), properties = character(), packages = character())
  • id :: character(1)
    Identifier for the learner.

  • task_type :: character(1)
    Type of the task the learner can operate on, e.g. "classif" or "regr".

  • param_set :: paradox::ParamSet
    Set of hyperparameters.

  • predict_types :: character()
    Supported predict types. Must be a subset of mlr_reflections$learner_predict_types.

  • predict_sets :: character()
    Sets to predict on during resample()/benchmark(). Creates and stores a separate Prediction object for each set. The individual sets can be combined via getters in ResampleResult/BenchmarkResult, or Measures can be set to operate on subsets of the calculated prediction sets. Must be a non-empty subset of ("train", "test"). Default is "test".

  • feature_types :: character()
    Feature types the learner operates on. Must be a subset of mlr_reflections$task_feature_types.

  • properties :: character()
    Set of properties of the learner. Must be a subset of mlr_reflections$learner_properties. The following properties are currently standardized and understood by learners in mlr3:

    • "missings": The learner can handle missing values in the data.

    • "weights": The learner supports observation weights.

    • "importance": The learner supports extraction of importance scores, i.e. comes with an importance() extractor function (see section on optional extractors).

    • "selected_features": The learner supports extraction of the set of selected features, i.e. comes with a selected_features() extractor function (see section on optional extractors).

    • "oob_error": The learner supports extraction of the estimated out-of-bag error, i.e. comes with an oob_error() extractor function (see section on optional extractors).

  • data_formats :: character()
    Vector of supported data formats which can be processed during $train() and $predict(). Defaults to "data.table".

  • packages :: character()
    Set of required packages. Note that these packages will be loaded via requireNamespace(), and are not attached.

  • man :: character(1)
    String in the format [pkg]::[topic] pointing to a manual page for this object.
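
Several of the arguments above must be subsets of values stored in mlr_reflections. The following sketch, which assumes mlr3 is installed, shows one way to inspect the admissible values before writing a constructor call. It only reads the reflection object and does not construct a learner.

```r
library(mlr3)

# Predict types a learner may declare, grouped by task type.
print(mlr_reflections$learner_predict_types)

# Feature types a learner may declare support for.
print(mlr_reflections$task_feature_types)

# Properties a learner may declare, grouped by task type.
classif_properties = mlr_reflections$learner_properties$classif
print(classif_properties)

# Check a candidate property vector before passing it to the constructor.
candidate = c("missings", "weights")
print(all(candidate %in% classif_properties))
```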

Fields

  • id :: character(1)
    Identifier of the learner.

  • task_type :: character(1)
    Stores the type of task this learner can operate on, e.g. "classif" or "regr". A complete list of task types is stored in mlr_reflections$task_types.

  • param_set :: paradox::ParamSet
    Description of available hyperparameters and hyperparameter settings.

  • predict_types :: character()
    Stores the possible predict types the learner is capable of. A complete list of candidate predict types, grouped by task type, is stored in mlr_reflections$learner_predict_types.

  • predict_type :: character(1)
    Stores the currently selected predict type. Must be an element of l$predict_types.

  • feature_types :: character()
    Stores the feature types the learner can handle, e.g. "logical", "numeric", or "factor". A complete list of candidate feature types, grouped by task type, is stored in mlr_reflections$task_feature_types.

  • properties :: character()
    Stores a set of properties/capabilities the learner has. A complete list of candidate properties, grouped by task type, is stored in mlr_reflections$learner_properties.

  • packages :: character()
    Stores the names of required packages.

  • state :: NULL | named list()
    Current (internal) state of the learner. Contains all information learnt during train() and predict(). Do not access elements from here directly.

  • encapsulate :: named character()
    How to call the code in train_internal() and predict_internal(). Must be a named character vector with names "train" and "predict". Possible values are "none", "evaluate" and "callr". See mlr3misc::encapsulate() for more details.

  • fallback :: Learner
    Learner which is fitted to impute predictions if either the model fitting or the prediction of the top learner fails. Requires encapsulation to be enabled; otherwise errors are not caught and execution terminates before the fallback learner is used.

  • hash :: character(1)
    Hash (unique identifier) for this object.

  • model :: any
    The fitted model. Only available after $train() has been called.

  • timings :: numeric(2)
    Elapsed time in seconds for the steps "train" and "predict". Measured via mlr3misc::encapsulate().

  • log :: data.table::data.table()
    Returns the output (including warnings and errors) as a table with columns "stage" (train or predict), "class" (output, warning, error) and "msg" (character()).

  • warnings :: character()
    Returns the logged warnings as a character vector.

  • errors :: character()
    Returns the logged errors as a character vector.
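
The fields can be read directly from a learner object. The sketch below assumes mlr3 and rpart are installed and uses the predefined learner classif.rpart as an example; the variable name learner is illustrative.

```r
library(mlr3)

learner = lrn("classif.rpart")

# Read-only meta-information about the learner.
print(learner$task_type)
print(learner$predict_types)
print(learner$feature_types)
print(learner$properties)
print(learner$packages)

# The model is only available after training.
print(is.null(learner$model))

# Select a different predict type; it must be an element of predict_types.
learner$predict_type = "prob"
print(learner$predict_type)
```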

Methods

  • train(task, row_ids = NULL)
    (Task, integer() | character()) -> self
    Train the learner on the row ids of the provided Task. Mutates the learner by reference, i.e. stores the model alongside other objects in field $state.

  • predict(task, row_ids = NULL)
    (Task, integer() | character()) -> Prediction
    Uses the data stored during $train() in $state to create a new Prediction based on the provided row_ids of the task.

  • predict_newdata(newdata, task = NULL)
    (data.frame(), Task) -> Prediction
    Uses the model fitted during $train() to create a new Prediction based on the new data in newdata. Object task is the task used during $train() and is required for conversions of newdata. If the learner's $train() method has been called, there is a (size reduced) version of the training task stored in the learner. If the learner has been fitted via resample() or benchmark(), you need to pass the corresponding task stored in the ResampleResult or BenchmarkResult, respectively.

  • help()
    () -> NULL
    Opens the corresponding help page referenced by $man.
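
A typical sequence of these method calls might look as follows. This is a sketch assuming mlr3 and rpart are installed; the iris task and the simple split into train_ids and test_ids are chosen only for illustration.

```r
library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")

# Hold out every fifth row for prediction and train on the rest.
test_ids = seq(5, task$nrow, by = 5)
train_ids = setdiff(seq_len(task$nrow), test_ids)

# train() mutates the learner by reference and returns it.
learner$train(task, row_ids = train_ids)

# predict() creates a Prediction for the requested rows of the task.
prediction = learner$predict(task, row_ids = test_ids)
print(prediction)

# predict_newdata() works on a data.frame with the same columns as the task.
newdata = iris[1:3, ]
prediction_new = learner$predict_newdata(newdata)
print(prediction_new)
```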

Optional Extractors

Specific learner implementations are free to implement additional getters to ease the access of certain parts of the model in the inherited subclasses.

For the following operations, extractors are standardized:

  • importance(...): Returns the feature importance scores as a numeric vector. The higher the score, the more important the variable. The returned vector is named with feature names and sorted in decreasing order. Note that the model might omit features it has not used at all. The learner must be tagged with property "importance". To filter variables using the importance scores, use package mlr3filters.

  • selected_features(...): Returns a subset of selected features as character(). The learner must be tagged with property "selected_features".

  • oob_error(...): Returns the out-of-bag error of the model as numeric(1). The learner must be tagged with property "oob_error".
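
Because extractors are optional, it can be useful to check the learner's properties before calling them. The following sketch assumes mlr3 and rpart are installed and that classif.rpart is tagged with the properties "importance" and "selected_features"; the checks guard against learners where this is not the case.

```r
library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")
learner$train(task)

# Only call an extractor if the learner advertises the matching property.
if ("importance" %in% learner$properties) {
  imp = learner$importance()
  print(imp)
}

if ("selected_features" %in% learner$properties) {
  selected = learner$selected_features()
  print(selected)
}
```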

Setting Hyperparameters

All information about hyperparameters is stored in the slot param_set, which is a paradox::ParamSet. The printer gives an overview of the ids of available hyperparameters, their storage type, lower and upper bounds, possible levels (for factors), default values and assigned values. To set hyperparameters, assign a named list to the subslot values:

lrn = lrn("classif.rpart")
lrn$param_set$values = list(minsplit = 3, cp = 0.01)

Note that this operation replaces all previously set hyperparameter values. If you only intend to change one specific hyperparameter value and leave the others as-is, you can use the helper function mlr3misc::insert_named():

lrn$param_set$values = mlr3misc::insert_named(lrn$param_set$values, list(cp = 0.001))
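
The difference between replacing the whole list and updating a single value can be sketched as follows, assuming mlr3, mlr3misc and rpart are installed. The variable name learner is used here instead of lrn only to avoid masking the function lrn().

```r
library(mlr3)

learner = lrn("classif.rpart")
learner$param_set$values = list(minsplit = 3, cp = 0.01)

# Assigning a new list replaces all values: minsplit is no longer set.
learner$param_set$values = list(cp = 0.001)
minsplit_kept_after_replace = "minsplit" %in% names(learner$param_set$values)
print(minsplit_kept_after_replace)

# Restore both values, then update only cp via insert_named().
learner$param_set$values = list(minsplit = 3, cp = 0.01)
learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(cp = 0.001)
)
print(learner$param_set$values)
```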

If the learner has additional hyperparameters which are not encoded in the ParamSet, you can easily extend the learner. Here, we add a hyperparameter with id "foo" and possible levels "a" and "b":

lrn$param_set$add(paradox::ParamFct$new("foo", levels = c("a", "b")))

See also