This measure specializes Measure for measures quantifying the similarity of
sets of selected features.
To calculate similarity measures, the Learner must have the property "selected_features".
task_type is set to NA_character_.
average is set to "custom".
Predefined measures can be found in the dictionary mlr_measures, prefixed with "sim.".
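The statements above can be checked interactively. The following is a minimal sketch, assuming mlr3 is installed and that the measure "sim.jaccard" (used in the Examples section) and the learner "classif.rpart" are available in the running session; the variable names are illustrative only.

```r
library(mlr3)

# A learner can only be used with similarity measures if it reports
# the "selected_features" property.
learner = lrn("classif.rpart")
has_property = "selected_features" %in% learner$properties
print(has_property)

# List the keys of all predefined similarity measures (prefix "sim.").
sim_keys = mlr_measures$keys("^sim\\.")
print(sim_keys)

# Inspect the fields that this class fixes for its measures.
measure = msr("sim.jaccard")
print(measure$task_type)
print(measure$average)
```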
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-eval
Package mlr3measures for the scoring functions.
Dictionary of Measures: mlr_measures
as.data.table(mlr_measures) for a table of available Measures in the running session (depending on the loaded packages).
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
Other Measure: Measure, MeasureClassif, MeasureRegr, mlr_measures, mlr_measures_aic, mlr_measures_bic, mlr_measures_classif.costs, mlr_measures_debug_classif, mlr_measures_elapsed_time, mlr_measures_internal_valid_score, mlr_measures_oob_error, mlr_measures_regr.rsq, mlr_measures_selected_features
Super class
mlr3::Measure -> MeasureSimilarity
Methods
Method new()
Creates a new instance of this R6 class.
Usage
MeasureSimilarity$new(
  id,
  param_set = ps(),
  range,
  minimize = NA,
  average = "macro",
  aggregator = NULL,
  properties = character(),
  predict_type = NA_character_,
  predict_sets = "test",
  task_properties = character(),
  packages = character(),
  label = NA_character_,
  man = NA_character_
)
Arguments
id
(character(1))
Identifier for the new instance.
param_set
(paradox::ParamSet)
Set of hyperparameters.
range
(numeric(2))
Feasible range for this measure as c(lower_bound, upper_bound). Both bounds may be infinite.
minimize
(logical(1))
Set to TRUE if good predictions correspond to small values, and to FALSE if good predictions correspond to large values. If set to NA (default), tuning this measure is not possible.
average
(character(1))
How to average multiple Predictions from a ResampleResult.
The default, "macro", calculates the individual performance scores for each Prediction and then uses the function defined in $aggregator to average them to a single number.
If set to "micro", the individual Prediction objects are first combined into a single new Prediction object which is then used to assess the performance. The function in $aggregator is not used in this case.
aggregator
(function())
Function to aggregate over multiple iterations. The role of this function depends on the value of field "average":
"macro": A numeric vector of scores (one per iteration) is passed. The aggregate function defaults to mean() in this case.
"micro": The aggregator function is not used. Instead, predictions from multiple iterations are first combined and then scored in one go.
"custom": A ResampleResult is passed to the aggregate function.
properties
(character())
Properties of the measure. Must be a subset of mlr_reflections$measure_properties. Supported by mlr3:
"requires_task" (requires the complete Task),
"requires_learner" (requires the trained Learner),
"requires_model" (requires the trained Learner, including the fitted model),
"requires_train_set" (requires the training indices from the Resampling),
"na_score" (the measure is expected to occasionally return NA or NaN),
"primary_iters" (the measure explicitly handles resamplings that only use a subset of their iterations for the point estimate), and
"requires_no_prediction" (no prediction is required; this usually means that the measure extracts some information from the learner state).
predict_type
(character(1))
Required predict type of the Learner. Possible values are stored in mlr_reflections$learner_predict_types.
predict_sets
(character())
Prediction sets to operate on, used in aggregate() to extract the matching predict_sets from the ResampleResult. Multiple predict sets are calculated by the respective Learner during resample()/benchmark(). Must be a non-empty subset of {"train", "test", "internal_valid"}. If multiple sets are provided, these are first combined to a single prediction object. Default is "test".
task_properties
(character())
Required task properties, see Task.
packages
(character())
Set of required packages. The constructor signals a warning if at least one of the packages is not installed. The packages are loaded (not attached) later on demand via requireNamespace().
label
(character(1))
Label for the new instance.
man
(character(1))
String in the format [pkg]::[topic] pointing to a manual page for this object. The referenced help page can be opened via method $help().
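Because average is fixed to "custom", the aggregator of a similarity measure works on the whole ResampleResult rather than on per-iteration scores, so it can compare the feature sets selected in different iterations. The following base R sketch illustrates the kind of computation such an aggregator might perform: a mean pairwise Jaccard index over feature sets. It is an illustration under stated assumptions, not the implementation used by mlr3 or mlr3measures. The helper names jaccard_pair() and mean_pairwise_jaccard(), the feature names, and the convention for two empty sets are all hypothetical.

```r
# Jaccard index of two character vectors treated as sets: |A n B| / |A u B|.
jaccard_pair = function(a, b) {
  union_size = length(union(a, b))
  # Assumed convention for this sketch: two empty sets count as identical.
  if (union_size == 0L) {
    return(1)
  }
  length(intersect(a, b)) / union_size
}

# Mean Jaccard index over all unordered pairs of feature sets.
mean_pairwise_jaccard = function(sets) {
  stopifnot(is.list(sets), length(sets) >= 2L)
  # Each column of `pairs` holds the indices of one pair of sets.
  pairs = combn(length(sets), 2L)
  scores = apply(pairs, 2L, function(idx) {
    jaccard_pair(sets[[idx[1L]]], sets[[idx[2L]]])
  })
  mean(scores)
}

# Hypothetical feature sets, one per resampling iteration.
selected = list(
  c("bill_length", "flipper_length"),
  c("bill_length", "island"),
  c("bill_length", "flipper_length")
)
score = mean_pairwise_jaccard(selected)
print(score)
```

In an actual aggregator, the list of feature sets would have to be extracted from the ResampleResult first, for example from the stored trained learners, which is why the example in the Examples section runs benchmark() with store_models = TRUE.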
Examples
task = tsk("penguins")
learners = list(
  lrn("classif.rpart", maxdepth = 1, id = "r1"),
  lrn("classif.rpart", maxdepth = 2, id = "r2")
)
resampling = rsmp("cv", folds = 3)
grid = benchmark_grid(task, learners, resampling)
bmr = benchmark(grid, store_models = TRUE)
bmr$aggregate(msrs(c("classif.ce", "sim.jaccard")))
#>       nr  task_id learner_id resampling_id iters classif.ce sim.jaccard
#>    <int>   <char>     <char>        <char> <int>      <num>       <num>
#> 1:     1 penguins         r1            cv     3 0.22400203   0.3333333
#> 2:     2 penguins         r2            cv     3 0.07271803   0.4166667
#> Hidden columns: resample_result