Measure to compare true observed labels with predicted labels in multiclass classification tasks.
Details
In the binary case, the Matthews Correlation Coefficient is defined as $$ \frac{\mathrm{TP} \cdot \mathrm{TN} - \mathrm{FP} \cdot \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP}) (\mathrm{TP} + \mathrm{FN}) (\mathrm{TN} + \mathrm{FP}) (\mathrm{TN} + \mathrm{FN})}}, $$ where \(TP\), \(FP\), \(TN\), \(FN\) are the number of true positives, false positives, true negatives, and false negatives respectively.
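The binary formula can be checked by hand with a few lines of base R. This is only an illustrative sketch: the helper `mcc_binary()` and the counts are hypothetical and are not part of any package. It also applies the fallback described further down in this section, where a zero denominator is replaced by 1.

```r
# Hypothetical helper: binary MCC from the four confusion counts.
mcc_binary <- function(tp, fp, tn, fn) {
  num <- tp * tn - fp * fn
  den <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  # Undefined case: at least one of the four sums is 0, so use 1 instead.
  if (den == 0) den <- 1
  num / den
}

# Made-up counts for illustration only.
mcc_binary(tp = 40, fp = 10, tn = 45, fn = 5)
```

With these counts the numerator is 40 * 45 - 10 * 5 = 1750 and the denominator is sqrt(50 * 45 * 55 * 50), which gives a value of roughly 0.70.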
In the multi-class case, the Matthews Correlation Coefficient is defined for a multi-class confusion matrix \(C\) with \(K\) classes: $$ \frac{c \cdot s - \sum_k^K p_k \cdot t_k}{\sqrt{(s^2 - \sum_k^K p_k^2) \cdot (s^2 - \sum_k^K t_k^2)}}, $$ where
\(s = \sum_i^K \sum_j^K C_{ij}\): total number of samples
\(c = \sum_k^K C_{kk}\): total number of correctly predicted samples
\(t_k = \sum_i^K C_{ik}\): number of predictions for each class \(k\)
\(p_k = \sum_j^K C_{kj}\): number of true occurrences for each class \(k\).
The above formula is undefined if any of the four sums in the denominator is 0 in the binary case, and more generally if either \(s^2 - \sum_k^K p_k^2\) or \(s^2 - \sum_k^K t_k^2\) is equal to 0. The denominator is then set to 1.
When there are more than two classes, the MCC will no longer range between -1 and +1. Instead, the minimum value will be between -1 and 0 depending on the true distribution. The maximum value is always +1.
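The multi-class definition can be traced step by step in base R. The sketch below is a direct transcription of the formula above, not the implementation used by mlr3measures; the function name `mcc_multiclass()` and the confusion matrix are hypothetical. Because the formula is symmetric in \(t_k\) and \(p_k\), the result does not depend on whether rows or columns hold the true labels.

```r
# Hypothetical helper: multi-class MCC from a K x K confusion matrix C.
mcc_multiclass <- function(C) {
  s <- sum(C)              # total number of samples
  n_correct <- sum(diag(C)) # c in the formula: correctly predicted samples
  t_k <- colSums(C)        # t_k = sum_i C[i, k]
  p_k <- rowSums(C)        # p_k = sum_j C[k, j]
  num <- n_correct * s - sum(p_k * t_k)
  den <- sqrt((s^2 - sum(p_k^2)) * (s^2 - sum(t_k^2)))
  # Undefined case: one of the two factors is 0, so use 1 instead.
  if (den == 0) den <- 1
  num / den
}

# Made-up 3-class confusion matrix for illustration only.
C <- matrix(c(5, 1, 0,
              2, 6, 1,
              0, 1, 4), nrow = 3, byrow = TRUE)
mcc_multiclass(C)
```

For this matrix, s = 20 and c = 15, the sum of \(p_k \cdot t_k\) is 139, and the two factors under the square root are 258 and 262, so the result is roughly 0.62. A purely diagonal matrix yields exactly 1, and a 2 x 2 matrix reproduces the binary formula.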
Note
The score function calls mlr3measures::mcc() from package mlr3measures.
If the measure is undefined for the input, NaN is returned.
This can be customized by setting the field na_value.
Dictionary
This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():
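A minimal sketch of both routes, assuming the measure is registered under the key `"classif.mcc"` (inferred from the naming of the entries listed under "See also") and that the mlr3 package is installed:

```r
library(mlr3)

# Retrieve the measure from the dictionary ...
measure <- mlr_measures$get("classif.mcc")

# ... or with the sugar function.
measure <- msr("classif.mcc")
```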
See also
Dictionary of Measures: mlr_measures
as.data.table(mlr_measures) for a complete table of all (also dynamically created) Measure implementations.
Other classification measures:
mlr_measures_classif.acc,
mlr_measures_classif.auc,
mlr_measures_classif.bacc,
mlr_measures_classif.bbrier,
mlr_measures_classif.ce,
mlr_measures_classif.costs,
mlr_measures_classif.dor,
mlr_measures_classif.fbeta,
mlr_measures_classif.fdr,
mlr_measures_classif.fn,
mlr_measures_classif.fnr,
mlr_measures_classif.fomr,
mlr_measures_classif.fp,
mlr_measures_classif.fpr,
mlr_measures_classif.logloss,
mlr_measures_classif.mauc_au1p,
mlr_measures_classif.mauc_au1u,
mlr_measures_classif.mauc_aunp,
mlr_measures_classif.mauc_aunu,
mlr_measures_classif.mauc_mu,
mlr_measures_classif.mbrier,
mlr_measures_classif.npv,
mlr_measures_classif.ppv,
mlr_measures_classif.prauc,
mlr_measures_classif.precision,
mlr_measures_classif.recall,
mlr_measures_classif.sensitivity,
mlr_measures_classif.specificity,
mlr_measures_classif.tn,
mlr_measures_classif.tnr,
mlr_measures_classif.tp,
mlr_measures_classif.tpr
Other multiclass classification measures:
mlr_measures_classif.acc,
mlr_measures_classif.bacc,
mlr_measures_classif.ce,
mlr_measures_classif.costs,
mlr_measures_classif.logloss,
mlr_measures_classif.mauc_au1p,
mlr_measures_classif.mauc_au1u,
mlr_measures_classif.mauc_aunp,
mlr_measures_classif.mauc_aunu,
mlr_measures_classif.mauc_mu,
mlr_measures_classif.mbrier