Extract probability for a thresholded model #981

tpoisot · 2022-11-04T18:02:30Z

When calling predict on a BinaryThresholdPredictor, is there a specific incantation that can give the probability of the positive class? Is there a way to "unwrap" the underlying model?

ablaom · 2022-11-06T21:48:05Z

No, I think if you want to get probabilistic predictions for the original (atomic) model, then you need to train the atomic model on the data. You can't eat your cake and have it too.
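For illustration, here is a minimal sketch of what training the atomic model separately might look like. It assumes MLJ and the DecisionTree interface are installed, and that the positive class is the second level of `y`. The variable names are made up for the example.

```julia
using MLJ

X, y = make_moons(100; rng=123)
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0

atom = DecisionTreeClassifier()

# Wrapped model: `predict` returns thresholded class labels only.
thresholded = BinaryThresholdPredictor(atom; threshold=0.7)
mach_thresholded = fit!(machine(thresholded, X, y), verbosity=0)
yhat_class = predict(mach_thresholded, X)

# Atomic model trained separately on the same data: `predict` is probabilistic.
mach_atom = fit!(machine(atom, X, y), verbosity=0)
yhat_prob = predict(mach_atom, X)

# Probability of the positive class (assumed here to be the second level).
p_positive = pdf.(yhat_prob, levels(y)[2])
```

The obvious cost is that the atomic model is trained twice, once inside the wrapper and once on its own.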

I don't know your use-case, but I suppose you could roll your own wrapper that preserves the behaviour of predict (keeps it probabilistic) but overloads predict_mode to do the thresholding. You can specify predict_mode as the operation in things like evaluate! and TunedModel.

using MLJ
import CategoricalDistributions

mutable struct Thresholder <: ProbabilisticComposite
    model
    threshold::Float64
end

# to do the thresholding:
function deterministic(y_probabilistic, threshold)
    classes = CategoricalDistributions.classes(y_probabilistic)
    map(pdf.(y_probabilistic, classes[2]) .> threshold) do above
        above ? classes[2] : classes[1]
    end
end

function MLJ.fit(thresholder::Thresholder, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    mach = machine(thresholder.model, Xs, ys)
    y_probabilistic = predict(mach, Xs)
    y_deterministic = node(y -> deterministic(y, thresholder.threshold), y_probabilistic)

    network_machine = machine(Probabilistic(), Xs, ys;
                              predict=y_probabilistic,
                              predict_mode=y_deterministic)

    return!(network_machine, thresholder, verbosity)
end

X, y = make_moons()
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
thresholder = Thresholder(DecisionTreeClassifier(), 0.7)

mach = machine(thresholder, X, y) |> fit!
predict(mach, X)      # probabilistic predictions
predict_mode(mach, X) # thresholded predictions

julia> evaluate!(
       mach, 
       operation=[predict_mode, predict], 
       measure=[accuracy, log_loss],
       resampling=CV()
       )
Evaluating over 6 folds: 100%[=========================] Time: 0:00:00
PerformanceEvaluation object with these fields:
  measure, operation, measurement, per_fold,
  per_observation, fitted_params_per_fold,
  report_per_fold, train_test_rows
Extract:
┌────────────────────────────────┬──────────────┬─────────────┬─────────┬───────────────────
│ measure                        │ operation    │ measurement │ 1.96*SE │ per_fold         ⋯
├────────────────────────────────┼──────────────┼─────────────┼─────────┼───────────────────
│ Accuracy()                     │ predict_mode │ 0.967       │ 0.0345  │ [0.96, 0.92, 0.9 ⋯
│ LogLoss(                       │ predict      │ 1.2         │ 1.24    │ [1.44, 2.88, 2.8 ⋯
│   tol = 2.220446049250313e-16) │              │             │         │                  ⋯
└────────────────────────────────┴──────────────┴─────────────┴─────────┴───────────────────
                                                                            1 column omitted

ablaom · 2022-11-06T21:53:46Z

(After JuliaAI/MLJBase.jl#853, the preferred way of exporting the learning network is to replace the MLJ.fit definition above with the simpler

function MLJ.prefit(thresholder::Thresholder, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    mach = machine(:model, Xs, ys)
    y_probabilistic = predict(mach, Xs)
    y_deterministic = node(y -> deterministic(y, thresholder.threshold), y_probabilistic)

    (predict=y_probabilistic, predict_mode=y_deterministic)
end

and change the subtyping Thresholder <: ProbabilisticComposite to Thresholder <: ProbabilisticNetworkComposite.)

tpoisot · 2022-11-07T01:30:59Z

> No, I think if you want to get probabilistic predictions for the original (atomic) model, then you need to train the atomic model on the data. You can't eat your cake and have it too.

In the specific case I used to explore this, the atomic model was wrapped into a BinaryThresholdPredictor, and then into a TunedModel to optimize the threshold -- so I think (if I understand correctly what TunedModel and BinaryThresholdPredictor do from the documentation) that the atomic model is trained.

ablaom · 2022-11-07T01:57:51Z

Yes, it is trained, but the public API doesn't give you a way to produce probabilistic predictions. It is possible to get them without retraining, but it's a hack (non-public API).

tpoisot · 2022-11-07T02:09:59Z

Got it, thanks!!
