Commit 7dd4f4e

Merge pull request #955 from alan-turing-institute/dev
For a 0.18.4 release
ablaom committed Jul 14, 2022
2 parents 8f5f85f + cbd8550 commit 7dd4f4e
Showing 4 changed files with 68 additions and 20 deletions.
4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "MLJ"
uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
authors = ["Anthony D. Blaom <anthony.blaom@gmail.com>"]
version = "0.18.3"
version = "0.18.4"

[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
@@ -27,7 +27,7 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
CategoricalArrays = "0.8,0.9, 0.10"
ComputationalResources = "0.3"
Distributions = "0.21,0.22,0.23, 0.24, 0.25"
MLJBase = "0.20"
MLJBase = "0.20.9"
MLJEnsembles = "0.3"
MLJIteration = "0.5"
MLJModels = "0.15.5"
39 changes: 30 additions & 9 deletions docs/src/adding_models_for_general_use.md
@@ -722,6 +722,23 @@ If a new model type subtypes `JointProbabilistic <: Probabilistic` then
implementation of `predict_joint` is compulsory.


### Training losses

```@docs
MLJModelInterface.training_losses
```

Trait values can also be set using the `metadata_model` method, see below.
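
For illustration only, here is a minimal sketch of how an implementation might overload this accessor and declare the associated trait. The model type `MyIterativeModel` and the `training_losses` field of the report are hypothetical, standing in for whatever the actual implementation records during training:

```julia
import MLJModelInterface as MMI

# Hypothetical iterative model, used only to illustrate the pattern:
mutable struct MyIterativeModel <: MMI.Deterministic
    n_iter::Int
end

# Suppose the `fit` implementation returns a report of the form
# `(training_losses = losses, ...)`. The accessor receives the model and that
# report, and returns the losses in historical order (most recent last):
MMI.training_losses(::MyIterativeModel, report) = report.training_losses

# Declare the trait so MLJ knows the accessor is implemented:
MMI.supports_training_losses(::Type{<:MyIterativeModel}) = true
```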

### Feature importances

```@docs
MLJModelInterface.feature_importances
```

Trait values can also be set using the `metadata_model` method, see below.
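
A corresponding sketch for feature importances, continuing with the hypothetical `MyIterativeModel` above; the `report.features` and `fitresult.importances` fields are likewise made up for the example:

```julia
import MLJModelInterface as MMI

# The accessor receives the model, the `fitresult` and the report returned by
# `fit`, and must return a vector of `feature => importance` pairs, with each
# feature given as a `Symbol`:
function MMI.feature_importances(::MyIterativeModel, fitresult, report)
    return [f => imp for (f, imp) in zip(report.features, fitresult.importances)]
end

# Declare the corresponding trait:
MMI.reports_feature_importances(::Type{<:MyIterativeModel}) = true
```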


### Trait declarations

Two trait functions allow the implementer to restrict the types of
@@ -802,15 +819,19 @@ Additional trait functions tell MLJ's `@load` macro how to find your
model if it is registered, and provide other self-explanatory metadata
about the model:

method | return type | declarable return values | fallback value
-------------------------|-------------------|------------------------------------|---------------
`load_path` | `String` | unrestricted | "unknown"
`package_name` | `String` | unrestricted | "unknown"
`package_uuid` | `String` | unrestricted | "unknown"
`package_url` | `String` | unrestricted | "unknown"
`package_license` | `String` | unrestricted | "unknown"
`is_pure_julia` | `Bool` | `true` or `false` | `false`
`supports_weights` | `Bool` | `true` or `false` | `false`
method | return type | declarable return values | fallback value
-----------------------------|-------------------|------------------------------------|---------------
`load_path` | `String` | unrestricted | "unknown"
`package_name` | `String` | unrestricted | "unknown"
`package_uuid` | `String` | unrestricted | "unknown"
`package_url` | `String` | unrestricted | "unknown"
`package_license` | `String` | unrestricted | "unknown"
`is_pure_julia` | `Bool` | `true` or `false` | `false`
`supports_weights` | `Bool` | `true` or `false` | `false`
`supports_class_weights` | `Bool` | `true` or `false` | `false`
`supports_training_losses` | `Bool` | `true` or `false` | `false`
`reports_feature_importances`| `Bool` | `true` or `false` | `false`
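
As an illustration of how the registry-related traits in this table are typically declared, here is a hedged sketch for the hypothetical `MyIterativeModel` used earlier; the package name, UUID, URL and license values are placeholders, not real registry metadata. The actual declarations for `DecisionTreeClassifier` are listed next.

```julia
import MLJModelInterface as MMI

# Placeholder registry metadata for the hypothetical model defined earlier:
MMI.load_path(::Type{<:MyIterativeModel})       = "MyPackage.MyIterativeModel"
MMI.package_name(::Type{<:MyIterativeModel})    = "MyPackage"
MMI.package_uuid(::Type{<:MyIterativeModel})    = "00000000-0000-0000-0000-000000000000"
MMI.package_url(::Type{<:MyIterativeModel})     = "https://github.com/user/MyPackage.jl"
MMI.package_license(::Type{<:MyIterativeModel}) = "MIT"
MMI.is_pure_julia(::Type{<:MyIterativeModel})   = true
MMI.supports_weights(::Type{<:MyIterativeModel}) = false
```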


Here is the complete list of trait function declarations for
`DecisionTreeClassifier`, whose core algorithms are provided by
43 changes: 35 additions & 8 deletions docs/src/machines.md
@@ -9,7 +9,7 @@ the machine's internal state (as recorded in private fields
`old_model` and `old_rows`). These lower-level `fit` and `update`
methods, which are not ordinarily called directly by the user,
dispatch on the model and a view of the data defined by the optional
`rows` keyword argument of `fit!` (all rows by default).
`rows` keyword argument of `fit!` (all rows by default).

# Warm restarts

@@ -75,10 +75,10 @@ criterion. See [Controlling Iterative Models](@ref) for details.

## Inspecting machines

There are two methods for inspecting the outcomes of training in
MLJ. To obtain a named-tuple describing the learned parameters (in a
user-friendly way where possible) use `fitted_params(mach)`. All other
training-related outcomes are inspected with `report(mach)`.
There are two principal methods for inspecting the outcomes of training in MLJ. To obtain a
named-tuple describing the learned parameters (in a user-friendly way where possible) use
`fitted_params(mach)`. All other training-related outcomes are inspected with
`report(mach)`.

```@example machines
X, y = @load_iris
@@ -97,6 +97,32 @@ fitted_params
report
```

### Training losses and feature importances

Training losses and feature importances, if reported by a model, will be available in the
machine's report (see above). However, there are also direct access methods where
supported:

```julia
training_losses(mach::Machine) -> vector_of_losses
```

Here `vector_of_losses` will be in historical order (most recent loss last). This kind of
access is supported for `model = mach.model` if `supports_training_losses(model) == true`.
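
As a hedged usage sketch, with `model`, `X` and `y` standing in for an iterative model that declares this trait and for compatible data:

```julia
using MLJ

# Placeholders: `model` is any iterative model with
# `supports_training_losses(model) == true`; `X`, `y` are data it can train on.
mach = machine(model, X, y)
fit!(mach)

losses = training_losses(mach)  # losses in historical order
last(losses)                    # loss recorded at the most recent iteration
```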

```julia
feature_importances(mach::Machine) -> vector_of_pairs
```

Here a `vector_of_pairs` is a vector of elements of the form `feature => importance_value`,
where `feature` is a symbol. For example, `vector_of_pairs = [:gender => 0.23, :height =>
0.7, :weight => 0.1]`. If a model does not support feature importances for some model
hyper-parameters, every `importance_value` will be zero. This kind of access is supported
for `model = mach.model` if `reports_feature_importances(model) == true`.

If a model can report multiple types of feature importances, then there will be a model
hyper-parameter controlling the active type.
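
A similarly hedged usage sketch for feature importances, reusing the placeholder machine from the previous sketch:

```julia
# Continuing with the machine `mach` from the sketch above, and assuming
# `reports_feature_importances(mach.model) == true`:
fi = feature_importances(mach)   # e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]

Dict(fi)                         # look up an importance by feature name
sort(fi, by=last, rev=true)      # features ranked by decreasing importance
```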


## Constructing machines

@@ -203,8 +229,8 @@ For a supervised machine the `predict` method calls a lower-level
unsupervised cousins `transform` and `inverse_transform`, see
[Getting Started](index.md).

The fields of a `Machine` instance (which should not generally be
accessed by the user) are:
With the exception of `model`, a `Machine` instance has a number of fields which the user
should not directly access; these include:

- `model` - the struct containing the hyperparameters to be used in
calls to `fit!`
@@ -215,7 +241,8 @@ accessed by the user) are:
see [Learning Networks](@ref) (in the supervised learning example
above, `args = (source(X), source(y))`)

- `report` - outputs of training not encoded in `fitresult` (eg, feature rankings)
- `report` - outputs of training not encoded in `fitresult` (eg, feature rankings),
initially undefined

- `old_model` - a deep copy of the model used in the last call to `fit!`

2 changes: 1 addition & 1 deletion src/MLJ.jl
@@ -79,7 +79,7 @@ export coerce, coerce!, autotype, schema, info
# re-export from MLJBase:
export nrows, color_off, color_on,
selectrows, selectcols, restrict, corestrict, complement,
training_losses,
training_losses, feature_importances,
predict, predict_mean, predict_median, predict_mode, predict_joint,
transform, inverse_transform, evaluate, fitted_params, params,
@constant, @more, HANDLE_GIVEN_ID, UnivariateFinite,
