Releases: JuliaAI/MLJ.jl

v0.13.0

01 Sep 04:15
224431b

MLJ v0.13.0

Diff since v0.12.1

Updates requirements for MLJBase, MLJModels and MLJScientificTypes to enable new features and fix some bugs:

  • (enhancement) Add fitted_params_per_fold and report_per_fold properties to the object returned by evaluate/evaluate!, giving the user access to the outcomes of training for each train/test pair in resampling (#400, #616)

  • (enhancement) Implement logpdf for UnivariateFinite distributions (JuliaAI/MLJBase.jl#411)

  • (bug fix and deprecation) Fix bug related to creating new composite models by hand in the special case of non-model hyper-parameters (not an issue with @pipeline or @from_network models). Introduce new return! syntax for doing this and deprecate calling of learning network machines (JuliaAI/MLJBase.jl#390, JuliaAI/MLJBase.jl#391, JuliaAI/MLJBase.jl#377)

  • (breaking) Change the behavior of evaluate/evaluate! so that weights are only passed to measures if explicitly passed using the keyword argument weights=... (JuliaAI/MLJBase.jl#405)

  • (new model) Add UnivariateTimeTypeToContinuous model for converting assorted time data into Continuous data (JuliaAI/MLJModels.jl#295)

  • (breaking) Change LDA models from MultivariateStats so that hyper-parameters previously specified as UnivariateFinite objects are specified as dictionaries instead. Also change some default hyper-parameter values and improve the report (JuliaAI/MLJModels.jl#276)

  • (enhancement) Improve efficiency of FillImputer model (JuliaAI/MLJModels.jl#292)

  • (bug fix) Fix issue with syntax for loading models with a user-specified name (JuliaAI/MLJModels.jl#294)

  • (mildly breaking) Regard Nothing as a native scientific type and declare scitype(nothing) = Nothing (old behaviour: scitype(nothing) = Unknown) (JuliaAI/ScientificTypes.jl#112).
    The manual has also been updated to reflect these changes, along with some general improvements.

  • (breaking) Remove deprecated @pipeline syntax (JuliaAI/MLJBase.jl#350)
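
The evaluation changes above can be sketched as follows. This is a minimal, hedged sketch only: it assumes ConstantClassifier, the @load_iris toy dataset and the cross_entropy measure are available in this MLJ version, and that the new property names match the bullet above.

```julia
using MLJ

X, y = @load_iris              # built-in toy dataset
model = ConstantClassifier()   # trivial built-in model, for illustration only

e = evaluate(model, X, y,
             resampling=CV(nfolds=3, shuffle=true),
             measure=cross_entropy)

# new in v0.13: outcomes of training for each train/test pair
e.fitted_params_per_fold   # one entry per fold
e.report_per_fold          # one entry per fold

# breaking in v0.13: per-observation weights now reach measures only if
# passed explicitly, as in evaluate(model, X, y, weights=w, measure=...)

# also new: logpdf for UnivariateFinite distributions
mach = fit!(machine(model, X, y), verbosity=0)
d = predict(mach, X)[1]    # a UnivariateFinite distribution
logpdf(d, y[1])            # log-probability of the observed class
```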

Closed issues:

  • [docs] Document metadata_pkg, metadata_model and @mlj_model (#241)
  • Improved docs for the data interface between MLJ and the world (#379)
  • Document workaround for @mlj_model macro issues around negative defaults (#504)
  • Document: can't use multithreading for python models (#525)
  • In "Working with Categorical Data" part of docs, explain about int method. (#605)
  • Allow access to the outcomes of fitting models on each fold in resampling (#616)
  • Errors when columns read from the CSV file have missing entries. (#622)
  • On models that fit a distribution to some data. (#641)
  • Meta-issue: Add the JointProbabilistic <: Probabilistic subtype (#642)

Merged pull requests:

v0.12.1

25 Aug 23:12
44e2861

MLJ v0.12.1

Diff since v0.12.0

Closed issues:

  • Should some if any "classic" ML algorithms accepts matrix input in addition to table input? (#209)
  • Dataset generation for model inspection (#214)
  • Coerce fails when a column has type Vector{Missing} (#549)
  • OpenML integration: Columns as Symbols (#579)
  • Issue to generate new releases (#583)
  • Something not right with the Binder project file? (#587)
  • range(pipeEvoTreeClassifier, :(selector.features), values = cases): ArgumentError: values does not have an appropriate type. (#590)
  • fitted_params(LogisticModel): linear_binary_classifier = Any[] (#597)
  • Problem fetching spawned @pipeline processes (#598)
  • LogisticModel: pdf(ŷ[i], 1) does not work after the last release (#599)
  • Improve error message for non-functions in @pipeline ... operation=... (#600)
  • unable to use predict_mode() on machine associated with pipeline since MLJ 0.12.0 release (#601)
  • Unable to use functions predict(), predict_mode() (#602)
  • In the latest version, how do we do range(pipeXGBoostRegressor, :(xgr.max_depth), lower=3, upper=10) ? (#603)
  • Add section on creating synthetic data to the manual (#604)
  • Documentation for adding models: Discourage fields with type Union{Nothing,T} where T<:Real (#606)
  • ERROR: LoadError: BoundsError: pkg = DecisionTree (#607)
  • Old @from_network syntax still in docs (#608)
  • Can't use @load inside a package (#613)
  • max_samples parameter for RamdomForestClassifier (#619)
  • Ambiguous assignment in soft scope on Julia 1.5 and Julia 1.6 (#624)
  • inverse_transform of a PCA (#625)
  • potential bug in MLJBase.roc_curve (#630)
  • MLJ 0.12.0 doesn't work with Julia 1.5.0 (Windows) (#631)
  • Meta-issue: Add the JointProbabilistic supervised model type (#633)

Merged pull requests:

v0.11.6

13 Jul 00:30

MLJ v0.11.6 (NOT LATEST RELEASE see 0.12 below)

Diff since v0.11.5

Patch release removing redundant files that were causing problems on Windows (#591)

Closed issues:

  • Integrate flux models (#33)
  • Conflict w Scikitlearn.jl (#502)
  • Accessing nested machines is (still) awkward (#553)
  • Standardization of multi-targets (#568)
  • Add note at MLJ landing page that MLJ wraps a majority of sk-learn models (#573)
  • OpenML integration: Columns as Symbols (#579)
  • OpenML integration: Kmeans is not fitting (#580)

Merged pull requests:

v0.12.0

29 Jun 22:08
d32e8b8

MLJ v0.12.0

Diff since v0.11.5

This release accommodates breaking changes in MLJBase 0.14.0, MLJModels 0.11.0 and MLJTuning 0.4.0. For a complete list of changes, closed issues and pull requests, refer to the linked release notes.

It also updates the MLJ documentation to reflect these changes.

Summary

Main breaking changes:

  • Adds restrictions to acceleration options: nesting distributed
    processes within a multithreaded process is now disallowed.

  • Adds a more user-friendly interface for inspecting training reports and
    fitted parameters of composite models. For example, if composite = @pipeline OneHotEncoder KNNRegressor and mach = machine(composite, X, y), then access the fitted parameters of the machine associated
    with KNNRegressor using fitted_params(mach).knn_regressor.

  • The @from_network syntax has been changed to make it more
    expressive. In particular, through the new concept of learning
    network machines, it is possible to export a learning network to a
    composite type supporting multiple operations (e.g., predict and
    transform, as in clustering). See the manual for details. The old
    syntax is no longer supported.

Other enhancements of note:

  • Adds MLJFlux
    models to the registry for incorporating neural network models.

  • A more economical @pipeline syntax has been introduced. For example,
    pipe = @pipeline OneHotEncoder PCA(maxoutdim=3) defines a model
    pipe with automatically generated field names and model type name. Target inverse
    transformations now occur immediately after the supervised model in
    a @pipeline, instead of at the end, unless invert_last=true. The old syntax is
    available but deprecated.

  • It is now possible to simultaneously load model code for models
    having the same name but from different packages, when using @load
    or load.

  • Removes the requirement to specify the kind of source node, as in
    source(y, kind=:target). The role of source nodes in learning
    networks is now inferred from the order in which they appear in
    learning network machine constructors (see above).
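
The pipeline and inspection enhancements above can be sketched together. This is an illustrative sketch only: it assumes the NearestNeighbors package is installed to provide KNNRegressor, and uses the built-in @load_boston toy dataset purely for demonstration.

```julia
using MLJ
@load KNNRegressor pkg=NearestNeighbors   # assumes NearestNeighbors.jl is installed

# economical @pipeline syntax: field names and composite type name are generated
composite = @pipeline OneHotEncoder KNNRegressor

X, y = @load_boston                       # built-in toy dataset
mach = machine(composite, X, y)
fit!(mach, verbosity=0)

# per-component inspection, keyed on the generated field names:
fitted_params(mach).knn_regressor
report(mach)
```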

Deprecations:

  • The old @pipeline syntax.

  • The specification of kind when constructing a source node.

  • The use of fitresults() when exporting learning networks "by
    hand". See the manual for the new way to do this.

Closed issues:

  • Integrate flux models (#33)
  • Conflict w Scikitlearn.jl (#502)
  • Accessing nested machines is (still) awkward (#553)
  • Add note at MLJ landing page that MLJ wraps a majority of sk-learn models (#573)
  • OpenML integration: Kmeans is not fitting (#580)

Merged pull requests:

v0.11.5

15 Jun 01:10
c0c02c8

MLJ v0.11.5

Diff since v0.11.4

Closed issues:

  • computing UnivariateFinite matrix seems to be substantially slow (#511)
  • AMES tutorial doesn't work (UndefVarError) if ScikitLearn.jl or StatsBase.jl are loaded (#534)
  • DimensionMismatch in evaluate() (#540)
  • Hyperparameter tuning of KNN classifier (#543)
  • Decision trees from ScikitLearn.jl not available (#545)
  • Export supports_weights() and prediction_type() (#547)
  • Testing for type of values in a range too restrictive (#548)
  • SVC won't tune cost (#551)
  • Implementation of Tversky Loss (#554)
  • Fix broken MLJ logo in the manual (MLJ github pages) (#555)
  • Add configuration options for RandomForestClassifier.n_subfeatures that depend on the data size (#557)
  • Change DecisionTree.jl n_subfeatures default to -1 for random forest classifier and regressor (#558)
  • Tutorial link in Getting Started doesn't link to right spot (#560)
  • Old documentation deployed on github pages (#561)
  • Document how to load models without the @load macro (#562)
  • Request for monte-carlo cross validation (#564)
  • Loading SKLearn packages causes Julia to crash (#565)

Merged pull requests:

v0.11.4

19 May 11:07
6a54713

MLJ v0.11.4

Diff since v0.11.3

Closed issues:

  • Working with models with the same name from different packages (#446)
  • Update readme: MLJModels does not need to be in user's env after MLJModels 0.9.10 (#520)
  • More informative error for supplying model type instead of instance in range method (#521)
  • Readme inconsistency (#524)
  • Not loading to all workers: @everywhere @load RandomForestClassifier pkg = DecisionTree (#527)
  • Re-export mape and MAPE from MLJBase (#532)

Merged pull requests:

v0.11.3

14 May 03:05
bed315a

MLJ v0.11.3

Diff since v0.11.2

  • Update CategoricalArrays compatibility requirement to "^0.8" (#528) (@ablaom)

v0.11.2

30 Apr 00:10
e065b50

MLJ v0.11.2

Diff since v0.11.1

Fix bug in defining MLJ_VERSION (issue #508, PR #509)

Merged pull requests:

v0.11.1

29 Apr 10:05
7bd2ed1

MLJ v0.11.1

Diff since v0.11.0

Minor issues only:

Closed issues:

  • Add sample-weight interface point? (#177)
  • Add default_measure to learning_curve! (#283)
  • Flush out unsupervised models in "Adding models for general use" section of manual (#285)
  • is_probabilistic=true in @pipeline syntax is clunky (#305)
  • [suggestions] Unroll the network in @from_network (#311)
  • Towards stabilisation of the core API (#318)
  • failure on nightly (1.4) (#384)
  • Documentation of extracting best fitted params (#386)
  • incorporate input_scitype and target_scitype declarations for @pipeline models (#412)
  • "Supervised" models with no predict method (#460)
  • Use OpenML.load to iris data set in the Getting Started page of docs? (#461)
  • Review cheatsheet (#474)
  • Re-export UnsupervisedNetwork from MLJBase (#497)
  • Broken link for MLJ tour in documentation (#501)

Merged pull requests:

v0.11.0

24 Apr 20:05
3936fd2

MLJ v0.11.0

Diff since v0.10.3

Make compatibility updates to MLJBase and MLJModels to effect the following changes to MLJ (see the linked release notes for links to the issues/PRs):

  • (new model) Add LightGBM models LightGBMClassifier and
    LightGBMRegressor

  • (new model) Add new built-in model, ContinuousEncoder, for
    transforming all features of a table to Continuous scitype,
    dropping any features that cannot be so transformed

  • (new model) Add ParallelKMeans model, KMeans, loaded with
    @load KMeans pkg=ParallelKMeans

  • (mildly breaking enhancement) Arrange for the CV
    resampling strategy to spread fold "remainders" evenly among folds in
    train_test_pairs(::CV, ...) (a small change, only noticeable on
    small datasets)

  • (breaking) Restyle report and fitted_params for exported
    learning networks (e.g., pipelines) to include a dictionary of reports or
    fitted_params, keyed on the machines in the underlying learning
    network. New doc-strings detail the new behaviour.

  • (enhancement) Allow calling of transform on machines with Static models without
    first calling fit!

  • Allow machine constructor to work on supervised models that take nothing for
    the input features X (for models that simply fit a
    sampler/distribution to the target data y) (#51)
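
The new ContinuousEncoder can be sketched as follows. This is a hedged example: the column names are invented, and the exact set of dropped features depends on the encoder's defaults.

```julia
using MLJ

X = (height = [1.85, 1.67, 1.50],
     group  = categorical(['A', 'B', 'A']),   # Multiclass: one-hot encoded
     id     = ["a1", "a2", "a3"])             # Textual: cannot be made Continuous

mach = fit!(machine(ContinuousEncoder(), X), verbosity=0)
Xcont = transform(mach, X)

schema(Xcont)   # all surviving features are Continuous; :id is dropped
```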

Also:

  • (documentation) In the "Adding New Models for General Use"
    section of the manual, add detail on how to wrap unsupervised
    models, as well as models that fit a sampler/distribution to data

  • (documentation) Expand the "Transformers" sections of the
    manual, including more material on static transformers and
    transformers that implement predict (#393)

Closed issues:

  • Add tuning by stochastic search (#37)
  • Improve documentation around static transformers (#393)
  • Error in docs for model search (#478)
  • Update [compat] StatsBase="^0.32,^0.33" (#481)
  • For a 0.10.3 release (#483)
  • Help with coercing strings for binary data into Continuous variables (#489)
  • EvoTree Error (#490)
  • Add info with workaround to avoid MKL error (#491)
  • LogisticClassifier pkg = MLJLinearModels computes a number of coefficients but not the same number of mean_and_std_given_feature (#492)
  • MethodError: no method matching... (#493)
  • For a 0.10.4 release (#495)
  • Error: fitted_params(LogisticModel) (#498)

Merged pull requests: