
Releases: webis-de/small-text

v1.3.3

29 Dec 21:23

Bugfix release.

Changed

  • An errata section was added to the documentation.

Fixed

  • Fixed a deviation from the paper, where DeltaFScore also considered negative label predictions for the agreement. (#51)
  • Fixed a bug in KappaAverage that affected the stopping behavior. (#52)
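
For context on the KappaAverage fix, here is a minimal sketch of how a stopping criterion is queried after each active learning round; the constructor arguments (window_size, kappa) and the stop(predictions=...) call are assumptions based on the library's stopping-criterion interface, not part of this release.

    import numpy as np
    from small_text import KappaAverage

    # Assumed signature: number of classes, plus a window size and a kappa threshold.
    stopping_criterion = KappaAverage(2, window_size=3, kappa=0.99)

    # After each query round, pass the current predictions over a fixed evaluation set;
    # stop() returns True once the agreement between consecutive rounds stabilizes.
    predictions = np.array([0, 1, 1, 0, 1])  # toy predictions
    if stopping_criterion.stop(predictions=predictions):
        print('Stopping criterion satisfied.')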

Contributors

@zakih2 @vmanc

v1.3.2

19 Aug 18:16

Bugfix release.

Fixed

  • Fixed a bug in TransformerBasedClassification where validations_per_epoch>=2 left the model in eval mode. (#40)

v1.3.1

22 Jul 19:55

Bugfix release.

Fixed

  • Fixed a bug where parameter groups were omitted when using TransformerBasedClassification's layer-specific fine-tuning functionality. (#36, #38)
  • Fixed a bug where class weighting resulted in nan values. (#39)

Contributors

@JP-SystemsX

v1.3.0

21 Feb 21:15

SetFitClassification now also supports dropout sampling (like KimCNNClassifier and TransformerBasedClassification).
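
As an illustration, a hedged sketch of dropout sampling at prediction time with the SetFit classifier; the constructor arguments and the dropout_sampling keyword mirror the existing classifiers, but the exact names used here are assumptions, and train/test are placeholder datasets.

    from small_text import SetFitClassification, SetFitModelArguments

    # Assumed setup: a sentence-transformers backbone plus the number of classes.
    model_args = SetFitModelArguments('sentence-transformers/paraphrase-mpnet-base-v2')
    clf = SetFitClassification(model_args, num_classes=2)

    clf.fit(train)  # train: a labeled text dataset (placeholder)

    # dropout_sampling > 1 draws several stochastic forward passes, yielding a
    # distribution over class probabilities instead of a single point estimate.
    proba = clf.predict_proba(test, dropout_sampling=10)  # test: placeholder dataset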

Added

Fixed

  • Fixed broken link in README.md.
  • Fixed typo in README.md. (#26)

Changed

Stopping Criteria

Documentation

  • Updated the active learning setup figure.
  • The documentation of integrations has been reorganized.

Contributors

@rmitsch

v1.2.0

04 Feb 21:44

This release adds a SetFit classifier, the BALD query strategy, and two new example notebooks.
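
A brief, hedged sketch of how the new pieces could be combined in an active learning setup; the factory and arguments classes and the top-level BALD import follow the project's usual patterns but should be treated as assumptions, and train is a placeholder dataset.

    from small_text import (
        PoolBasedActiveLearner,
        SetFitClassificationFactory,
        SetFitModelArguments,
        BALD,
    )

    # Assumed names: SetFitClassificationFactory, SetFitModelArguments, BALD.
    model_args = SetFitModelArguments('sentence-transformers/paraphrase-mpnet-base-v2')
    clf_factory = SetFitClassificationFactory(model_args, num_classes=2)

    # BALD ranks unlabeled instances by the mutual information between predictions
    # and model parameters, estimated via multiple stochastic (dropout) forward passes.
    query_strategy = BALD()

    active_learner = PoolBasedActiveLearner(clf_factory, query_strategy, train)  # train: placeholder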

Added

Active Learning

Classification

Examples

  • Revised both existing notebook examples.
  • Added a notebook example for active learning with SetFit classifiers.
  • Added a notebook example for cold start initialization with SetFit classifiers.

Documentation

  • A showcase section has been added to the documentation.

Fixed

  • Distances in lightweight_coreset were not correctly projected onto the [0, 1] interval (but ranking was unaffected).

Changed

v1.1.1

14 Oct 20:42

Minor bug fix release.

Fixed

  • Fixed model selection which could raise an error under certain circumstances (#21).

v1.1.0

01 Oct 10:50

This release adds a conda package and more convenient imports, and improves many aspects of the classification functionality. Moreover, one new query strategy and three stopping criteria have been added.

Added

General

  • Small-Text package is now available via conda-forge.
  • Imports have been reorganized. You can import all public classes and methods from the top-level package (small_text):
    from small_text import PoolBasedActiveLearner
    

Classification

  • All classifiers now support weighting of training samples (see the sketch after this list).
  • Early stopping has been reworked, improved, and documented (#18).
  • Model selection has been reworked and documented.
  • [!] KimCNNClassifier.__init__(): The default value of the (now deprecated) keyword argument early_stopping_acc has been changed from 0.98 to -1 in order to match TransformerBasedClassification.
  • [!] Removed weight renormalization after gradient clipping.
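
A minimal sketch of per-sample weighting, assuming the weights are passed to fit() via a weights keyword argument; the keyword name and the top-level imports used here are assumptions, and the data is a toy sklearn-based setup.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from small_text import ConfidenceEnhancedLinearSVC, SklearnClassifier, SklearnDataset

    # Toy dataset: four vectorized texts with binary labels.
    x = TfidfVectorizer().fit_transform(['good', 'great', 'bad', 'awful'])
    train = SklearnDataset(x, np.array([1, 1, 0, 0]))

    # One weight per training instance; here the last sample counts twice as much.
    weights = np.ones(len(train))
    weights[-1] = 2.0

    clf = SklearnClassifier(ConfidenceEnhancedLinearSVC(), num_classes=2)
    clf.fit(train, weights=weights)  # assumed keyword name: weights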

Datasets

  • The target_labels keyword argument in __init__() will now raise a warning if not passed.
  • Added from_arrays() to SklearnDataset, PytorchTextClassificationDataset, and TransformersDataset to construct datasets more conveniently.
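
A short sketch of the new from_arrays() constructor on the sklearn-based dataset; the parameter order shown (texts, labels, vectorizer) is an assumption.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from small_text import SklearnDataset

    texts = ['this is great', 'utterly disappointing']
    labels = np.array([1, 0])

    # from_arrays() vectorizes the raw texts and wraps features and labels
    # into a ready-to-use dataset in a single call.
    train = SklearnDataset.from_arrays(texts, labels, TfidfVectorizer())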

Query Strategies

Stopping Criteria

Deprecated

  • small_text.integrations.pytorch.utils.misc.default_tensor_type() is deprecated without replacement (#2).
  • TransformerBasedClassification and KimCNNClassifier:
    The keyword arguments for early stopping (early_stopping / early_stopping_no_improvement, early_stopping_acc) that are passed to __init__() are now deprecated. Use the early_stopping
    keyword argument in the fit() method instead (#18).
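
To illustrate the replacement for the deprecated keyword arguments, a hedged sketch of the reworked early stopping; the metric name 'val_loss' and the patience value are illustrative, and clf/train stand in for an existing classifier and training set.

    from small_text import EarlyStopping, Metric

    # Stop once the validation loss has not improved for 5 consecutive epochs.
    early_stopping = EarlyStopping(Metric('val_loss'), patience=5)

    # clf: e.g. a TransformerBasedClassification instance; train: the training dataset.
    clf.fit(train, early_stopping=early_stopping)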

Fixed

Classification

  • KimCNNClassifier.fit() and TransformerBasedClassification.fit() now correctly
    process the scheduler keyword argument (#16).

Removed

  • Removed the strict check that every target label has to occur in the training data.
    (This is intended for multi-label settings with many labels; apart from that it is still recommended to make sure that all labels occur.)

v1.0.1

12 Sep 22:17

Minor bug fix release.

Fixed

  • Links to notebooks and code examples will now always point to the latest release instead of the latest main branch.

v1.0.0

14 Jun 13:07

This is the first stable release 🎉! The release mainly consists of code cleanup, documentation, and repository organization.

  • Datasets:
    • SklearnDataset now checks if the dimensions of features and labels match.
  • Query Strategies:
  • Documentation:
    • The HTML documentation uses the full screen width.
  • Repository:
    • This repository can now be referenced using the respective Zenodo DOI.

v1.0.0b4

04 May 18:20

This release adds two new query strategies, improves the Dataset interface, and introduces optional dependencies.

Added

  • General:
    • We now have a concept for optional dependencies, which allows components to rely on soft dependencies, i.e., Python dependencies that can be installed on demand (and only when certain functionality is needed).
  • Datasets:
    • The Dataset interface now has a clone() method that creates an identical copy of the respective dataset (see the sketch after this list).
  • Query Strategies:
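
A tiny sketch of clone(); that the copy is independent of the original follows from the description above, while the concrete toy dataset and label values are only illustrative.

    import numpy as np
    from scipy.sparse import csr_matrix
    from small_text import SklearnDataset

    # Toy dataset: three instances with two features each.
    dataset = SklearnDataset(csr_matrix(np.ones((3, 2))), np.array([0, 1, 0]))

    # clone() returns an identical but independent copy.
    dataset_copy = dataset.clone()
    assert len(dataset_copy) == len(dataset)

    # Modifying the copy's labels leaves the original untouched.
    dataset_copy.y = np.array([1, 1, 1])
    assert not np.array_equal(dataset_copy.y, dataset.y)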

Changed

  • Datasets:
    • Separated the previous DatasetView implementation into interface (DatasetView) and implementation (SklearnDatasetView).
    • Added clone() method which creates an identical copy of the dataset.
  • Query Strategies:
    • EmbeddingBasedQueryStrategy now only embeds instances that are either in the labeled or in the unlabeled pool (and no longer the entire dataset).
  • Code examples:
    • Code structure was unified.
    • The number of iterations can now be passed via a CLI argument.
  • small_text.integrations.pytorch.utils.data:
    • Method get_class_weights() now scales the resulting multi-class weights so that the smallest class weight is equal to 1.0.
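
To make the new scaling concrete, a small numpy sketch of the described behavior, inverse-frequency class weights rescaled so that the smallest weight equals 1.0; this illustrates the scaling only and is not the function's actual implementation.

    import numpy as np

    labels = np.array([0, 0, 0, 0, 1, 1, 2])   # toy label distribution
    counts = np.bincount(labels, minlength=3)  # [4, 2, 1]

    weights = 1.0 / counts                     # inverse class frequency
    weights = weights / weights.min()          # smallest class weight becomes 1.0

    print(weights)                             # [1. 2. 4.]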