Change Log

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

This release is only compatible with PyTorch 1.9+. Because of some changes, it's now non-trivial to support both PyTorch 1.9+ and earlier versions, so moving forward PyKEEN will support the latest version of PyTorch and try its best to keep backwards compatibility.

New Models

New Datasets

New Losses

Added

  • Tutorial in using checkpoints when bringing your own data (pykeen#498)
  • Learning rate scheduling (pykeen#492)
  • Checkpoints include entity/relation maps (pykeen#498)
  • QuatE reproducibility configurations (pykeen#486)

Changed

Fixed

  • FileNotFoundError on Windows/Anaconda (pykeen#503, thanks @Hao-666)
  • Fixed docstring for ComplEx interaction (pykeen#504)
  • Make DistMult the default interaction function for R-GCN (pykeen#548)
  • Fix gradient error in CompGCN buffering (pykeen#573)
  • Fix splitting of numeric triples factories (pykeen#594, thanks @Rodrigo-A-Pereira)
  • Fix determinism in splitting of triples factory (pykeen#500)
  • Fix documentation and improve HPO suggestion (pykeen#524, thanks @kdutia)

1.5.0 - 2021-06-13

New Metrics

  • Adjusted Arithmetic Mean Rank Index (pykeen#378)
  • Add harmonic, geometric, and median rankings (pykeen#381)
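
As a rough illustration of the rank-based metrics listed above, the following sketch aggregates a list of ranks using only the Python standard library. The function names, and the simplification that every ranking task has the same number of candidates, are assumptions made for this example and do not reflect PyKEEN's evaluator API. The adjusted index is assumed to take the form 1 - (MR - 1) / (E[MR] - 1), with E[MR] = (n + 1) / 2 under uniformly random ranking.

```python
import statistics
from typing import Dict, Sequence


def summarize_ranks(ranks: Sequence[int]) -> Dict[str, float]:
    """Aggregate 1-based ranks with several notions of 'average' (hypothetical helper)."""
    return {
        "arithmetic": statistics.mean(ranks),
        "harmonic": statistics.harmonic_mean(ranks),
        "geometric": statistics.geometric_mean(ranks),
        "median": statistics.median(ranks),
    }


def adjusted_mean_rank_index(ranks: Sequence[int], num_candidates: int) -> float:
    """Rescale the mean rank so 1.0 is perfect and 0.0 matches random ranking.

    Assumption: every rank was computed against the same number of candidates.
    """
    if num_candidates < 2:
        raise ValueError("need at least two candidates to adjust for chance")
    mean_rank = statistics.mean(ranks)
    # Expected mean rank when candidates are ordered uniformly at random.
    expected_mean_rank = (num_candidates + 1) / 2
    return 1.0 - (mean_rank - 1.0) / (expected_mean_rank - 1.0)
```

The harmonic and geometric means are less sensitive to a few very bad ranks than the arithmetic mean, which is one reason to report them side by side.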

New Trackers

New Models

New Negative Samplers

Datasets

Added

Updated

  • R-GCN implementation now uses new-style models and is super idiomatic (pykeen#110)
  • Enable passing of interaction function by string in base model class (pykeen#384, pykeen#387)
  • Bump scipy requirement to 1.5.0+
  • Updated interfaces of models and negative samplers to enforce kwargs (pykeen#445)
  • Reorganize filtering, negative sampling, and remove triples factory from most objects (pykeen#400, pykeen#405, pykeen#406, pykeen#409, pykeen#420)
  • Update automatic memory optimization (pykeen#404)
  • Flexibly define positive triples for filtering (pykeen#398)
  • Completely reimplemented negative sampling interface in training loops (pykeen#427)
  • Completely reimplemented loss function in training loops (pykeen#448)
  • Forward-compatibility of embeddings in old-style models and updated docs on how to use embeddings (pykeen#474)

Fixed

  • Regularizer passing in the pipeline and HPO (pykeen#345)
  • Saving results when using multimodal models (pykeen#349)
  • Add missing diagonal constraint on MuRE Model (pykeen#353)
  • Fix early stopper handling (pykeen#419)
  • Fixed saving results from pipeline (pykeen#428, thanks @kantholtz)
  • Fix OOM issues with early stopper and AMO (pykeen#433)
  • Fix ER-MLP functional form (pykeen#444)

1.4.0 - 2021-03-04

New Datasets

New Models

New Algorithms

If you're interested in any of these, please get in touch with us regarding an upcoming publication.

Added

  • New-style models (pykeen#260) for direct usage of interaction modules
  • Ability to train pipeline() using an Interaction module rather than a Model (pykeen#326, pykeen#330).

Changes

  • Lookup of assets is now mediated by the class_resolver package (pykeen#321, pykeen#327)
  • The docdata package is now used to parse structured information out of the model and dataset documentation in order to make a more informative README with links to citations (pykeen#303).

1.3.0 - 2021-02-15

We skipped version 1.2.0 because we made an accidental release before this version was ready. We're only human, and are looking into improving our release workflow to live in CI/CD so something like this doesn't happen again. However, as an end user, this won't have an effect on you.

New Datasets

New Trackers

Fixed

  • Fixed ComplEx's implementation (pykeen#313)
  • Fixed OGB's reuse of entity identifiers (pykeen#318, thanks @tgebhart)

Added

  • pykeen version command for more easily reporting your environment in issues (pykeen#251)
  • Functional forms of all interaction models (e.g., TransE, RotatE) (pykeen#238, pykeen.nn.functional documentation). These can be generally reused, even outside of the typical PyKEEN workflows.
  • Modular forms of all interaction models (pykeen#242, pykeen.nn.modules documentation). These wrap the functional forms of interaction models and store hyper-parameters such as the p value for the L_p norm in TransE.
  • The initializer, normalizer, and constrainer for the entity and relation embeddings are now exposed through the __init__() function of each KGEM class and can be configured. A future update will enable HPO on these as well (pykeen#282).
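
To make the distinction between functional and modular forms above more concrete, here is a minimal sketch in plain Python (not PyTorch, and not PyKEEN's actual code). The functional form is a stateless scoring function, while the modular form stores the hyper-parameter p and delegates to it. The names and signatures are illustrative assumptions, and TransE is assumed to score a triple as the negative L_p norm of h + r - t.

```python
from typing import Sequence


def transe_interaction(
    h: Sequence[float], r: Sequence[float], t: Sequence[float], p: int = 2
) -> float:
    """Functional form (illustrative): stateless score -||h + r - t||_p."""
    distance = sum(abs(hi + ri - ti) ** p for hi, ri, ti in zip(h, r, t)) ** (1.0 / p)
    return -distance


class TransEInteraction:
    """Modular form (illustrative): stores p and wraps the functional form."""

    def __init__(self, p: int = 2) -> None:
        self.p = p  # hyper-parameter kept on the module, not passed per call

    def __call__(
        self, h: Sequence[float], r: Sequence[float], t: Sequence[float]
    ) -> float:
        return transe_interaction(h, r, t, p=self.p)


# Usage: configure once, then score without repeating the hyper-parameter.
interaction = TransEInteraction(p=1)
score = interaction([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

Keeping the functional form separate means it can be reused outside a training workflow, while the modular form is the natural place to hang configuration.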

Refactoring and Future Preparation

This release contains a few big refactors. Most won't affect end-users, but if you're writing your own PyKEEN models, these are important. Many of them prepare for a new interface that makes it much easier for researchers (who shouldn't have to understand the inner workings of PyKEEN) to make new models.

  • The regularizer has been refactored (pykeen#266, pykeen#274). It no longer accepts a torch.device when instantiated.
  • The pykeen.nn.Embedding class has been improved in several ways:
      - Embedding Specification class makes it easier to write new classes (pykeen#277)
      - Refactor to make shape of embedding explicit (pykeen#287)
      - Specification of complex datatype (pykeen#292)
  • Refactoring of the loss model class to provide a meaningful class hierarchy (pykeen#256, pykeen#262)
  • Refactoring of the base model class to provide a consistent interface (pykeen#246, pykeen#248, pykeen#253, pykeen#257). This allowed for simplification of the loss computation based on the new hierarchy and also a new implementation of the regularizer class.
  • More automated testing of typing with MyPy (pykeen#255) and automated checking of documentation with doctests (pykeen#291)

Triples Loading

We've made some improvements to the pykeen.triples.TriplesFactory to facilitate loading even larger datasets (pykeen#216). However, this required an interface change. This will affect any code that loads custom triples. If you're loading triples from a path, you should now use:

path = ...

# Old (doesn't work anymore)
tf = TriplesFactory(path=path)

# New
tf = TriplesFactory.from_path(path)

Predictions

While refactoring the base model class, we excised the prediction functionality to a new module pykeen.models.predict (docs: https://pykeen.readthedocs.io/en/latest/reference/predict.html#functions). We also renamed some of the prediction functions inside the base model to make them more consistent, but we now recommend you use the functions from pykeen.models.predict instead.

  • Model.predict_heads() -> Model.get_head_prediction_df()
  • Model.predict_relations() -> Model.get_relation_prediction_df()
  • Model.predict_tails() -> Model.get_tail_prediction_df()
  • Model.score_all_triples() -> Model.get_all_prediction_df()

Fixed

  • Do not create inverse triples for validation and testing factory (pykeen#270)
  • Treat nonzero applied to large tensor error as OOM for batch size search (pykeen#279)
  • Fix bug in loading ConceptNet (pykeen#290). If your experiments relied on this dataset, you should rerun them.

1.1.0 - 2021-01-20

New Datasets

New Trackers

Added

  • Add MLFlow set tags function (pykeen#139; thanks @sunny1401)
  • Add score_t/h function for ComplEx (pykeen#150)
  • Add proper testing for literal datasets and literal models (pykeen#199)
  • Checkpoint functionality (pykeen#123)
  • Random triple generation (pykeen#201)
  • Make negative sampler corruption scheme configurable (pykeen#209)
  • Add predict with inverse triples pipeline (pykeen#208)
  • Add generalized p-norm to regularizer (pykeen#225)

Changed

  • New harness for resetting parameters (pykeen#131)
  • Modularize embeddings (pykeen#132)
  • Update first steps documentation (pykeen#152; thanks @TobiasUhmann)
  • Switched testing to GitHub Actions (pykeen#165 and pykeen#194)
  • No longer support Python 3.6
  • Move automatic memory optimization (AMO) option out of model and into training loop (pykeen#176)
  • Improve hyper-parameter defaults and HPO defaults (pykeen#181 and pykeen#179)
  • Switch internal usage to ID-based triples (pykeen#193 and pykeen#220)
  • Optimize triples splitting algorithm (pykeen#187)
  • Generalize metadata storage in triples factory (pykeen#211)
  • Add drop_last option to data loader in training loop (pykeen#217)

Fixed

1.0.5 - 2020-10-21

Added

  • Added testing on Windows with AppVeyor and documentation for installation on Windows (pykeen#95)
  • Add ability to specify custom datasets in HPO and ablation studies (pykeen#54)
  • Add functions for plotting entities and relations (as well as an accompanying tutorial) (pykeen#99)

Changed

  • Replaced BCE loss with BCEWithLogits loss (pykeen#109)
  • Store default HPO ranges in loss classes (pykeen#111)
  • Use entrypoints for datasets (pykeen#115) to allow registering of custom datasets
  • Improved WANDB results tracker (pykeen#117, thanks @kantholtz)
  • Reorganized ablation study generation and execution (pykeen#54)

Fixed

  • Fixed bug in the initialization of ConvE (pykeen#100)
  • Fixed cross-platform issue with random integer generation (pykeen#98)
  • Fixed documentation build on ReadTheDocs (pykeen#104)

1.0.4 - 2020-08-25

Added

Changed

  • Use number of epochs as step instead of number of checks (pykeen#72)

Fixed

1.0.3 - 2020-08-13

Added

Changed

Fixed

1.0.2 - 2020-07-10

Added

  • Add default values for margin and adversarial temperature in NSSA loss (pykeen#29)
  • Added FTP uploader (pykeen#35)
  • Add AWS S3 uploader (pykeen#39)

Changed

  • Improved MLflow support (pykeen#40)
  • Lots of improvements to documentation!

Fixed

  • Fix triples factory splitting bug (pykeen#21)
  • Fix problem with tensors' device during prediction (pykeen#41)
  • Fix RotatE relation embeddings re-initialization (pykeen#26)

1.0.1 - 2020-07-02

Added

Changed