
v0.7.0

@MaartenGr MaartenGr released this 03 Nov 08:30
· 16 commits to master since this release
7b763ae

Highlights

  • Cleaned up documentation and added several visual representations of the algorithm (excluding MMR / MaxSum)
  • Added functions to extract and pass word and document embeddings, which should make fine-tuning much faster:
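from keybert import KeyBERT

# Example input; any list of strings works
docs = [
    "Supervised learning maps inputs to outputs using labeled examples.",
    "Keyword extraction identifies the terms that best describe a document.",
]

kw_model = KeyBERT()

# Prepare embeddings once
doc_embeddings, word_embeddings = kw_model.extract_embeddings(docs)

# Extract keywords without needing to re-calculate embeddings
keywords = kw_model.extract_keywords(docs, doc_embeddings=doc_embeddings, word_embeddings=word_embeddings)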

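Note that the parameters passed to .extract_embeddings for creating the vectorizer (candidates, keyphrase_ngram_range, stop_words, min_df, and vectorizer) must be exactly the same as those passed to .extract_keywords. Otherwise the pre-computed word embeddings will not line up with the candidate words built during keyword extraction.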
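One way to keep the two calls in sync is to define the vectorizer settings once and pass the same dictionary to both methods. The sketch below is only an illustration: the helper name `extract_keywords_with_cached_embeddings`, the `VECTORIZER_PARAMS` dictionary, and the chosen values are assumptions made for this example and are not part of KeyBERT. It assumes only that `kw_model` exposes `.extract_embeddings` and `.extract_keywords` as shown above.

```python
# Hypothetical shared settings (values chosen for illustration only).
# The keys are vectorizer-related parameters accepted by both methods.
VECTORIZER_PARAMS = {
    "keyphrase_ngram_range": (1, 2),
    "stop_words": "english",
    "min_df": 1,
}


def extract_keywords_with_cached_embeddings(
    kw_model, docs, vectorizer_params=VECTORIZER_PARAMS, **keyword_kwargs
):
    """Hypothetical helper: compute embeddings once, then extract keywords.

    The same vectorizer_params are forwarded to both calls so that the
    word embeddings match the candidate words used during extraction.
    Options such as top_n or use_mmr go in keyword_kwargs.
    """
    doc_embeddings, word_embeddings = kw_model.extract_embeddings(
        docs, **vectorizer_params
    )
    # Passing a vectorizer parameter again via keyword_kwargs raises a
    # TypeError (duplicate keyword), which prevents accidental mismatches.
    return kw_model.extract_keywords(
        docs,
        doc_embeddings=doc_embeddings,
        word_embeddings=word_embeddings,
        **vectorizer_params,
        **keyword_kwargs,
    )
```

With a real model this would be called as `extract_keywords_with_cached_embeddings(KeyBERT(), docs, top_n=5)`. To try several extraction settings against the same embeddings, the `.extract_embeddings` call could be moved out of the helper and its results reused across calls.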

Fixes

  • Redundant documentation was removed by @mabhay3420 in #123
  • Fixed Gensim backend not working after v4 migration (#71)
  • Fixed candidates not working (#122)