v0.7.0

MaartenGr released this 03 Nov 08:30

Highlights

Cleaned up documentation and added several visual representations of the algorithm (excluding MMR / MaxSum)
Added functions to extract and pass word and document embeddings, which should make fine-tuning much faster because the embeddings no longer need to be re-calculated on every call

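from keybert import KeyBERT

# Placeholder input; replace with your own list of documents
docs = ["Supervised learning maps an input to an output based on example input-output pairs."]

kw_model = KeyBERT()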

# Prepare embeddings
doc_embeddings, word_embeddings = kw_model.extract_embeddings(docs)

# Extract keywords without needing to re-calculate embeddings
keywords = kw_model.extract_keywords(docs, doc_embeddings=doc_embeddings, word_embeddings=word_embeddings)

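Note that the parameters used to create the vectorizer (candidates, keyphrase_ngram_range, stop_words, min_df, and vectorizer) should be exactly the same in .extract_embeddings and .extract_keywords. Otherwise the pre-computed word embeddings will not line up with the candidate words generated during keyword extraction.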
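One way to keep the two calls in sync is to store the shared settings in a single dictionary and unpack it into both. The sketch below is illustrative and not part of KeyBERT: the helper name extract_with_cached_embeddings, the constant VECTORIZER_KWARGS, and the specific values in it are assumptions made for this example. Only the .extract_embeddings and .extract_keywords calls and their keyword arguments come from the library.

```python
# Hypothetical helper, not part of KeyBERT: keeps the vectorizer-related
# settings in one place so both calls always receive identical values.
# The values below are illustrative assumptions.
VECTORIZER_KWARGS = {
    "keyphrase_ngram_range": (1, 2),
    "stop_words": "english",
    "min_df": 1,
}


def extract_with_cached_embeddings(kw_model, docs, **keyword_kwargs):
    """Compute embeddings once, then extract keywords with matching settings.

    kw_model is expected to be a KeyBERT instance (v0.7.0 or later).
    keyword_kwargs are extra options for .extract_keywords only, such as
    top_n, use_mmr, or diversity. Passing a key that is already in
    VECTORIZER_KWARGS raises a TypeError, which prevents a silent mismatch.
    """
    doc_embeddings, word_embeddings = kw_model.extract_embeddings(
        docs, **VECTORIZER_KWARGS
    )
    return kw_model.extract_keywords(
        docs,
        doc_embeddings=doc_embeddings,
        word_embeddings=word_embeddings,
        **VECTORIZER_KWARGS,
        **keyword_kwargs,
    )
```

With kw_model = KeyBERT() and a list of documents in docs, a call such as extract_with_cached_embeddings(kw_model, docs, top_n=5) would return keywords built from n-grams of length 1 to 2. When sweeping over settings like diversity, computing the embeddings once outside the loop and passing them to each .extract_keywords call is what provides the speed-up.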

Fixes

Redundant documentation was removed by @mabhay3420 in #123
Fixed Gensim backend not working after v4 migration (#71)
Fixed candidates not working (#122)
