Releases: aajanki/spacy-fi

Release 0.14.0

14 Oct 12:00
  • Compatible with spaCy 3.7
  • The noun chunker includes chains of flats and nmods: e.g. "maaliskuun 7. päivänä"
  • The parser no longer tries to predict nsubj:outer, dislocated and goeswith
    dependencies. There is not enough training data to learn them.
  • Tokenize "-kampanja" as ["-", "kampanja"]
  • Tokenize "maa-" as ["maa", "-"]
  • Tokenize "/kk" as ["/", "kk"]
  • Other tokenizer improvements
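
The three tokenization examples above can be illustrated with a small standalone sketch. This is not the spacy-fi tokenizer; the function name `split_affixes` and the regular expressions are hypothetical and only reproduce the input/output pairs listed in these notes.

```python
import re

# Hypothetical illustration only: these patterns are NOT taken from spacy-fi.
# A leading "-" or "/" followed by a word, e.g. "-kampanja" or "/kk".
_LEADING = re.compile(r"^([-/])(\w+)$")
# A word followed by a trailing hyphen, e.g. "maa-".
_TRAILING = re.compile(r"^(\w+)(-)$")


def split_affixes(token):
    """Split a leading "-" or "/" or a trailing "-" off a single token."""
    match = _LEADING.match(token) or _TRAILING.match(token)
    if match:
        return [match.group(1), match.group(2)]
    # Anything else is returned unchanged as a one-element list.
    return [token]


for text in ["-kampanja", "maa-", "/kk"]:
    print(text, split_affixes(text))
```

In the real pipeline such behaviour would come from the tokenizer's own prefix and suffix rules; the sketch only shows the intended splits.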

Evaluation scores:

TAG 96.62
POS 96.45
MORPH 92.26
LEMMA 94.01
UAS 87.14
LAS 82.90
NER P 83.04
NER R 81.56
NER F 82.29
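
Assuming NER F is the usual F1 score (the harmonic mean of precision and recall), it can be recomputed from the NER P and NER R values above. The helper name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# NER P and NER R from the 0.14.0 scores above.
print(round(f1_score(83.04, 81.56), 2))  # 82.29
```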

Release 0.13.0

21 Jul 08:21
  • Compatible with spaCy 3.6

Evaluation scores:

TAG 96.81
POS 96.79
MORPH 92.49
LEMMA 94.16
UAS 88.55
LAS 84.18
NER P 82.85
NER R 81.80
NER F 82.32

Release 0.12.0

01 Feb 18:12
  • Compatible with spaCy 3.5
  • Fixed word occurrence probabilities (they had been broken for the past several versions)

Evaluation scores:

TAG 96.72
POS 96.69
MORPH 92.75
LEMMA 94.19
UAS 87.28
LAS 83.21
NER P 83.00
NER R 81.41
NER F 82.20

Release 0.11.0

23 Jul 09:01
  • Ported to spaCy 3.4
  • Updated word vectors and word frequencies
  • Minor fixes to the lemmatization

Evaluation scores:

TAG 96.71
POS 96.85
MORPH 92.83
LEMMA 94.22
UAS 87.38
LAS 83.02
NER P 82.95
NER R 81.49
NER F 82.21

Release 0.10.0

07 May 07:47
  • Floret embedding vectors trained on MC4_fi_cleaned
  • Ported to spaCy 3.3.0. Older spaCy versions are not supported anymore.

Evaluation scores:

TAG 96.95
POS 96.83
MORPH 92.39
LEMMA 93.85
UAS 88.12
LAS 83.94
NER P 82.71
NER R 81.12
NER F 81.91

Release 0.10.0b1

09 Apr 08:35
  • Ported to spaCy 3.3.0.dev0. Older spaCy versions are not supported anymore.
  • Noun chunker now splits off appositions as independent phrases

Release 0.9.0

19 Jan 18:20
  • The pipeline now includes a named-entity recognizer (NER)

Evaluation scores:
TAG 96.75
POS 96.32
MORPH 92.31
LEMMA 93.82
UAS 87.69
LAS 83.38
NER P 82.32
NER R 80.53
NER F 81.41

Release 0.8.0

21 Nov 11:09
  • Ported to spaCy 3.2. Older spaCy versions are not supported anymore.
  • Vectors for out-of-vocabulary words generated by Floret embeddings
  • Uses the default spaCy morphologizer instead of the custom Voikko-based morphologizer

Evaluation scores:
TAG 96.93
POS 96.48
MORPH 92.46
LEMMA 93.84
UAS 87.60
LAS 83.33

Release 0.7.1

21 Aug 13:02
  • Works on Python 3.7 again

Evaluation scores:
TAG: 95.17
POS: 94.76
MORPH: 65.30
LEMMA: 93.35
UAS: 85.08
LAS: 79.82

License: MIT

Release 0.7.0

12 Jul 07:36
  • Compatibility with spaCy v3.1
  • Minor improvements to analysis: prefer non-compound words

Evaluation scores:
TAG: 95.17
POS: 94.76
MORPH: 65.30
LEMMA: 93.35
UAS: 85.08
LAS: 79.82

License: MIT