Releases: dkpro/dkpro-core

DKPro Core 2.4.0

17 Jun 10:18

DKPro Core is a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This is a feature release.

What's Changed

Full Changelog: dkpro-core-2.3.1...dkpro-core-2.4.0

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

DKPro Core 2.3.1

10 Jun 07:35

DKPro Core is a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This is a bugfix release.

What's Changed

Full Changelog: dkpro-core-2.3.0...dkpro-core-2.3.1

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

DKPro Core 2.3.0

01 Jan 12:54

We are pleased to announce the release of

DKPro Core 2.3.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This release updates many dependencies, removes several modules which are no longer viable and fixes a few bugs.

What's Changed

Full Changelog: rel/dkpro-core-2.2.0...dkpro-core-2.3.0
Thanks to all contributors!

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

DKPro Core 2.2.0

18 Jan 21:54

We are pleased to announce the release of

DKPro Core 2.2.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 3.

https://dkpro.github.io/dkpro-core

This is a feature release.

Notable changes since DKPro Core 2.1.0

  • io-brat: Fixed NPE when WebAnno-style slot feature does not have a role label
  • io-xmi: Added support for binary TSI
  • io-nif: Improved entity linking support
  • io-conll-u: Set div type on paragraphs
  • documentation: Make data format examples more easily copy/pastable
  • Updated various dependencies

A more detailed overview of the changes in this release can be found at [2].

Thanks to all contributors!

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.2.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.2.0

DKPro Core 2.1.0

01 Dec 21:23

We are pleased to announce the release of

DKPro Core 2.1.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 3.

https://dkpro.github.io/dkpro-core

This is a feature release.

Notable changes since DKPro Core 2.0.0

  • Added option to export XMI using XML 1.1 to avoid issues with certain characters
  • Added option to CoNLL readers to trim whitespace from field values so that incidental space characters no longer cause problems (enabled by default)
  • Added support for annotator notes in brat format
  • Improved speed for writing WebAnno TSV format (backported from WebAnno)
  • Fixed a couple of issues with the CoNLL 2012 format
  • Fixed default extension for CoNLL-U writer
  • Fixed problem in CoNLL-U writer when text contains line breaks
  • Fixed problem that LanguageToolChecker did not fill in suggestions
  • Fixed setting div type on paragraphs created by CoNLL-U reader
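
The whitespace-trimming option for CoNLL readers listed above can be pictured with a minimal sketch. This is not the DKPro Core reader code: the class ConllFieldTrimSketch, the method splitFields and the trim flag are hypothetical names chosen for illustration, and the sketch only assumes that CoNLL fields are separated by tabs.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the DKPro Core reader code. */
final class ConllFieldTrimSketch {

    /**
     * Splits one tab-separated CoNLL line into its fields.
     * When trim is true, leading and trailing whitespace is removed from each field.
     */
    static List<String> splitFields(String line, boolean trim) {
        List<String> fields = new ArrayList<>();
        // The -1 limit keeps trailing empty fields instead of silently dropping them.
        for (String field : line.split("\t", -1)) {
            fields.add(trim ? field.trim() : field);
        }
        return fields;
    }

    public static void main(String[] args) {
        // The form field carries an incidental trailing space.
        String line = "1\tHello \tNN";
        System.out.println(splitFields(line, true));
        System.out.println(splitFields(line, false));
    }
}
```

With trimming enabled, the form field compares equal to "Hello"; with trimming disabled, the stray space would be kept and could end up in the token text or tag value.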

A more detailed overview of the changes in this release can be found at [2].

Thanks to all contributors!

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.1.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.1.0

DKPro Core 1.12.0

01 Dec 15:11

We are pleased to announce the release of

DKPro Core 1.12.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 2.

https://dkpro.github.io/dkpro-core

This is a feature release.

Important upgrade notice

If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

Notable changes since DKPro Core 1.11.1

  • Added option to export XMI using XML 1.1 to avoid issues with certain characters
  • Added option to CoNLL readers to trim whitespace from field values so that incidental space characters no longer cause problems (enabled by default)
  • Added support for annotator notes in brat format
  • Improved speed for writing WebAnno TSV format (backported from WebAnno)
  • Fixed a couple of issues with the CoNLL 2012 format
  • Fixed default extension for CoNLL-U writer
  • Fixed problem in CoNLL-U writer when text contains line breaks
  • Fixed problem that LanguageToolChecker did not fill in suggestions

A more detailed overview of the changes in this release can be found at [2].

Thanks to all contributors!

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-1.12.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.12.0

DKPro Core 2.0.0

08 Sep 17:20

We are pleased to announce the release of

DKPro Core 2.0.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This is a feature release.

Important upgrade notice

This version requires UIMA v3.

If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

Notable changes since DKPro Core 1.11.1

  • Switched to UIMAv3
  • Added filling in of suggestions to LanguageToolChecker
  • Added support for notes to BratReader
  • Added basic read support for Perseus XML format
  • Improved error message when StanfordNamedEntityRecognizerTrainer is called without training data
  • Moved StopwordRemover to tokit module and removed stopwordremover module
  • Renamed lancaster module to smile
  • Removed Tag type from syntax module
  • ... and a few additional under-the-hood changes

A more detailed overview of the changes in this release can be found at [2].

Thanks for contributions go to: @alaindesilets, @mischor

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

[1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.0.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.0.0

DKPro Core 1.11.1

17 Aug 10:11

We are pleased to announce the release of

DKPro Core 1.11.1

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This is a bugfix release.

Important upgrade notice

If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

Notable changes since DKPro Core 1.11.0

  • Fixed trimming of whitespace at the start and end of annotations
  • Fixed encoding of named entity categories in LIF format
  • Fixed unescaping of URI-encoded characters when writing files
  • Added parameter to control whitespace normalization in HtmlDocumentReader
  • Added parameters to control indentation and output method in XmlDocumentWriter
  • Improved exception in Stanford CoreNLP NER trainer when no documents have been processed
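
Trimming whitespace at the start and end of an annotation, as mentioned in the list above, amounts to shrinking its begin and end offsets. The following is a minimal sketch of that idea in plain Java. It is not the DKPro Core implementation, and the class and method names are hypothetical.

```java
/** Illustrative sketch only; not the DKPro Core implementation. */
final class OffsetTrimSketch {

    /**
     * Shrinks the span [begin, end) so that it neither starts nor ends with whitespace.
     * Returns {begin, end}; an all-whitespace span collapses to an empty span
     * located at its original end offset.
     */
    static int[] trim(CharSequence text, int begin, int end) {
        int b = begin;
        int e = end;
        // Move the begin offset right past leading whitespace.
        while (b < e && Character.isWhitespace(text.charAt(b))) {
            b++;
        }
        // Move the end offset left past trailing whitespace.
        while (e > b && Character.isWhitespace(text.charAt(e - 1))) {
            e--;
        }
        return new int[] { b, e };
    }

    public static void main(String[] args) {
        String text = "  hello  world ";
        int[] span = trim(text, 0, 9);
        System.out.println(text.substring(span[0], span[1]));
    }
}
```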

A more detailed overview of the changes in this release can be found at [2].

Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck, @alaindesilets, @jcklie

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

[1] https://github.com/dkpro/dkpro-core/releases/tag/dkpro-core-1.11.0
[2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.11.1

DKPro Core 1.11.0

05 Jul 13:27

We are pleased to announce the release of

DKPro Core 1.11.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This is a feature release.

Important upgrade notice

  • Changed groupIds and artifactIds. The group ID is now org.dkpro.core and the artifact IDs are dkpro-core-...-(asl/gpl)
  • Changed package names. The packages are now all starting with org.dkpro.core... - except the packages of UIMA types which remain unchanged for data compatibility.
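
To make the artifact ID change concrete, here is a hedged sketch of how an old artifact ID might be rewritten to the new scheme. The notice above states only the new group ID (org.dkpro.core) and the new pattern (dkpro-core-...-(asl/gpl)). The old prefix de.tudarmstadt.ukp.dkpro.core. and the rule that dots in the module name become hyphens are assumptions made for this illustration, so verify each result against the artifacts that are actually published. The class and method names are hypothetical.

```java
/** Illustrative sketch only; verify every result against the published artifacts. */
final class CoordinateMigrationSketch {

    // Assumption: pre-1.11.0 artifact IDs started with this dotted prefix.
    static final String OLD_PREFIX = "de.tudarmstadt.ukp.dkpro.core.";

    // Stated in the upgrade notice.
    static final String NEW_GROUP_ID = "org.dkpro.core";

    /** Rewrites an old-style artifact ID; IDs without the old prefix are returned unchanged. */
    static String toNewArtifactId(String oldArtifactId) {
        if (!oldArtifactId.startsWith(OLD_PREFIX)) {
            return oldArtifactId;
        }
        String module = oldArtifactId.substring(OLD_PREFIX.length());
        // Assumption: dots inside the module name turn into hyphens.
        return "dkpro-core-" + module.replace('.', '-');
    }

    public static void main(String[] args) {
        String old = "de.tudarmstadt.ukp.dkpro.core.opennlp-asl";
        System.out.println(NEW_GROUP_ID + ":" + toNewArtifactId(old));
    }
}
```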

Notable changes since DKPro Core 1.10.0

  • Changed parts of the brat data conversion code such that it can be more easily used outside a UIMA component
  • Changed type mapping such that out-of-tagset types map to the generic type (e.g. an unknown POS tag maps to POS, not to POS_X)
  • Changed name of NYTCollectionReader to NitfReader
  • Added types to encode XML document structure in CAS
  • Added new XmlDocumentReader/Writer components using these types
  • Added basic reader for Annotated Gigaword corpus (only reads text so far) (thanks @az79nefy)
  • Added basic support for PubAnnotation JSON format
  • Added Maui component for keyword assignment
  • Added parameter to SfstAnnotator to enable lower-case lookup of first word in a sentence (thanks @rziai)
  • Added "order" feature to Token type
  • Added support for CoNLL-U document and paragraph IDs (thanks @manuelciosici)
  • Added support for CoNLL-U sentence IDs and text
  • Added standardized parameter to disable type mapping
  • Added support for TCF orthography layer using SofaChangeAnnotations
  • Added segmenter for Chinese using jieba (thanks @Horsmann)
  • Added MyStem for Russian
  • Added links to OpenMinTeD categories in type system documentation
  • Added support for reading/writing the CoreNLP CoNLL flavor
  • Added parameter to configure the Tika buffer size (useful for large documents)
  • Updated to OpenNLP 1.9.1
  • Updated to CoreNLP 3.9.2
  • Updated to ICU4J 64.2
  • Updated to Tika 1.19.1
  • Updated to LanguageTool 4.3
  • Updated to PDFBox 2.0.12
  • Updated IllinoisNLP components
  • Updated TreeTagger models/binaries in build.xml script (thanks @tilmanbeck)
  • Updated LIF dependencies
  • Updated dataset descriptions
  • Updated various general dependencies (e.g. Apache Commons etc.)
  • Improved robustness of checksum verification for text files used in datasets (e.g. license files)
  • Improved error messages in WebAnno TSV3 module
  • Fixed crash in WebannoTsv3XWriter when annotations do not start/end at token boundaries
  • Fixed bug in WebAnno TSV3 support causing span annotations with slot features to disappear
  • Fixed trimming of whitespace in TeiReader
  • Fixed bug in NifWriter causing named entity identifier not to be written
  • Fixed crash in BratReader when reading discontinuous segments
  • Fixed problem in BratWriter when dealing with slot features
  • Fixed metadata of CoNLL2012Writer
  • Fixed potential problem of datasets being written outside their target directory
  • Dropped the GrAF I/O module since the upstream libraries are outdated and no longer maintained
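
The type-mapping change in the list above (an unknown POS tag maps to POS, not to POS_X) boils down to a lookup with a generic fallback. The sketch below shows that behaviour with a plain map. The class name, the method name and the two example entries are hypothetical; real mappings are tagset-specific and this does not use the DKPro Core mapping API.

```java
import java.util.Map;

/** Illustrative sketch only; not the DKPro Core mapping provider. */
final class TypeMappingSketch {

    /** Generic type used for tags that are not part of the tagset. */
    static final String GENERIC_TYPE = "POS";

    // Example entries for illustration; real mappings depend on the tagset.
    static final Map<String, String> MAPPING = Map.of(
            "NN", "POS_NOUN",
            "VB", "POS_VERB");

    /** Resolves a tag to a type name, falling back to the generic type for unknown tags. */
    static String resolve(String tag) {
        return MAPPING.getOrDefault(tag, GENERIC_TYPE);
    }

    public static void main(String[] args) {
        System.out.println(resolve("NN"));
        System.out.println(resolve("NOT_IN_TAGSET"));
    }
}
```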

A more detailed overview of the changes in this release can be found here.

Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

DKPro Core 1.10.0

10 Sep 18:42

We are pleased to announce the release of

DKPro Core 1.10.0

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

https://dkpro.github.io/dkpro-core

This is a feature release.

Notable changes since DKPro Core 1.9.3

  • Added support for Arabic to CoreNlpSegmenter (thanks @Jibun)
  • Added support for Token "form" to CoNLL writers (thanks @Jibun)
  • Added ability to provide extra non-standard parameters to CoreNlpSegmenter (thanks @Jibun)
  • Added ArkTweet POS tagger trainer (thanks @schrieveslaach)
  • Added WebAnno TSV3 reader/writer
  • Added reader for Leipzig Corpora Collection
  • Upgraded to CoreNLP 3.9.1 (stanfordnlp and corenlp modules)
  • Upgraded to OpenNLP 1.9.0
  • Upgraded to PDFBox 2.0.9 (io-pdf module)
  • Upgraded to LanguageTool 4.2
  • Upgraded to CogComp 4.0.7 (lbj module)
  • Upgraded to Tika 1.18 (io-tika module)
  • Improved handling of multi-line annotations in brat module (thanks @parisni)
  • Fixed discontinuous annotations crashing the brat reader by reading only the first fragment
  • Added dataset description for GUM 4.1.0 dataset
  • Removed PARAM_INTERN_TAGS
  • Improved component metadata
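
The workaround for discontinuous annotations noted above (reading only the first fragment) can be sketched as follows. In the brat standoff format, a discontinuous text-bound annotation lists its fragments as offset pairs separated by semicolons, for example "0 4;10 15". The class and method names here are hypothetical and this is not the DKPro Core reader code.

```java
/** Illustrative sketch only; not the DKPro Core brat reader. */
final class BratFirstFragmentSketch {

    /**
     * Returns {begin, end} of the first fragment in a brat offset string.
     * A continuous annotation such as "7 12" has exactly one fragment.
     */
    static int[] firstFragment(String offsets) {
        // Keep only the text before the first semicolon, if there is one.
        String first = offsets.split(";", 2)[0].trim();
        String[] parts = first.split("\\s+");
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    public static void main(String[] args) {
        int[] span = firstFragment("0 4;10 15");
        System.out.println(span[0] + "-" + span[1]);
    }
}
```

Dropping the later fragments loses information, but it avoids the crash and still yields a contiguous span.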

A more detailed overview of the changes in this release can be found here.

Thanks for contributions go to: @Jibun, @parisni, @schrieveslaach, @jgrivolla

When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.