Version 2.7.0

@jbaker-dstl released this 08 May 19:56
· 14 commits to master since this release
bb88621

Version 2.7 incorporates a number of changes and improvements, including some breaking changes.

New Functionality

  • Ability to include other YAML files within YAML configuration files
  • Annotator regex.Mgrs now adds GeoJSON to extracted coordinates
  • New annotator regex.NaiveParagraph to naively annotate paragraphs based on multiple new lines
  • New annotator triage.TokenFrequencySummarisation to use a token frequency approach to document summarisation
  • New options on CsvFolderReader collection reader to add line numbers and reprocess files that are modified
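As a rough illustration of what "naively annotating paragraphs based on multiple new lines" can mean, the sketch below splits text wherever two or more consecutive line breaks occur and records character offsets for each paragraph. It is written in Python for brevity and is not Baleen's implementation: the regular expression, the function name `naive_paragraphs`, and the tuple layout are all assumptions made for this example.

```python
import re
from typing import List, Tuple

# Two or more line breaks, optionally with spaces or tabs between them,
# are treated as a paragraph boundary. This pattern is an assumption for
# illustration and is not taken from regex.NaiveParagraph.
PARAGRAPH_BREAK = re.compile(r"(?:\r?\n[ \t]*){2,}")


def naive_paragraphs(text: str) -> List[Tuple[int, int, str]]:
    """Return (begin, end, covered_text) for each non-empty paragraph."""
    spans = []
    start = 0
    for match in PARAGRAPH_BREAK.finditer(text):
        # Everything between the previous boundary and this one is a paragraph.
        if match.start() > start:
            spans.append((start, match.start(), text[start:match.start()]))
        start = match.end()
    # Whatever follows the last boundary is the final paragraph.
    if start < len(text):
        spans.append((start, len(text), text[start:]))
    return spans


if __name__ == "__main__":
    sample = "First line\nstill first.\n\nSecond paragraph.\n\n\nThird."
    for begin, end, covered in naive_paragraphs(sample):
        print(begin, end, repr(covered))
```

A single line break does not end a paragraph in this sketch, so hard-wrapped text stays together. The approach is naive in that it knows nothing about headings, lists, or tables.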

Updates and Bug Fixes

  • Code quality improvements based on feedback from Codacy
  • Integration with CI tools
  • Set ContentType on Elasticsearch REST requests
  • Support for both Java 8 and newer versions (Java 9+, tested against Java 11)
  • Update dependencies to newer versions
  • Update underlying framework to UimaFIT 3
  • Use synchronous requests in Plankton to avoid race conditions
  • Minor bug fixes, typo corrections, etc.
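For context on the ContentType item above: Elasticsearch 6.0 and later reject REST requests that carry a body but no explicit `Content-Type` header. The sketch below shows the general idea in Python using only the standard library. It builds a request without sending it. The function name `build_index_request`, the base URL, the index name, and the `/_doc` path are hypothetical values chosen for illustration and do not come from Baleen's code.

```python
import json
import urllib.request


def build_index_request(base_url: str, index: str, document: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON indexing request with an explicit Content-Type."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",  # hypothetical endpoint layout
        data=body,
        # Without this header, recent Elasticsearch versions refuse the request.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_index_request(
        "http://localhost:9200", "example_index", {"content": "Hello"}
    )
    print(request.get_method(), request.full_url)
    print(request.header_items())
```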

Breaking Changes

  • Content Extractors are now first-class citizens in Baleen and have their own section in pipeline configuration files. Existing pipeline files must be updated; otherwise the content extractor may be incorrectly configured. For more information, see What's New in Baleen 2.7.0.

For a complete list of changes, see the Git commit log.