Configuring History

James Baker edited this page Jan 9, 2017 · 1 revision

Baleen can record history information for every entity within a document. This information includes, for example, which annotator created the entity, what it was merged with, and how it was modified.

By default, this information is logged to the console. However, it can instead be saved to MongoDB or Elasticsearch, or disabled completely. This is done by adding a history block to the pipeline configuration YAML file.

To configure for MongoDB, add the following:

history:
  class: uk.gov.dstl.baleen.history.mongo.MongoHistory

To configure for Elasticsearch, add the following:

history:
  class: uk.gov.dstl.baleen.history.elasticsearch.ElasticsearchHistory

To disable the history, add the following:

history:
  class: uk.gov.dstl.baleen.core.history.noop.NoopBaleenHistory
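
To show where a history block sits in practice, here is a minimal sketch of a complete pipeline configuration file. Only the history block is taken from this page. The collectionreader, annotators and consumers entries (FolderReader, regex.Email, print.Entities, and the ./input folder) are illustrative assumptions and should be replaced with whatever your pipeline actually uses.

```yaml
# Sketch of a pipeline configuration file.
# The history block is a top-level key, alongside the rest of the pipeline.
history:
  class: uk.gov.dstl.baleen.history.mongo.MongoHistory

# Everything below is an assumed placeholder, not taken from this page.
collectionreader:
  class: FolderReader
  folders:
    - ./input

annotators:
  - regex.Email

consumers:
  - print.Entities
```

Swapping the class value for one of the other implementations listed on this page changes where the history is stored, or disables it, without touching the rest of the pipeline.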

Additional configuration may be possible; see the Javadoc of the relevant history implementation for full details. More information is also available in the Javadoc for PipelineCpeBuilder.

Other history implementations exist and can be configured in the same way. If no history is configured, LoggingBaleenHistory is used, which writes all history to the currently configured logs.