Annotators
James Baker edited this page May 9, 2017 · 4 revisions
The following annotators are included in the Baleen 2.2 release, categorised below by the technique they use to do entity extraction.
Full documentation of all Baleen components is available in the Baleen Javadoc.
- AddGenderToPerson
- AddTitleToPerson
- Blacklist
- CleanPunctuation
- CleanTemporal
- CollapseLocations
- CurrencyDetection
- EntityInitials
- ExpandLocationToDescription
- MergeAdjacent
- MergeAdjacentQuantities
- MergeNationalityIntoEntity
- NaiveMergeRelations
- NormalizeOSGB
- NormalizeTemporal
- NormalizeWhitespace
- ReferentToEntity
- RelationTypeFilter
- RemoveLowConfidenceEntities
- RemoveNestedEntities
- RemoveNestedLocations
- RemoveOverlappingEntities
- SplitBrackets
- Surname
Some of these currently appear as cleaners in the code.
- CorefBrackets
- CorefCapitalisationAndApostrophe
- SieveCoreference
- Country
- File
- Mongo
- MongoRegex
- MongoStemming
- NPAtCoordinate
- NPElement
- NPLocation
- NPOrganisation
- NPTitleEntity
- QuantityNPEntity
- TOLocationEntity
- AssignTypeToInteraction
- PatternExtractor
- RemoveInteractionInEntities
- MaltParser
- OpenNLP
- OpenNLPParser
- WordNetLemmatizer
- AddSourceToMetadata
- CommonKeywords
- DocumentTypeByFilename
- DocumentTypeByLocation
- DocumentTypeByParameter
- FullDocument
- GenericMilitaryPlatform
- GenericVehicle
- GenericWeapon
- MentionedAgain
- NationalityToLocation
- OrganisationPersonRole
- People
- Pronouns
- RakeKeywords
- Area
- BritishArmyUnits
- Callsign
- CasRegistryNumber
- Custom
- Date
- DateTime
- Distance
- DocumentNumber
- Dtg
- FlightNumber
- Frequency
- Hms
- IpV4
- LatLon
- Mgrs
- Money
- Nationality
- Osgb
- Postcode
- RelativeDate
- SocialMediaUsername
- TaskForce
- Telephone
- Time
- TimeQuantity
- USTelephone
- UnqualifiedDate
- Url
- Volume
- Weight
- NPVNP
- SimpleInteraction
- UbmreConstituent
- UbmreDependency
- DocumentLanguage
- DocumentType
- OpenNLP
- StructuralEntity
- StructuralRelation
- TableEntity
- TableRelation
- TemplateAnnotator
- TemplateFieldDefinitionAnnotator
- TemplateFieldJoiningAnnotator
- TemplateFieldToEntityAnnotator
- TemplateRecordDefinitionAnnotator
- TemplateValidator