Skip to content

🦞 Rust library of natural language dictionaries using character-wise double-array tries.

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

daac-tools/crawdad

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🦞 Crawdad: ChaRActer-Wise Double-Array Dictionary

Crates.io Documentation Build Status Slack

Overview

Crawdad is a library of natural language dictionaries using character-wise double-array tries. The implementation is optimized for strings of multibyte-characters, and you can enjoy fast text processing on strings such as Japanese or Chinese.

For example, on a large Japanese dictionary of IPADIC+Neologd, Crawdad has a better time-space tradeoff than other Rust libraries.

The detailed experimental settings and other results are available on Wiki.

What can do

  • Key-value mapping: Crawdad stores a set of string keys with mapping arbitrary integer values.
  • Exact match: Crawdad supports a fast lookup for an input key.
  • Common prefix search: Crawdad supports fast common prefix search that can be used to enumerate all keys appearing in a text.

Data structures

Crawdad contains the two trie implementations:

  • crawdad::Trie is a standard trie form that often provides the fastest queries.
  • crawdad::MpTrie is a minimal-prefix trie form that is memory-efficient for long strings.

Slack

We have a Slack workspace for developers and users to ask questions and discuss a variety of topics.

License

Licensed under either of

at your option.

Acknowledgment

The initial version of this software was developed by LegalOn Technologies, Inc., but not an officially supported LegalOn Technologies product.

Contribution

See the guidelines.

About

🦞 Rust library of natural language dictionaries using character-wise double-array tries.

Topics

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Packages

No packages published

Languages