Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Trie #161

Open
sepehrs1378 opened this issue Feb 4, 2021 · 4 comments
Open

Add Trie #161

sepehrs1378 opened this issue Feb 4, 2021 · 4 comments

Comments

@sepehrs1378
Copy link

I implemented a simple version of the trie data structure and I guess it can be a part of your project. Hope this will be helpful.

@hoiyd
Copy link

hoiyd commented Apr 3, 2022

So any updates with this Trie feature? This is gonna be really helpful.

@emirpasic
Copy link
Owner

For sure, I have looked into the implementation and am familiarizing myself with the technical details.

Usually a trie is implemented on top of a radix tree, so that's the piece I was wondering, if we should start with radix tree first and then on top of that build the trie? @sepehrs1378

@sepehrs1378
Copy link
Author

Sorry but I won't be able to help. Thanks for mentioning me though.

@drew-512
Copy link

drew-512 commented May 4, 2023

I have some thoughts, coming from C# land where generics are rather pathetic and non-performing (been with Go 5 years now).

In my search, I looked at 20+ different implementations and every single one was meh or had essential features missing. A common choice is to have a hashtable at each node -- which is wasteful if you have a large sparse trie with many nodes (common) which also doesn't support ordinal iteration. Meanwhile, a pure radix implementation is easily wasteful (or slow) on one rune per node approach for long and/or sparse keys (common in file systems).

So after finding nothing and frustrated, I bit the bullet and made a trie impl that screams, is lightweight, scales well, supports case insensitive keys at no cost, and supports forward or reverse iteration. Happy to share more and invest in this if the interest is there.

Core characteristics:

  • Use cases ~1M entries or less. Can support any size but at that point, why not just use badger in memory mode, which also offers a boatload of killer parallelization features and can be made a disk-based db at the flick of an option.
  • Must support iteration from any starting key (existent or not)
  • Must support case insensitive keys (making it usable for file systems)
  • Uses strings as keys with a user-given rune designated as a node path separator. Alternatively, using a generic as a path character basically locks-in a pure-radix implementation which has drawbacks (mentioned above). A way to avoid those drawbacks is to dynamically fork character sequences as new entries arrive. Doing that however is a metric butt-ton of work and a major undertaking (and why projects like badger are so valuable). Since almost all use cases of trie are with strings (of bytes or runes), rune/byte strings over generic character strings is the clear choice imo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants