Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Starting an iterator in the middle of a values list in a label-ordered RecordTrie #30

Open
gmossessian opened this issue Jul 6, 2016 · 0 comments
Labels

Comments

@gmossessian
Copy link

gmossessian commented Jul 6, 2016

Hi Mike,

Thanks for building this wrapper and providing the additional Bytes and RecordTrie classes, they're great and extremely useful and have been largely easy to build additional features into.

I would like to implement a RecordTrie feature as follows: if there are key-value pairs (u'a', (1, N_1)), (u'a', (2, N_2)), (u'a',(3, N_3)), ..., (u'a', (i, N_i)), and I know that I only need the values stored in N_p through N_q. If i is very large (say, a million), calling my_list = my_trie[u'a'] is prohibitively slow, and so I would like to start the loop at the pth value, that is, at b_prefix = <bytes>u'a'.encode('utf8') + self._b_value_separator + <bytes>bytes(struct(">I", p)).

Of course, if I set Agent key to this, the loop will not continue on to (u'a', (p+1, N_{p+1})), and I cannot figure out how to "trick" the predictive search into continuing to loop past that specific prefix and on to anything with prefix simply <bytes>u'a'.encode('utf8') + self._b_value_separator without resetting the entire loop from the top.

Do you have any idea of how this might be accomplished, or if it is a limitation of the marisa-trie base library?

Thanks,
-George

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants