Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't get value using key with null char (marisa_trie.Trie) #47

Open
fvarrebola opened this issue Jun 24, 2018 · 0 comments
Open

Can't get value using key with null char (marisa_trie.Trie) #47

fvarrebola opened this issue Jun 24, 2018 · 0 comments

Comments

@fvarrebola
Copy link

fvarrebola commented Jun 24, 2018

I might be missing something out here, but I believe there is a consistency issue with the Trie implementation and keys that have null characters.

Take a look at the following code snippet:

key = 'Random\x00Key'
python_dict = { key : 'random_value' }
key in python_dict # prints True
python_dict.get(key) # returns 'random_value'

std_trie = marisa_trie.Trie(python_dict)
key in std_trie # prints True
std_trie.keys() # prints ['Random\x00Key']
std_trie.get(key) # should return 'random_value'
std_trie.key_id(key) # should return the key id

What happens is that std_trie.get(key) actually returns None, and std_trie.key_id(key) throws a KeyError exception with the following trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/marisa_trie.pyx", line 409, in marisa_trie.Trie.key_id
  File "src/marisa_trie.pyx", line 417, in marisa_trie.Trie.key_id
KeyError: 'Random\x00Key'

Apparently, the RecordTrie implementation is immune to this consistency issue.

r_trie = marisa_trie.RecordTrie('<H', zip([key], [(1,)]))
key in r_trie # prints True
r_trie.keys() # prints ['Random\x00Key']
r_trie.get(key) # prints [(1,)]

By the way, if you confirm this issue as an actual bug, also check DAWG. I haven't used it extensively, but calling dawg.DAWG(python_dict) should throw the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "dawg.pyx", line 45, in dawg.DAWG.__init__ (src\dawg.cpp:2147)
  File "dawg.pyx", line 70, in dawg.DAWG._build_from_iterable (src\dawg.cpp:2570)
dawg.Error: Can't insert key b'Random\x00Key' (with value 0)

I'm running Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32 and using marisa-trie 0.7.5 and DAWG 0.7.8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants