Problems with tokyocabinet as the memory storage engine #192

Open
gavin-norman-sociomantic opened this issue Nov 20, 2018 · 0 comments
gavin-norman-sociomantic commented Nov 20, 2018

Tokyocabinet has several behaviours which are not ideal for our usage and which waste CPU time.

  1. Keys are treated as strings, which results in every key being hashed twice.
  2. A malloc/free is performed on every get (search for 'free' in ocean.db.tokyocabinet.TokyoCabinetM). TC copies its internal record into a malloced buffer and passes that buffer to our get method, which then copies it again into the receiving D buffer. That is two copies per get, plus a malloc/free cycle that is, in theory, unnecessary.
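
To make point 2 concrete, here is a minimal C sketch of the two get paths. It does not use tokyocabinet; `toy_record`, `toy_get_malloced`, `get_via_malloc` and `get_direct` are hypothetical names invented for illustration, and the "engine" is just a single record. The first path mirrors the pattern described above (malloced copy, second copy, free); the second copies straight from the internal record into the caller's buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a record held inside the storage engine. */
typedef struct {
    const char *data;
    size_t len;
} toy_record;

/* Engine side of the current pattern: return a malloced copy of the
 * internal record. The caller is responsible for freeing it. */
void *toy_get_malloced(const toy_record *rec, size_t *len_out)
{
    void *copy = malloc(rec->len ? rec->len : 1);
    if (copy == NULL) return NULL;
    memcpy(copy, rec->data, rec->len);   /* copy 1: record -> malloced buffer */
    *len_out = rec->len;
    return copy;
}

/* Caller side of the current pattern: copy again into the destination
 * buffer, then free. Returns the bytes copied, or 0 on failure. */
size_t get_via_malloc(const toy_record *rec, char *dst, size_t dst_cap)
{
    size_t len = 0;
    void *tmp = toy_get_malloced(rec, &len);
    if (tmp == NULL) return 0;
    if (len > dst_cap) { free(tmp); return 0; }
    memcpy(dst, tmp, len);               /* copy 2: malloced buffer -> dst */
    free(tmp);
    return len;
}

/* Alternative: one copy, no allocation. Returns the bytes copied, or 0
 * if the destination buffer is too small. */
size_t get_direct(const toy_record *rec, char *dst, size_t dst_cap)
{
    assert(dst != NULL);
    if (rec->len > dst_cap) return 0;
    memcpy(dst, rec->data, rec->len);    /* single copy: record -> dst */
    return rec->len;
}
```

Both functions leave the same bytes in `dst`; the difference is only the extra allocation and copy on the first path. Returning 0 for failure is a simplification of the sketch and cannot be told apart from an empty record.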

It may be possible to fix these issues, either by modifying tokyocabinet itself or by implementing a new storage engine in D, perhaps based on ocean.util.container.map.HashMap. (The latter would require implementing a proper step iterator in the HashMap.)
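
As a rough illustration of what a step iterator could look like, here is a C sketch over a toy chained hash map. This is one reading of "step iterator": a cursor that hands out one record per call and can be resumed later. It is not based on the actual ocean HashMap, and every name in it (`toy_map`, `toy_iter`, `toy_iter_next`, ...) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 8

/* Hypothetical map node; nodes are owned by the caller in this sketch. */
typedef struct toy_node {
    uint64_t key;
    int value;
    struct toy_node *next;
} toy_node;

typedef struct {
    toy_node *buckets[TOY_BUCKETS];
} toy_map;

/* Cursor state: which bucket we are in and which node we last returned. */
typedef struct {
    const toy_map *map;
    size_t bucket;
    const toy_node *node;
} toy_iter;

/* Push a caller-owned node onto the front of its bucket's chain. */
void toy_map_put(toy_map *m, toy_node *n)
{
    size_t b = (size_t)(n->key % TOY_BUCKETS);
    n->next = m->buckets[b];
    m->buckets[b] = n;
}

void toy_iter_init(toy_iter *it, const toy_map *m)
{
    it->map = m;
    it->bucket = 0;
    it->node = NULL;
}

/* Advance by exactly one record. Returns NULL once all records have
 * been visited, and keeps returning NULL on further calls. */
const toy_node *toy_iter_next(toy_iter *it)
{
    if (it->node != NULL) {
        /* Continue along the current chain first. */
        it->node = it->node->next;
        if (it->node != NULL) return it->node;
        it->bucket++;
    }
    /* Otherwise scan forward to the next non-empty bucket. */
    while (it->bucket < TOY_BUCKETS) {
        it->node = it->map->buckets[it->bucket];
        if (it->node != NULL) return it->node;
        it->bucket++;
    }
    return NULL;
}
```

The sketch assumes the map is not modified between steps. A real implementation would have to decide what happens when records are added or removed while an iteration is in progress.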

Additional not-so-ideal points about tokyocabinet:

  • Internally, 32-bit hashes are used.
  • From reading the tcmapput() function, there seems to be no rehashing at all. This makes the initial num_records and load_factor settings very important! In our case, though, the absence of rehashing may actually be a good thing.
  • There seems to be no built-in way to minimise the allocated memory of the database after a large number of records have been removed. This can happen, for example, after a data redistribution.
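
To show why the initial settings matter when there is no rehashing, here is a small arithmetic sketch in C. It assumes, purely for illustration, that the bucket count is fixed once at creation as expected records divided by the target load factor; how num_records and load_factor are actually turned into a bucket count is not stated above, and the function names are hypothetical.

```c
#include <stddef.h>

/* Assumption for this sketch: the bucket count is chosen once, from the
 * expected number of records and the target load factor, and never
 * changes afterwards (no rehashing). */
size_t toy_bucket_count(size_t expected_records, double load_factor)
{
    return (size_t)((double)expected_records / load_factor) + 1;
}

/* Average number of records per bucket for a given population. With a
 * fixed bucket count this grows linearly with the number of records. */
double toy_avg_per_bucket(size_t records, size_t buckets)
{
    return (double)records / (double)buckets;
}
```

With 1000 expected records and a load factor of 0.75 the sketch gives 1334 buckets, so the average stays below one record per bucket at the expected size. If the map instead ends up holding 10000 records, the same 1334 buckets hold about 7.5 records each, and every lookup has correspondingly more entries to search within its bucket.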