Parallel tdb_cons_add() and tdb_cons_finalize() #126

Open
tuulos opened this issue Mar 11, 2017 · 0 comments
tuulos commented Mar 11, 2017

Currently it is hard to utilize multiple cores or SSDs during TrailDB creation. The tdb_cons functions can be expensive with large amounts of data, so it would be beneficial to utilize CPU and IO resources more efficiently.

Here is an idea for how this could be achieved:

Parallel tdb_cons_add()

  • K worker threads, each with a bounded queue for incoming events.
  • tdb_cons_add() shards by uuid over the K threads, shuffling events to the queues.
  • Each thread has its own write buffer, which is flushed to disk independently.
  • Lexicons need to be shared, and a naive implementation could cause massive lock contention. Instead:
    • For small low-entropy fields, maintain a copy of the lexicon in each thread. If an entry is found in the local lexicon, nothing needs to be locked; a missing key forces all copies to be synchronized.
    • For large high-entropy fields, shard the lexicon into M shards. The shards are shared but locked individually, so most lookups can proceed without any contention.

Parallel tdb_cons_finalize()

  • This is profiling output from an expensive tdb_cons_finalize() call:
PROF: encoder/store_lexicons took 166505ms
PROF: encoder/store_uuids took 254ms
PROF: encoder/store_version took 0ms
PROF: trail/groupby_uuid took 839064ms
PROF: trail/info took 1ms
PROF: trail/collect_unigrams took 902892ms
PROF: encode_model/find_candidates took 992ms
PROF: encode_model/choose_grams took 829878ms
PROF: trail/gram_freqs took 830871ms
PROF: huffman/sort_symbols took 5014ms
PROF: huffman/huffman_code took 22ms
PROF: huffman/make_codemap took 2ms
PROF: trail/huff_create_codemap took 5058ms
PROF: trail/encode_trails took 1762364ms
PROF: trail/store_codebook took 3ms
PROF: encoder/encode took 4345248ms

A good candidate for parallelization is encode_trails, which can be sharded by cookie quite easily, since each trail is encoded independently. The other functions may require more thought.
