Atomic nibble instead of mutex #1601

betatim · 2017-01-30T15:48:49Z

NibbleStorage uses a big (bad) mutex. This is pretty slow. This PR removes that and instead uses an array of atomic bytes.
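
As a rough illustration of the idea described above (not the code in this PR), the sketch below packs two 4-bit counters into each atomic byte and updates them with a compare-and-swap loop, saturating at 15 to avoid overflow. The class name `AtomicNibbleTable`, the constant `kMaxNibble`, the low-nibble-first layout, and the relaxed memory ordering are all assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest value a 4-bit counter can hold (hypothetical name).
constexpr unsigned kMaxNibble = 15;

// Hypothetical stand-in for a nibble-packed counter table.
// Assumed layout: even indices use the low nibble, odd indices the high nibble.
class AtomicNibbleTable {
public:
    explicit AtomicNibbleTable(std::size_t n_counters)
        : _bytes((n_counters + 1) / 2)
    {
        for (auto& b : _bytes) {
            b.store(0, std::memory_order_relaxed);
        }
    }

    // Add one to counter `idx`, saturating at kMaxNibble. Returns the new count.
    unsigned add(std::size_t idx)
    {
        assert(idx / 2 < _bytes.size());
        std::atomic<std::uint8_t>& cell = _bytes[idx / 2];
        const unsigned shift = static_cast<unsigned>(idx % 2) * 4;
        std::uint8_t seen = cell.load(std::memory_order_relaxed);
        for (;;) {
            const unsigned count = (seen >> shift) & 0x0Fu;
            if (count == kMaxNibble) {
                return count;  // already saturated, leave the byte alone
            }
            // count < 15, so adding 1 to this nibble cannot carry into its neighbour.
            const std::uint8_t wanted =
                static_cast<std::uint8_t>(seen + (1u << shift));
            // On failure `seen` is refreshed with the current byte and we retry,
            // so a concurrent update to the neighbouring nibble is never lost.
            if (cell.compare_exchange_weak(seen, wanted,
                                           std::memory_order_relaxed)) {
                return count + 1;
            }
        }
    }

    unsigned get(std::size_t idx) const
    {
        assert(idx / 2 < _bytes.size());
        const unsigned shift = static_cast<unsigned>(idx % 2) * 4;
        return (_bytes[idx / 2].load(std::memory_order_relaxed) >> shift) & 0x0Fu;
    }

private:
    std::vector<std::atomic<std::uint8_t>> _bytes;
};
```

Because each retry recomputes the new byte from the freshly observed value, two threads bumping the two nibbles of the same byte can both succeed without a lock.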

- Is it mergeable?
- `make test` Did it pass the tests?
- `make clean diff-cover` If it introduces new functionality in scripts/ is it tested?
- `make format diff_pylint_report cppcheck doc pydocstyle` Is it well formatted?
- Did it change the command-line interface? Only backwards-compatible additions are allowed without a major version increment. Changing file formats also requires a major version number increment.
- For substantial changes or changes to the command-line interface, is it documented in CHANGELOG.md? See keepachangelog for more details.
- Was a spellchecker run on the source code and documentation after changes were made?
- Do the changes respect streaming IO? (Are they tested for streaming IO?)

betatim · 2017-01-30T15:49:20Z

lib/storage.hh

@@ -393,7 +403,7 @@ public:

    Byte ** get_raw_tables()
    {
-        return _counts;
+        return (Byte **)_counts;


Completely evil :-/

Not sure what to do about this. What is the use case for get_raw_tables() beyond nosy devs wanting to peek inside?

ctb · 2017-01-31T14:26:16Z

the explicit rationale is somewhere in the merged pull request history... seek answers there :)

betatim · 2017-01-31T14:49:01Z

Seems like this was the first time this came up: #667. From what I gather from the discussion there, what get_raw_tables returns for anything using <8 bits per bucket is not what they were expecting to get. The same goes for the Node* classes, no?

I don't think we can use the buffer interface to fake the "round up to nearest byte".

Should we move it up the inheritance tree to expose it only for Count* classes?
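
To make the mismatch concrete, here is a small sketch of what a byte-oriented consumer would need to do to recover per-bucket counts from a nibble-packed table. The helper name `unpack_nibbles` and the low-nibble-first layout are assumptions for illustration, not taken from the khmer sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expand a nibble-packed table into one byte per bucket.
// Assumes even-numbered buckets live in the low nibble of each byte.
std::vector<std::uint8_t> unpack_nibbles(const std::vector<std::uint8_t>& packed,
                                         std::size_t n_buckets)
{
    std::vector<std::uint8_t> counts(n_buckets);
    for (std::size_t i = 0; i < n_buckets; ++i) {
        const unsigned shift = static_cast<unsigned>(i % 2) * 4;
        counts[i] = static_cast<std::uint8_t>((packed[i / 2] >> shift) & 0x0F);
    }
    return counts;
}
```

Under these assumptions, a raw byte of 0x21 holds the counts 1 and 2, whereas reading the table one byte per bucket would report a single bucket with count 33.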

standage · 2017-01-31T18:10:28Z

> Should we move it up the inheritance tree to expose it only for Count* classes?

I think this is reasonable.

betatim · 2017-02-01T15:23:34Z

get_raw_tables has been moved and tests adjusted.

Ready for review! @luizirber, @camillescott, @standage , or @ctb

codecov-io · 2017-02-14T17:50:52Z

Codecov Report

Merging #1601 into master will decrease coverage by <.01%.
The diff coverage is 52.94%.

@@            Coverage Diff            @@
##           master   #1601      +/-   ##
=========================================
- Coverage   70.11%   70.1%   -0.01%     
=========================================
  Files          66      66              
  Lines        8906    8877      -29     
  Branches     3009    2999      -10     
=========================================
- Hits         6244    6223      -21     
- Misses       1040    1041       +1     
+ Partials     1622    1613       -9

| Impacted Files | Coverage Δ |
|---|---|
| khmer/_khmer.cc | `57.1% <ø> (ø)` ⬆️ |
| lib/hashtable.hh | `82.6% <ø> (-0.38%)` ⬇️ |
| khmer/_cpy_smallcountgraph.hh | `52% <ø> (-2.77%)` ⬇️ |
| lib/storage.cc | `47.88% <0%> (ø)` ⬆️ |
| lib/hashgraph.hh | `70.96% <100%> (+0.96%)` ⬆️ |
| lib/storage.hh | `86.86% <57.14%> (-2.83%)` ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6dd8430...f47c07c.

betatim force-pushed the feature/atomic-nibble branch 2 times, most recently from 5229519 to f422e07 Compare January 31, 2017 17:57

betatim force-pushed the feature/atomic-nibble branch 2 times, most recently from a870206 to 641436f Compare February 1, 2017 15:16

betatim force-pushed the feature/atomic-nibble branch from 988cf32 to 921bf9b Compare February 1, 2017 15:29

betatim added 5 commits February 14, 2017 19:04

Switch to using atomic bytes for NibbleStorage

9c0a2a5

Use individual bytes that can be updated atomically instead of mutexes.

Fix overflow handling

a525c0b

Move get_raw_tables to Countgraph and Counttable only

6cbcd7f

Node* or SmallCount* objects pack things into individual bytes, which makes the raw table pointers unsuitable for numpy.frombuffer

Remove unused leftovers from the mutex

75c2820

Add a comment explaining the compare-and-swap loop

f47c07c

betatim force-pushed the feature/atomic-nibble branch from ea7bb2d to f47c07c Compare February 14, 2017 18:06
