-
-
Notifications
You must be signed in to change notification settings - Fork 392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RBTree for set #2313
base: master
Are you sure you want to change the base?
RBTree for set #2313
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2313 +/- ##
==========================================
- Coverage 83.53% 83.48% -0.05%
==========================================
Files 376 376
Lines 61640 61732 +92
==========================================
+ Hits 51489 51535 +46
- Misses 10151 10197 +46
... and 25 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
Unfortunately I won't have time to look at this for a while. Whoever does the review, please check that there is no performance degradation in the |
I'll take a look at it early next week. |
Benchmark suggestion: Create a random graph, record its degree sequence, use that in Undirected suggestion:
Directed suggestion:
|
FYI, this is not forgotten, I've started writing the benchmark, hopefully it will be ready today. |
Benchmark results are in; it looks like the proposal in this PR did not improve performance, although there's probably still room for optimization. Old version: Proposal in this PR: Compiled in release mode, with LTO. I don't have time to debug this further, but it's probably worth looking at the code with a profiler. |
One caveat with the benchmark above is that the implementation of |
src/core/set.c
Outdated
if(newNode == NULL){ | ||
IGRAPH_CHECK_OOM(newNode, "Cannot reserve space for the new set element."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IGRAPH_CHECK_OOM
already does the NULL
-check, so the if
is unnecessary.
Just letting everyone know that I haven't given up on this. I got a job recently and I need to write a report to participate in a hackathon by this Wednesday. I will profile the code and upload the results here probably by Sunday, latest by next Wednesday. Then continue on improving the timing. |
Hi @szhorvat @ntamas I ran the analysis and here are the results. Here is the time taken by the igraph set implementation
Here is the time taken by the simple rb tree set implementation. The commit "set constructor error t"
Next is the pool implementation here. Lastly there is the multi pool implementation which is 5 times slower than the pool implementation. The multi pool implementation works by making a new pool rather than reallocating the old pool and updating the addresses. I don't understand why it is spending 60% of the time in reserve function, so its possible there's some bug there that I am not seeing(all the tests are passing though). Here's something that instantly jumped out to me. In the simple implementation most time is wasted just deallocating the tree(44.21%). So if it is somehow possible to start a task asynchronously it will exactly be only slightly slower than the current implementation. Does igraph have a standard for threading? If yes, by making the contains function iterative instead of recursive and making the deletion asynchronous, I feel it should be as fast as the current implementation. If not can we please just implement it alongside the current set implementation as igraph_rbtree_t? Developers can opt in to use it in case they need the iterator to behave like c++ stl iterators. |
No; igraph is inherently single-threaded at the moment and there are no immediate plans to add support for threading in general. Also note that many host languages that igraph embeds itself in (Python, Mathematica and R being the most prominent) are also single-threaded so it's not clear yet how much we could gain by embracing the additional complexity that comes with threads. So far we've been using OpenMP in places in the code where we could immediately benefit from parallelizing a for loop, but your use-case does not seem to fit the bill.
I don't have a strong opinion here but I'm slightly leaning towards saying no. igraph's goal is not to provide a full general-purposes data structure library for C. If there is a place in igraph's current code where we could benefit from using a red-black tree instead of sets, I'm okay with adding such a data type, but adding a red-black tree just for the sake of having it means more maintenance burden on our end for little gain. I'm pretty sure that if someone badly needs a red-black tree in C, there are plenty of open-source libraries that already provide that and it's unlikely we can do any better than those without putting significant effort in. (Opinions @szhorvat ?). |
@ntamas I have implemented the indexed pool igraph as you suggested. It is exactly as slow as the regular pool implementation with an identical flame graph. I guess each calloc call takes less time when you ask for very few bytes since the OS doesn't have to move memory a bunch of memory.
But I do need the RB tree based sets to implement my ant colony graph coloring algorithm. The reason I cannot use the igraph set is because its iterator doesn't satisfy my use case: if an element is remove from the set the iterator will point to the wrong element, and I need to delete elements from the set while iterating over it. It seems we have 2 options, either implementing RB Tree as an igraph export or to use it inside the ant colony graph coloring algorithm file only. Please let me know which one you would like me to proceed with. |
I see three possible ways forward:
|
I have tried 3 other method to make the set faster(5 total now). So far the simplest implementation is still the fastest. I can't make much sense of the flame graphs. I used them, hypothesized what might be causing the problem, tried to resolve it with another method and failed. Right now my current hypothesis on why its slow is that the degree sequence game test make a lot (at least 100 if not thousand) individual sets of size 1 and 2 and then discards them, which is why the more complex tree implementation is slower. Since we are basically comparing how long it takes to allocate the 24 bytes for the current implementation vs the 32 bytes of the tree implementation times 1000. I will implement the std::set wrapper and bench that. Just writing this comment to update you on the progress. |
Hey I haven't been able to figure out how to integrate C++ into the project. Can you please point me in the right direction on how I can go about doing it. Thanks. |
I have written the set api to use Red and Black tree, but the degree sequence game internally relies on the logic that a set iterator will be an int. It an easy fix. Another 2 integers to keep track of number of elements iterated over. I'll update it soon