Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RBTree for set #2313

Open
wants to merge 45 commits into
base: master
Choose a base branch
from
Open

RBTree for set #2313

wants to merge 45 commits into from

Conversation

professorcode1
Copy link
Contributor

I have written the set api to use Red and Black tree, but the degree sequence game internally relies on the logic that a set iterator will be an int. It an easy fix. Another 2 integers to keep track of number of elements iterated over. I'll update it soon

@codecov
Copy link

codecov bot commented Feb 26, 2023

Codecov Report

Merging #2313 (5736bf6) into master (1d0ea4f) will decrease coverage by 0.05%.
The diff coverage is 82.55%.

❗ Current head 5736bf6 differs from pull request most recent head f937448. Consider uploading reports for the commit f937448 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2313      +/-   ##
==========================================
- Coverage   83.53%   83.48%   -0.05%     
==========================================
  Files         376      376              
  Lines       61640    61732      +92     
==========================================
+ Hits        51489    51535      +46     
- Misses      10151    10197      +46     
Impacted Files Coverage Δ
src/core/set.c 81.56% <80.52%> (-13.34%) ⬇️
src/cliques/cliques.c 95.74% <100.00%> (ø)
src/games/degree_sequence.c 98.49% <100.00%> (-0.01%) ⬇️
src/operators/subgraph.c 94.52% <100.00%> (ø)

... and 25 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d0ea4f...f937448. Read the comment docs.

@professorcode1 professorcode1 marked this pull request as ready for review February 26, 2023 18:47
@szhorvat
Copy link
Member

Unfortunately I won't have time to look at this for a while.

Whoever does the review, please check that there is no performance degradation in the IGRAPH_DEGSEQ_CONFIGURATION_SIMPLE method of igraph_degree_sequence_game(), which relies heavily on sets.

@ntamas ntamas self-assigned this Feb 26, 2023
@ntamas
Copy link
Member

ntamas commented Feb 26, 2023

I'll take a look at it early next week.

@szhorvat
Copy link
Member

szhorvat commented Feb 26, 2023

Benchmark suggestion:

Create a random graph, record its degree sequence, use that in igraph_degree_sequence_game(). Note that the timings will strongly depend on the specific degree sequence, so set a random seed!

Undirected suggestion:

  • Barabasi–Albert game, n=100 (tune n as needed), m=2
  • G(n,m), n=100, m=300-ish (tune m as needed, note that running time explodes in terms of m)
  • Regular graph with degree of ~7

Directed suggestion:

  • igraph_static_power_law_game, n=100, m=300-ish (tune this), both exponents set to 2.1
  • G(n,m), n=100, m=400-450-ish (tune m as needed, note that running time explodes in terms of m)
  • Regular graph with degree of ~5

@ntamas
Copy link
Member

ntamas commented Mar 7, 2023

FYI, this is not forgotten, I've started writing the benchmark, hopefully it will be ready today.

@ntamas
Copy link
Member

ntamas commented Mar 7, 2023

Benchmark results are in; it looks like the proposal in this PR did not improve performance, although there's probably still room for optimization.

Old version:

igraph_set_benchmark_old

Proposal in this PR:

image

Compiled in release mode, with LTO. I don't have time to debug this further, but it's probably worth looking at the code with a profiler.

@ntamas
Copy link
Member

ntamas commented Mar 7, 2023

One caveat with the benchmark above is that the implementation of igraph_degree_sequence_game() seems to be very sensitive on whether the algorithm gets "lucky" and finds an appropriate wiring of the graph with no multiple edges. The timings on the same graph, with the same algorithm, vary wildly depending on the random seed being used. There's a slim chance that changing the set implementation to RB-trees also changes the generated random numbers in the algorithm and the differences that we see are not due to the performance of the RB-trees but due to the fact that the algorithm spends more time trying to find an adequate wiring of the graph. We should probably count how many times igraph_set_contains() (or some other crucial set function) is called in the benchmarks; if there's a huge difference, this could mean that we are not testing the same thing in the two benchmarks even if the random seed and everything else is the same. Does the RB tree uses randomness anywhere?

src/core/set.c Outdated
Comment on lines 323 to 325
if(newNode == NULL){
IGRAPH_CHECK_OOM(newNode, "Cannot reserve space for the new set element.");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IGRAPH_CHECK_OOM already does the NULL-check, so the if is unnecessary.

@professorcode1
Copy link
Contributor Author

Just letting everyone know that I haven't given up on this. I got a job recently and I need to write a report to participate in a hackathon by this Wednesday. I will profile the code and upload the results here probably by Sunday, latest by next Wednesday. Then continue on improving the timing.

@professorcode1
Copy link
Contributor Author

Hi @szhorvat @ntamas
I am really sorry it took so long to do this. I am still adjusting to the new job life.

I ran the analysis and here are the results.
There are screenshots of the times and flame-graphs for the profiling.

Here is the time taken by the igraph set implementation

1 Degree sequence of undirected k-regular graph, N=1000, k=7, CONFIGURATION_SIMPLE
1.3s 1.2s 0s
2 Degree sequence of undirected BA graph, N=1000, m=1, CONFIGURATION_SIMPLE
0.789s 0.728s 0s
3 Degree sequence of undirected Erdos-Renyi graph, N=150, m=u450, CONFIGURATION_SIMPLE
0.804s 0.742s 0s
4 Degree sequence of GRG graph, N=10000, r=0.013, CONFIGURATION_SIMPLE
0.156s 0.144s 0s
5 Degree sequence of directed k-regular graph, N=1000, k=5, CONFIGURATION_SIMPLE
0.865s 0.798s 0s
6 Degree sequence of directed BA graph, N=500, m=2, CONFIGURATION_SIMPLE
2.22s 2.05s 0s
7 Degree sequence of directed Erdos—Renyi graph, N=15000, m=u45000, CONFIGURATION_SIMPLE
0.653s 0.603s 0s

Here is the time taken by the simple rb tree set implementation. The commit "set constructor error t"

  1. 2.400s 2.400s 0s
  2. 1.680s 1.680s 0s
  3. 1.460s 1.460s 0s
  4. 0.294s 0.294s 0s
  5. 1.610s 1.610s 0s
  6. 4.610s 4.610s 0s
  7. 1.290s 1.290s 0s

Next is the pool implementation here.
It is roughly 3 times slower than the simple rb tree implementation.

Lastly there is the multi pool implementation which is 5 times slower than the pool implementation. The multi pool implementation works by making a new pool rather than reallocating the old pool and updating the addresses. I don't understand why it is spending 60% of the time in reserve function, so its possible there's some bug there that I am not seeing(all the tests are passing though).

Here's something that instantly jumped out to me. In the simple implementation most time is wasted just deallocating the tree(44.21%). So if it is somehow possible to start a task asynchronously it will exactly be only slightly slower than the current implementation. Does igraph have a standard for threading? If yes, by making the contains function iterative instead of recursive and making the deletion asynchronous, I feel it should be as fast as the current implementation.

If not can we please just implement it alongside the current set implementation as igraph_rbtree_t? Developers can opt in to use it in case they need the iterator to behave like c++ stl iterators.

@ntamas
Copy link
Member

ntamas commented May 15, 2023

Does igraph have a standard for threading?

No; igraph is inherently single-threaded at the moment and there are no immediate plans to add support for threading in general. Also note that many host languages that igraph embeds itself in (Python, Mathematica and R being the most prominent) are also single-threaded so it's not clear yet how much we could gain by embracing the additional complexity that comes with threads. So far we've been using OpenMP in places in the code where we could immediately benefit from parallelizing a for loop, but your use-case does not seem to fit the bill.

If not can we please just implement it alongside the current set implementation as igraph_rbtree_t?

I don't have a strong opinion here but I'm slightly leaning towards saying no. igraph's goal is not to provide a full general-purposes data structure library for C. If there is a place in igraph's current code where we could benefit from using a red-black tree instead of sets, I'm okay with adding such a data type, but adding a red-black tree just for the sake of having it means more maintenance burden on our end for little gain. I'm pretty sure that if someone badly needs a red-black tree in C, there are plenty of open-source libraries that already provide that and it's unlikely we can do any better than those without putting significant effort in. (Opinions @szhorvat ?).

@professorcode1
Copy link
Contributor Author

@ntamas I have implemented the indexed pool igraph as you suggested. It is exactly as slow as the regular pool implementation with an identical flame graph. I guess each calloc call takes less time when you ask for very few bytes since the OS doesn't have to move memory a bunch of memory.

If there is a place in igraph's current code where we could benefit from using a red-black tree instead of sets, I'm okay with adding such a data type, but adding a red-black tree just for the sake of having it means more maintenance burden on our end for little gain.

But I do need the RB tree based sets to implement my ant colony graph coloring algorithm. The reason I cannot use the igraph set is because its iterator doesn't satisfy my use case: if an element is remove from the set the iterator will point to the wrong element, and I need to delete elements from the set while iterating over it.

It seems we have 2 options, either implementing RB Tree as an igraph export or to use it inside the ant colony graph coloring algorithm file only. Please let me know which one you would like me to proceed with.

@ntamas
Copy link
Member

ntamas commented May 16, 2023

But I do need the RB tree based sets to implement my ant colony graph coloring algorithm.

I see three possible ways forward:

  1. Just start using std::set in your own implementation of the ant colony graph coloring algorithm, and we can accept that this will be a small part of the igraph library that uses C++. As long as you expose a C API, I think it's okay to use C++ internally. There are already other parts in the igraph library that use C++, but they expose a C API so they can be used from pure C programs.

  2. If you can make your igraph_rbtree_t ipmlementation more performant than an std::set, you can implement your own in the context of the ant colony optimization PR and we can merge that as is. igraph_rbtree_t might or might not be exposed as a public API, we need to decide that together with the dev team.

  3. If you do not want to use std::set but you are not confident that you can make your igraph_rbtree_t at least as performant as an std::set, you can probably make use of the uthash library. This is a hash table so it's suitable to be used as a set as long as the items are hashable, it's pure C, it's header-only, and it supports deletion-safe iteration, so it probably fits your use-case.

@professorcode1
Copy link
Contributor Author

I have tried 3 other method to make the set faster(5 total now). So far the simplest implementation is still the fastest.

I can't make much sense of the flame graphs. I used them, hypothesized what might be causing the problem, tried to resolve it with another method and failed.

Right now my current hypothesis on why its slow is that the degree sequence game test make a lot (at least 100 if not thousand) individual sets of size 1 and 2 and then discards them, which is why the more complex tree implementation is slower. Since we are basically comparing how long it takes to allocate the 24 bytes for the current implementation vs the 32 bytes of the tree implementation times 1000.

I will implement the std::set wrapper and bench that. Just writing this comment to update you on the progress.

@professorcode1
Copy link
Contributor Author

Hey

I haven't been able to figure out how to integrate C++ into the project. Can you please point me in the right direction on how I can go about doing it.

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants