Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft PR] Parallel Utils #3464

Draft
wants to merge 25 commits into
base: graph-function
Choose a base branch
from
Draft

[Draft PR] Parallel Utils #3464

wants to merge 25 commits into from

Conversation

anuchak
Copy link
Collaborator

@anuchak anuchak commented May 8, 2024

This is a draft PR for the Parallel Utils interface to be used for Algorithm library.
I am reusing some of the demo functions added by Xiyang. Since the front end interface is not ready, I have a defined an operator AlgorithmRunner that orchestrates execution of the function.

Main files added for parallel utils:

(1) parallel_utils.cpp
(2) algorithm_runner.cpp
(3) graph_algorithms.cpp
(4) shortest_path.cpp

  • There is a main graph_algorithm struct defined and all algorithms will override the compute() function. The compute function gets called from the AlgorithmRunner operator at runtime.
  • Currently TableFunctions store a reference to a "tableFunc" that gets executed at runtime (like a function pointer).
  • I have extended this to support a list of functions that need to get executed in parallel (for eg. in shortest_path algorithm, there are two functions extendFrontierFunc and shortestPathOutputFunc for frontier extension and parallel output writing executing in parallel).
  • AlgorithmRunner is registered as a single threaded task, the first thread to register itself is the master that orchestrates the execution of the graph algorithm. It will call graphAlgorithm→compute()
  • ParallelUtils is defined inside AlgorithmRunner and gets invoked by the master thread to submit tasks to the task queue. To submit a task, doParallel() is called with a copy of the Sink operator (in this case AlgorithmRunner).
  • The other threads act as workers and pick up the new task from the queue. Since they need access to local state (value vectors, ftable), unlike the master they will do initLocalState.
  • runWorker() method is reserved for worker threads, the functions are invoked from here. The return result is the no. of tuples processed by the thread, if it is non-zero then the results need to be merged to the local FTable. Once a function returns 0, it indicates no more morsels remain. The local FTable's results are merged to the global FTable.

Parallelization Results (tested on spotify dataset)

Query - CALL shortest_path('spotify_n', 'spotify_e', 1, 30, 85) RETURN dst_offset, path_length;
Results - 3,592,911 tuples (unweighted bfs from node offset 85 to all nodes in the graph as destinations)

Threads Runtime (in seconds)
1 16.91
32 2.15

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant