ocular: Basic example of routing using OpenCL #547

Draft: wants to merge 42 commits into master

@daveshah1 commented Dec 27, 2020
Contributor

This is still very experimental and needs another week or two of work, but I'm opening a PR in case anyone wants to give feedback on the current state or on the OpenCL integration.

Currently it can legally route some smaller ECP5 designs, with mediocre performance: about 20% slower than router1 on a GTX 960. Overhead is a big problem, as only half the runtime is spent in the router kernel. Larger designs currently hit big congestion issues (the termination criterion is too close to a BFS and doesn't account for the cost of the found solution) and also occasional illegal route tree failures (probably a race condition somewhere).
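To illustrate the termination problem mentioned above, here is a minimal CPU-side sketch in Python. It is not the OpenCL kernel from this PR; the graph, node names, and costs are hypothetical. It contrasts a BFS-like rule that stops at the first arrival at the sink with a Dijkstra-style rule that only accepts the sink once no cheaper queued entry remains.

```python
import heapq
from collections import deque


def route_first_arrival(graph, src, dst):
    """BFS-like search: returns the cost of the first path that reaches dst,
    regardless of how expensive (e.g. congested) that path is."""
    queue = deque([(src, 0)])
    seen = {src}
    while queue:
        node, cost = queue.popleft()
        if node == dst:
            return cost
        for nxt, weight in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, cost + weight))
    return None


def route_cost_aware(graph, src, dst):
    """Dijkstra-style search: dst is only accepted when it is the cheapest
    entry left in the queue, so the cost of the solution is accounted for."""
    best = {src: 0}
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        if node == dst:
            return cost
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


# Hypothetical routing graph: the direct wire A->D is heavily congested.
graph = {"A": [("D", 10), ("B", 1)], "B": [("C", 1)], "C": [("D", 1)]}
print(route_first_arrival(graph, "A", "D"))  # takes the congested direct wire
print(route_cost_aware(graph, "A", "D"))     # takes the cheaper three-hop detour
```

On this example the first-arrival rule returns the one-hop path through the congested wire, while the cost-aware rule finds the cheaper detour. A parallel GPU implementation would need a different mechanism to get the same effect.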

Things still TODO:

  • Improved termination/congestion handling and bounding box expansion
  • Remove duplicate queue entries
  • Implement the near/far pile separation to improve runtime (some parts of this are present but it's not being used yet)
  • Better selection options for OpenCL device
  • Finding ways to reduce overhead
  • Consider non-rectangular bounding boxes for long but low-fanout nets
  • Benchmark performance on an FPGA accelerator card using the Xilinx/Intel OpenCL stacks
  • Buy a better GPU ;)
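
As a rough illustration of the bounding box expansion item in the list above, the following Python sketch shows one possible scheme: route inside a rectangular box around the net's pins and grow the margin when routing fails. Everything here is an assumption for illustration (the function names, the margin-doubling policy, and the `try_route` callback); it does not reflect how this PR implements it.

```python
def net_bounding_box(pins, margin, width, height):
    """Rectangle (x0, y0, x1, y1) around all pins, grown by margin and
    clamped to a grid of the given width and height."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (
        max(min(xs) - margin, 0),
        max(min(ys) - margin, 0),
        min(max(xs) + margin, width - 1),
        min(max(ys) + margin, height - 1),
    )


def route_with_expansion(pins, try_route, width, height,
                         start_margin=1, max_margin=8):
    """Call try_route(box) with progressively larger boxes.

    try_route is a hypothetical callback that returns True when the net
    could be routed inside the box. Returns the first box that worked,
    or None if even the largest margin failed.
    """
    margin = start_margin
    while margin <= max_margin:
        box = net_bounding_box(pins, margin, width, height)
        if try_route(box):
            return box
        margin *= 2  # assumed policy: double the margin on each failure
    return None


# Example: a net that only becomes routable once column 0 is inside the box.
pins = [(2, 3), (5, 4)]
needs_column_zero = lambda box: box[0] == 0
print(route_with_expansion(pins, needs_column_zero, width=10, height=10))
```

A rectangle like this wastes area on long, low-fanout nets, which is the motivation for the non-rectangular boxes mentioned in the list.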

Signed-off-by: David Shah <dave@ds0.me>
@pepijndevos
Member

Very cool!

My limited understanding of OpenCL programming is that it's easy to write bad OpenCL and hard to write the good stuff, mostly because of memory locality and the ideal amount of parallelization depending on the hardware.

My only experience with GPGPU programming is using Futhark, which aims to take care of these details, and ideally allow you to write straightforward functional code and get optimized OpenCL out. So I admire your courage to do it the hard way :)

You mentioned FPGA accelerator cards as an OpenCL target, which is super interesting and appropriate for an FPGA tool haha. I asked the author of Futhark if he had considered supporting FPGAs, but he said the trade-offs for good performance are very different between GPUs and FPGAs, so I'm curious to see whether you'll manage to get good performance out of both.

@rowanG077
Contributor

+1 on Futhark. It makes it very easy to get relatively high-quality GPGPU code.

4 participants