Opt examples run significantly slower than CUDA solver #147

Open
gerwang opened this issue Apr 18, 2019 · 5 comments

Comments

@gerwang

gerwang commented Apr 18, 2019

I am using Opt with VS2015 and LLVM 6.0.1 on Windows. When I run the examples, the Opt solver is significantly slower than the corresponding CUDA solver, but the paper claims that Opt should be faster. Is this an issue with my platform, or is this the expected behavior?

@Mx7f
Collaborator

Mx7f commented Apr 30, 2019

What version of CUDA are you using? The values in the paper were generated with VS2013, CUDA 7.5, and LLVM 3.8.

@gerwang
Author

gerwang commented May 1, 2019

I am using CUDA 10.0. I noticed that CUDA 10.0 should generate PTX 6.3, whereas my LLVM actually supports PTX 6.0 (I manually applied the commit at https://marc.info/?l=llvm-commits&m=153783073315460&w=2 to get my LLVM to work). Is that a problem?

@Mx7f
Collaborator

Mx7f commented Jun 18, 2019

Sorry for the delay on this. I will be trying to upgrade my personal development machines to CUDA 10 this month, so hopefully I can give you a real answer soon.

@mfratarcangeli

I have the same problem. I compiled Terra from source using LLVM 6.0.1, on Windows 10 with CUDA 10.1. The CUDA solver runs ~20x faster than Opt.

@ProfFan

ProfFan commented Dec 24, 2019

Hi, @marfr960

Can you try using terra master with LLVM 7?

Fan
