
Princeton GPU hackathon 2022 wrap-up #325

Open · 2 tasks

FlorianReiss opened this issue Jun 9, 2022 · 1 comment

FlorianReiss (Member) commented Jun 9, 2022
During the hackathon, several issues related to running on GPUs were understood:

Stack overflow

Some functions, in particular the kMatrix, allocate a lot of memory on the stack, which can exceed the default limit. Possible solutions are more memory-friendly implementations, using malloc (usually slower), or increasing the stack size with cuda_error_check(cudaDeviceSetLimit(cudaLimitStackSize, 1024*50));

related issues #287 #242
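The stack-size workaround above can be sketched as follows. This is a minimal standalone illustration, not the actual GooFit code; the error-checking macro is a stand-in for GooFit's cuda_error_check, and 1024*50 is the value quoted above:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in for GooFit's cuda_error_check: report any CUDA runtime error.
#define cuda_error_check(call)                                              \
    do {                                                                    \
        cudaError_t err = (call);                                           \
        if(err != cudaSuccess)                                              \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));   \
    } while(0)

int main() {
    // The default per-thread stack is small (typically 1 KiB), so kernels
    // with large local arrays (e.g. the kMatrix amplitude) can overflow it.
    // Raise the limit to 50 KiB before any such kernel launch.
    cuda_error_check(cudaDeviceSetLimit(cudaLimitStackSize, 1024 * 50));

    // Read the limit back to confirm the setting took effect.
    size_t limit = 0;
    cuda_error_check(cudaDeviceGetLimit(&limit, cudaLimitStackSize));
    printf("stack limit per thread: %zu bytes\n", limit);
    return 0;
}
```

Note that cudaDeviceSetLimit must be called before the kernels that need the larger stack; raising the stack size costs device memory per resident thread, which is why the more memory-friendly implementations remain preferable.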

Using RO_CACHE on stack memory

It is not allowed to use RO_CACHE/__ldg on stack memory.

related issues #216 #217

see MR #328
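The constraint can be sketched as below. This is an illustrative fragment, not GooFit code; RO_CACHE is assumed here to expand to __ldg, which only accepts addresses in global memory:

```cuda
// Assumed definition: RO_CACHE reads through the read-only (texture) cache
// via __ldg. __ldg requires a pointer into global memory.
#define RO_CACHE(x) __ldg(&(x))

__global__ void kernel(const double *global_data, double *out) {
    double local[4]; // lives on the thread's stack (local memory)
    local[0] = global_data[0];

    out[0] = RO_CACHE(global_data[1]); // OK: global-memory address
    // out[1] = RO_CACHE(local[0]);    // NOT allowed: stack/local address;
    //                                 // reading it through __ldg is invalid
    out[1] = local[0];                 // plain access is fine for stack data
}
```

In short, the fix is to apply RO_CACHE only to data that genuinely resides in global memory and to use plain loads for stack-allocated values.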

To do

  • finalize and merge fixes
  • document everything learned
FlorianReiss (Member, Author) commented Jun 9, 2022

@thboettc @henryiii @JuanBSLeite just a little summary to keep track of the things we fixed or should be able to fix after the hackathon. Feel free to edit
