Reduce Scatter Block issues at large scale #20

Open
wckzhang opened this issue May 19, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@wckzhang
Contributor

Reduce Scatter Block hits a lot of failures, such as running out of memory, at 128+ ranks. Need to figure out a way to handle this. It may be possible to just ignore the failures and revamp the parsing code a little. Needs more thought.

@wckzhang wckzhang added the bug Something isn't working label May 19, 2020
@wckzhang wckzhang self-assigned this May 19, 2020
@wckzhang wckzhang added enhancement New feature or request and removed bug Something isn't working labels May 19, 2020
@wckzhang
Contributor Author

This is more of a robustness issue. It's non-critical, but it's annoying since I have to juggle two datasets: one with reduce_scatter_block and one with every other collective.
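A minimal sketch of what "ignore the failures" could look like on the parsing side, so that reduce_scatter_block results could live in the same dataset as the other collectives. The input format here (`collective,ranks,latency_us` per line) and the names `Sample` and `parse_results` are assumptions made up for illustration; the project's actual output format and parser are not shown in this thread.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One successfully parsed measurement (hypothetical record layout)."""
    collective: str
    ranks: int
    latency_us: float


def parse_results(lines: Iterable[str]) -> Tuple[List[Sample], List[str]]:
    """Parse 'collective,ranks,latency_us' lines (an assumed format).

    Lines that cannot be parsed, for example because a run failed and left
    an error marker where a number was expected, are collected in a separate
    list instead of aborting the whole parse.
    """
    samples: List[Sample] = []
    skipped: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information
        parts = line.split(",")
        try:
            if len(parts) != 3:
                raise ValueError("expected 3 comma-separated fields")
            samples.append(Sample(parts[0], int(parts[1]), float(parts[2])))
        except ValueError:
            # Keep the raw line so failures can still be counted or inspected.
            skipped.append(line)
    return samples, skipped


if __name__ == "__main__":
    good, bad = parse_results([
        "allreduce,128,12.5",
        "reduce_scatter_block,128,OOM",
    ])
    print(len(good), "parsed;", len(bad), "skipped")
```

Returning the skipped lines rather than silently dropping them keeps the failure rate visible, which matters if the out-of-memory runs cluster at particular rank counts.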
