
[P2] Multi-GPU model sharding with intervening evaluation and training #54

Open
frankaging opened this issue Jan 15, 2024 · 0 comments

@frankaging
Collaborator

Description:

The library has not been tested with multi-GPU use cases: we currently assume the intervening model fits on a single GPU. That is not workable for interventions on 70B-parameter models, for instance, so we want to be able to shard the model across multiple GPUs.
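For context, sharded loading at inference time can be done with Hugging Face transformers plus accelerate's `device_map="auto"`. The sketch below is only illustrative; the checkpoint name and dtype are placeholders, not something the library is claimed to support today:

```python
# Sketch: shard a large causal LM across all visible GPUs via accelerate's
# device_map="auto". Checkpoint name and dtype are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-70b-hf"  # hypothetical example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # accelerate spreads layers across available GPUs
)

# With sharding, different decoder layers can live on different devices,
# e.g. {"model.layers.0": 0, ..., "model.layers.79": 3}.
print(model.hf_device_map)
```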

Static interventions need to be attached to the right component on the right device when the model is sharded. Trainable interventions likewise need to be mapped onto the device where the corresponding model component lives.

This could be a large task. The first step is clear: try out static interventions (e.g., vanilla interventions) at inference time when the model is sharded across multiple GPUs; a rough sketch of that step is below.
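As an illustration of that first step (this is not pyvene's actual API, just a generic PyTorch forward-hook sketch), the main thing a static intervention has to get right under sharding is moving its source activation onto whatever device the hooked component's output actually lives on:

```python
# Sketch (not pyvene's API): a "vanilla" static intervention as a plain
# PyTorch forward hook, written to be safe under model sharding.
import torch

def make_static_intervention(source_activation: torch.Tensor, positions):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        # Move the intervention to the shard that produced this activation.
        src = source_activation.to(device=hidden.device, dtype=hidden.dtype)
        hidden[:, positions] = src
        if isinstance(output, tuple):
            return (hidden,) + output[1:]
        return hidden
    return hook

# Hypothetical usage: intervene on decoder layer 40, wherever accelerate
# happened to place it.
# layer = model.model.layers[40]
# handle = layer.register_forward_hook(make_static_intervention(src_act, [0]))
# ... run a forward pass or model.generate(...) ...
# handle.remove()
```

Trainable interventions would additionally need their parameters (and optimizer state) placed on the same device as the component they attach to.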

@frankaging frankaging changed the title [P2] Multi-GPU intervening evaluation and training [P2] Multi-GPU model sharing with intervening evaluation and training Jan 17, 2024
@frankaging frankaging changed the title [P2] Multi-GPU model sharing with intervening evaluation and training [P2] Multi-GPU model sharding with intervening evaluation and training Jan 17, 2024