
Can I use multiple GPUs? #31

Open

magicleo opened this issue May 18, 2023 · 2 comments

Comments


magicleo commented May 18, 2023

I have 2 GPUs; each has 24 GB of memory.
When I run the code below:

```python
model = SentenceTransformerSpecb(
    "bigscience/sgpt-bloom-7b1-msmarco",
    cache_folder="/mnt/storage/agtech/modelCache",
)
query_embeddings = model.encode(queries, is_query=True)
```

I get an OutOfMemoryError; it only uses the first GPU. Can the model be loaded across the two GPUs?

```
OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 22.03 GiB total capacity; 21.27 GiB
already allocated; 50.94 MiB free; 21.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated
memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and
PYTORCH_CUDA_ALLOC_CONF
```
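For context on why a single 24 GB card overflows here, a back-of-envelope estimate of the weight memory alone (the ~7.1B parameter count is assumed from the "7b1" in the model name; the exact figure may differ slightly):

```python
# Rough memory estimate for a ~7.1B-parameter model such as
# bigscience/sgpt-bloom-7b1-msmarco (parameter count assumed from the name).
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory needed just for the weights, in GiB."""
    return n_params * bytes_per_param / 1024**3

params = 7.1e9
fp32 = weight_memory_gib(params, 4)  # float32: 4 bytes per parameter
fp16 = weight_memory_gib(params, 2)  # float16/bfloat16: 2 bytes per parameter

print(f"fp32 weights: {fp32:.1f} GiB")  # ~26.4 GiB: exceeds the 22.03 GiB usable
print(f"fp16 weights: {fp16:.1f} GiB")  # ~13.2 GiB: would fit on one 24 GB card
```

So in full precision the weights alone exceed one card's capacity before any activations are allocated, which matches the traceback above; loading in half precision or splitting across both GPUs are the two ways out.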

@Muennighoff
Owner

For inference, you can use accelerate for that, I think; check huggingface/accelerate#769

Author

magicleo commented May 18, 2023

@Muennighoff Thank you very much for your reply.
I tried code like the below:

```python
model = SentenceTransformerSpecb(
    "bigscience/sgpt-bloom-7b1-msmarco",
    cache_folder="/mnt/storage/agtech/modelCache",
)
accelerator = Accelerator()
model = accelerator.prepare(model)
```

When running `model = accelerator.prepare(model)` I get CUDA out of memory; it still only uses the first GPU.
Any suggestions?
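For context: `Accelerator().prepare()` is built for distributed data-parallel training, where each process gets a full copy of the model on its own device, so it still needs the whole model to fit on GPU 0. Splitting one model across both cards is instead done at load time via a `device_map` (what the linked accelerate issue describes), either `device_map="auto"` or an explicit mapping of module names to device indices. A minimal sketch of how a balanced explicit map could be built for a BLOOM-style model; the 30-layer count for bloom-7b1 and the `transformer.h.N` module names are assumptions based on BLOOM's architecture:

```python
# Hypothetical sketch: split a BLOOM-style model's transformer blocks
# evenly across two GPUs by building an explicit device_map dict.
def make_device_map(n_layers: int, n_gpus: int) -> dict:
    device_map = {
        "transformer.word_embeddings": 0,
        "transformer.word_embeddings_layernorm": 0,
        "transformer.ln_f": n_gpus - 1,  # final layernorm on the last GPU
    }
    per_gpu = (n_layers + n_gpus - 1) // n_gpus  # ceil division
    for i in range(n_layers):
        # Assign blocks 0..per_gpu-1 to GPU 0, the next chunk to GPU 1, etc.
        device_map[f"transformer.h.{i}"] = min(i // per_gpu, n_gpus - 1)
    return device_map

device_map = make_device_map(30, 2)
# Such a dict (or simply device_map="auto") is what
# transformers' from_pretrained accepts, e.g.:
#   AutoModel.from_pretrained("bigscience/sgpt-bloom-7b1-msmarco",
#                             device_map=device_map, torch_dtype=torch.float16)
```

Since `SentenceTransformerSpecb` constructs the underlying transformer internally, the `device_map` has to reach that inner `from_pretrained` call; whether the wrapper forwards such arguments is worth checking in the SGPT repo.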
