
Inference Experiments #58

Open

JMMackenzie opened this issue Mar 11, 2024 · 2 comments

Comments

@JMMackenzie (Contributor)

Hey all,

I'm looking at the Efficiency Study paper and I'd like to replicate the query encoding numbers. Could you please share your pipeline, or any other pointers, so I can make sure my measurement is correct?

Thanks a lot!

@carlos-lassance

Hey Joel,

So at the time, I think I basically tokenized everything up front (without counting the tokenization time) and then ran inference one query at a time on a single CPU core (set via SLURM). I can try spinning up a similar pipeline, but I'd love to hear your thoughts.
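
For reference, a minimal sketch of what that single-core, one-query-at-a-time measurement might look like. The model checkpoint and queries below are placeholders, not necessarily the exact ones used in the paper:

```python
import time

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

torch.set_num_threads(1)  # restrict PyTorch to a single CPU core
                          # (the description above pinned this via SLURM)

model_name = "naver/splade-cocondenser-ensembledistil"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

queries = ["what is the capital of france", "how do transformers work"]

# Tokenize everything up front so tokenization is excluded from the timing.
encoded = [tokenizer(q, return_tensors="pt") for q in queries]

latencies = []
with torch.no_grad():
    for inputs in encoded:
        start = time.perf_counter()
        model(**inputs)  # forward pass only, one query at a time
        latencies.append(time.perf_counter() - start)

print(f"mean latency: {1000 * sum(latencies) / len(latencies):.2f} ms/query")
```

In practice you would run this over the full query set and add a few warm-up queries before timing, but the core idea is the same: batch size 1, one thread, tokenization excluded.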

For later papers I started using a benchmarking tool from Hugging Face, but I can't find it right now. I can dig deeper if needed.

@JMMackenzie (Contributor, Author)

Thanks Carlos! I was looking for the test setup you used to measure query inference latency so I could replicate your numbers, but I think I can manage without it if you don't have it. I'll have a dig on HF and see if I can find anything. Thanks again!
