Estimating BinderHub cluster size #60

Open
TomasBeuzen opened this issue Sep 18, 2020 · 0 comments

I'm trying to get a good estimate of the required cluster size using this guide in Z2JH.

Here are my assumptions:

Memory

  • Max users = 50
  • Max expected concurrent users = 60% * max users = 30 (it's unlikely that everyone will be using it at the same time)
  • Expected memory usage per user:
    • I used nbresuse to estimate a user's memory usage in the notebook.
    • A notebook by itself is about 120 MB. I tried to take it to the extreme, executing all the code in multiple chapters and loading in plenty of datasets, and I was pushing ~300 MB of memory usage.
    • A single chapter was more commonly 100-200 MB (including data and plots).
    • Let's be conservative and assume 300 MB per user (we can downgrade in future).
    • If a user uses more than the available amount of memory, their notebook kernel will restart and memory will be flushed.

memory = max concurrent users * memory per user + 128 MB (for JH overhead) = 30 * 300 MB + 128 MB = 9128 MB ≈ 9 GB
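
The arithmetic above can be sketched as a small helper, so the estimate is easy to redo if the assumptions change. This is only a rough sketch: the function and parameter names are made up for illustration, and it treats 1 GB as 1000 MB.

```python
def estimate_memory_mb(max_users, concurrency, per_user_mb, overhead_mb=128):
    """Return (concurrent users, total memory in MB) for the cluster.

    All names here are hypothetical; the inputs are the assumptions
    listed in this issue, not measured values.
    """
    # Not everyone is online at once, so scale max users by a concurrency factor.
    concurrent_users = round(max_users * concurrency)
    # Per-user memory for every concurrent user, plus a fixed JupyterHub overhead.
    total_mb = concurrent_users * per_user_mb + overhead_mb
    return concurrent_users, total_mb


concurrent_users, total_mb = estimate_memory_mb(
    max_users=50, concurrency=0.60, per_user_mb=300
)
print(f"{concurrent_users} concurrent users, {total_mb} MB (~{total_mb / 1000:.1f} GB)")
```

Dropping `per_user_mb` to 200 (the upper end of a typical single chapter) would bring the total down to roughly 6 GB.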

CPU

  • This is harder to estimate but also less of an issue: if we're running low on CPU, things will just run slower but nothing will break.
  • I took a look at the JupyterHub Tiffany set up for MDS. It has had a peak CPU usage of just 5% since we started MDS, so it's obviously a very conservatively sized instance.
  • That JH is using an m5.12xlarge (specs are in the comparison table at the end).

Summary

To meet the memory and CPU requirements I'm going to start with 2 x m5.2xlarge instances (the cluster can scale to 4 if needed). I think this is conservative but we'll see. I'll report back.

Here's a comparison of the two instances I mentioned:

Instance      CPU   RAM   Memory (GB)
m5.2xlarge      8    37            32
m5.12xlarge    48   168           192
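
As a quick sanity check on the node count, here is a rough sketch comparing the ~9 GB memory estimate against the m5.2xlarge memory figure from the table. The helper name is hypothetical, 1 GB is treated as 1000 MB, and it ignores whatever memory the nodes reserve for the system and Kubernetes itself, so the real headroom will be somewhat smaller.

```python
import math

# Assumptions taken from the estimate in this issue.
REQUIRED_MB = 30 * 300 + 128      # 30 concurrent users * 300 MB + JH overhead
M5_2XLARGE_MEMORY_GB = 32         # memory per m5.2xlarge, from the table
PLANNED_NODES = 2                 # starting cluster size


def nodes_needed(required_mb, node_memory_gb):
    """Smallest number of nodes whose combined memory covers required_mb."""
    return math.ceil(required_mb / (node_memory_gb * 1000))


min_nodes = nodes_needed(REQUIRED_MB, M5_2XLARGE_MEMORY_GB)
headroom_gb = PLANNED_NODES * M5_2XLARGE_MEMORY_GB - REQUIRED_MB / 1000

print(f"minimum nodes for memory: {min_nodes}")
print(f"headroom with {PLANNED_NODES} nodes: {headroom_gb:.1f} GB")
```

On memory alone a single node would cover the estimate, which is why starting with 2 looks conservative.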