Estimating BinderHub cluster size #60

TomasBeuzen · 2020-09-18T21:53:43Z

I'm trying to get a good estimate of the required cluster size using this guide in Z2JH.

Here are my assumptions:

Memory

Max users = 50
Max expected concurrent users = 60% * max users = 30 (it's unlikely that everyone will be using it at the same time)
Expected memory usage per user:
- I used nbresuse to estimate a user's memory usage in the notebook.
- A notebook by itself is about 120 MB. I tried to take it to the extreme, executing all the code in multiple chapters and loading in plenty of datasets, and was pushing ~300 MB memory usage.
- A single chapter was more commonly 100-200 MB (including data and plots).
- Let's be conservative and assume 300 MB (we can downgrade in future).
- If a user uses more than the available amount of memory, their notebook kernel will restart and memory will be flushed.

memory = max concurrent users * memory per user + 128 MB (for JH overhead) = 30 * 300 MB + 128 MB = 9,128 MB = ~9 GB
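The arithmetic above can be sketched as a small Python helper, which makes it easy to re-run the estimate if the assumptions change. The function and parameter names here are made up for illustration; only the numbers (50 users, 60% concurrency, 300 MB per user, 128 MB overhead) come from the estimate above.

```python
import math

def required_memory_mb(max_users, concurrency_pct, per_user_mb, hub_overhead_mb=128):
    """Return (concurrent_users, total_mb) for the sizing formula above.

    All names are hypothetical; this only mirrors the arithmetic in the issue.
    """
    # Round up so a fractional user still reserves a full slot.
    concurrent_users = math.ceil(max_users * concurrency_pct / 100)
    total_mb = concurrent_users * per_user_mb + hub_overhead_mb
    return concurrent_users, total_mb

users, total_mb = required_memory_mb(max_users=50, concurrency_pct=60, per_user_mb=300)
print(f"{users} concurrent users -> {total_mb} MB (~{total_mb / 1000:.1f} GB)")
# prints: 30 concurrent users -> 9128 MB (~9.1 GB)
```

Dropping `per_user_mb` to 200 (the more typical single-chapter figure) would bring the total down to 6,128 MB, so the 300 MB assumption leaves a fair amount of headroom.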

CPU

This is harder to estimate but also less of an issue: if we're running low on CPU, things will just run slower but nothing will break.
I took a look at the JupyterHub Tiffany set up for MDS, and it has had a peak usage of just 5% since we started MDS, so it's obviously a very conservative instance.
That JupyterHub is running on an m5.12xlarge.

Summary

To meet memory and CPU requirements, I'm going to start with 2 x m5.2xlarge instances (the cluster can scale to 4 if needed). I think this is conservative, but we'll see. I'll report back.

Here's a comparison of the two instances I mentioned:

Instance	vCPU	ECU	Memory (GB)
m5.2xlarge	8	37	32
m5.12xlarge	48	168	192
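As a rough sanity check on the instance choice, here is a minimal sketch that converts the ~9 GB memory estimate into a node count for each instance type. The memory figures come from the table above; the `usable_fraction` of 0.8 is purely an assumption (some of each node's memory goes to the OS and Kubernetes system pods), and the function name is made up for illustration.

```python
import math

# Memory per node in GB, taken from the comparison table above.
INSTANCE_MEMORY_GB = {"m5.2xlarge": 32, "m5.12xlarge": 192}

def nodes_needed(required_gb, instance, usable_fraction=0.8):
    """Nodes needed if only `usable_fraction` of node memory is schedulable.

    `usable_fraction` is an assumed placeholder, not a measured value.
    """
    usable_gb = INSTANCE_MEMORY_GB[instance] * usable_fraction
    return math.ceil(required_gb / usable_gb)

required_gb = 9.128  # 30 users * 300 MB + 128 MB overhead, from the memory estimate
for name in INSTANCE_MEMORY_GB:
    print(name, nodes_needed(required_gb, name))
```

Under these assumptions a single m5.2xlarge already covers the memory estimate, which is consistent with 2 nodes being a conservative starting point.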
