Apply suggestions from code review

Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
Author: Min RK
Date: 2022-09-07 11:55:33 +02:00 (committed by GitHub)
Parent: 1e9614b218
Commit: bf6786c55b


@@ -40,17 +40,19 @@ The rest is going to be up to your users.
Per-user overhead from JupyterHub is typically negligible
up to at least a few hundred concurrent active users.

```{figure} ../images/mybinder-hub-components-cpu-memory.png
JupyterHub component resource usage for mybinder.org.
```

## Factors to consider

### Static vs elastic resources

A big factor in planning resources is:
**how much does it cost to change your mind?**
If you are using a single shared machine with local storage,
migrating to a new one because it turns out your users don't fit might be very costly.
You will have to get a new machine, set it up, and maybe even migrate user data.

On the other hand, if you are using ephemeral resources,
such as node pools in Kubernetes,
@@ -70,26 +72,26 @@ but which are **less predictable**.
(limits-requests)=

### Limit vs Request for resources

Many scheduling tools like Kubernetes have two separate ways of allocating resources to users.
A **Request** or **Reservation** describes how much resources are _set aside_ for each user.
Often, this doesn't have any practical effect other than deciding when a given machine is considered 'full'.
If you are using expandable resources like an autoscaling Kubernetes cluster,
a new node must be launched and added to the pool if you 'request' more resources than fit on currently running nodes (a cluster **scale-up event**).
If you are running on a single VM, this describes how many users you can run at the same time, full stop.

A **Limit**, on the other hand, enforces a limit to how much resources any given user can consume.
For more information on what happens when users try to exceed their limits, see [](oversubscription).

In the strictest, safest case, you can have these two numbers be the same.
That means that each user is _limited_ to fit within the resources allocated to it.
This avoids **[oversubscription](oversubscription)** of resources (allowing use of more than you have available),
at the expense (in a literal, this-costs-money sense) of reserving lots of usually-idle capacity.

However, you often find that a small fraction of users use more resources than others.
In this case you may give users limits that _go beyond the amount of resources requested_.
This is called **oversubscribing** the resources available to users.

Having a gap between the request and the limit means you can fit a number of _typical_ users on a node (based on the request),
but still limit how much a runaway user can gobble up for themselves.
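To make this concrete, here is a minimal sketch of how such a request/limit gap might be expressed in a `jupyterhub_config.py`, assuming a Kubernetes deployment spawning user pods with KubeSpawner; the specific numbers are placeholders, not recommendations:

```python
# jupyterhub_config.py -- a sketch assuming KubeSpawner; values are illustrative
c = get_config()  # noqa: F821 -- provided by JupyterHub when loading the config

# Request (reservation): what is set aside per user.
# The scheduler uses this to decide when a node is 'full'
# and when a scale-up event is needed.
c.KubeSpawner.mem_guarantee = "512M"
c.KubeSpawner.cpu_guarantee = 0.5

# Limit: the most any single user may consume.
# The gap between guarantee and limit is the oversubscription discussed above.
c.KubeSpawner.mem_limit = "2G"
c.KubeSpawner.cpu_limit = 2.0
```

With these placeholder values, eight _typical_ users reserve only ~4G of memory on a node, while any one runaway user is still capped at 2G.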
@@ -98,13 +100,14 @@ but still limit how much a runaway user can gobble up for themselves.
### Oversubscribed CPU is okay, running out of memory is bad

An important consideration when assigning resources to users is: **What happens when users need more than I've given them?**

A good summary to keep in mind:

> When tasks don't get enough CPU, things are slow.
> When they don't get enough memory, things are broken.

This means it's **very important that users have enough memory**,
but much less important that they always have exclusive access to all the CPU they can use.
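The difference is easy to demonstrate outside JupyterHub. This toy Python sketch (Linux-only; `resource` is in the standard library) caps its own address space and then overshoots the cap:

```python
import resource

# A toy demonstration, not JupyterHub-specific:
# cap this process's address space at ~1 GB, then try to exceed it.
cap = 1024**3
resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

try:
    buf = bytearray(2 * 1024**3)  # ~2 GB: well past the cap
except MemoryError:
    # Exceeding a memory limit is a hard failure: the work is simply lost.
    print("broken, not slow: the allocation failed outright")
```

A CPU-starved task, by contrast, completes the same work, just more slowly.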
This relates to [Limits and Requests](limits-requests),
@@ -125,7 +128,7 @@ meaning that the total reserved resources isn't enough for the total _actual_ co
This doesn't mean that _all_ your users exceed the request,
just that the _limit_ gives enough room for the _average_ user to exceed the request.

### Example case for oversubscribing memory

Take for example, this system and sampling of user behavior:
@@ -143,10 +146,10 @@ But _not_ everyone uses the full limit, which is the point!
This pattern is fine if 1/8 of your users are 'heavy' because _typical_ usage will be ~0.7G,
and your total usage will be ~5.5G (`1 × 2 + 7 × 0.5 = 5.5`).

But if _50%_ of your users are 'heavy' you have a problem because that means your users will be trying to use 10G (`4 × 2 + 4 × 0.5 = 10`),
which you don't have.
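If it helps, the arithmetic is simple enough to check in a few lines of Python (the 2G limit, 0.5G request, and 8-users-per-node figures come from the example above):

```python
# Expected memory use for 8 users on one node: 'heavy' users sit at the
# 2G limit, everyone else stays near the 0.5G request.
heavy_gb, typical_gb = 2.0, 0.5

one_in_eight_heavy = 1 * heavy_gb + 7 * typical_gb  # 5.5G -> fits
half_heavy = 4 * heavy_gb + 4 * typical_gb          # 10.0G -> doesn't fit
print(f"1/8 heavy: {one_in_eight_heavy}G, 1/2 heavy: {half_heavy}G")
```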
You can make guesses at these numbers, but the only _real_ way to get them is to measure (see [](measuring)).
### CPU:memory ratio
@@ -191,7 +194,9 @@ The limit here is actually Kubernetes' pods per node, not memory _or_ CPU.
This is likely an extreme case, as many Binder users come from clicking links on webpages
without any actual intention of running code.

```{figure} ../images/mybinder-load5.png
mybinder.org node CPU usage is low with 50-150 users sharing just 8 cores
```

## More tips