[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
2025-10-18 23:42:59 +00:00 · 2023-01-09 21:04:01 +00:00
parent 6679c389b5
commit 3caea2a463
1 changed files with 100 additions and 112 deletions
--- a/docs/source/explanation/concepts.md
+++ b/docs/source/explanation/concepts.md
@@ -1,21 +1,21 @@
 # JupyterHub: A conceptual overview
-JupyterHub is not what you think it is.  Most things you think are
+JupyterHub is not what you think it is. Most things you think are
 part of JupyterHub are actually handled by some other component, for
 example the spawner or notebook server itself, and it's not always
-obvious how the parts relate.  The knowledge contained here hasn't
+obvious how the parts relate. The knowledge contained here hasn't
 been assembled in one place before, and is essential to understand
 when setting up a sufficiently complex Jupyter(Hub) setup.
 This document was originally written to assist in debugging: very
 often, the actual problem is not where one thinks it is and thus
-people can't easily debug.  In order to tell this story, we start at
+people can't easily debug. In order to tell this story, we start at
 JupyterHub and go all the way down to the fundamental components of
 Jupyter.
 In this document, we occasionally leave things out or bend the truth
 where it helps in explanation, and give our explanations in terms of
-Python even though Jupyter itself is language-neutral.  The "(&)"
+Python even though Jupyter itself is language-neutral. The "(&)"
 symbol highlights important points where this page leaves out or bends
 the truth for simplification of explanation, but there is more if you
 dig deeper.
@@ -24,38 +24,35 @@ This guide is long, but after reading it you will know of all major
 components in the Jupyter ecosystem and everything else you read
 should make sense.
 ## What is Jupyter?
-Before we get too far, let's remember what our end goal is.  A
+Before we get too far, let's remember what our end goal is. A
 **Jupyter Notebook** is nothing more than a Python(&) process
 which is getting commands from a web browser and displaying the output
-via that browser.  What the process actually sees is roughly like
+via that browser. What the process actually sees is roughly like
 getting commands on standard input(&) and writing to standard
-output(&).  There is nothing intrinsically special about this process
+output(&). There is nothing intrinsically special about this process
 - it can do anything a normal Python process can do, and nothing more.
-The **Jupyter kernel** handles capturing output and converting things
+  The **Jupyter kernel** handles capturing output and converting things
-such as graphics to a form usable by the browser.
+  such as graphics to a form usable by the browser.
 Everything we explain below is building up to this, going through many
 different layers which give you many ways of customizing how this
 process runs.
 ## JupyterHub
 **JupyterHub** is the central piece that provides multi-user
 login capabilities. Despite this, the end user only briefly interacts with
 JupyterHub and most of the actual Jupyter session does not relate to
 the hub at all: the hub mainly handles authentication and creating (JupyterHub calls it "spawning") the
-single-user server.  In short, anything which is related to *starting*
+single-user server. In short, anything which is related to _starting_
 the user's workspace/environment is about JupyterHub, anything about
-*running* usually isn't.
+_running_ usually isn't.
 If you have problems connecting the authentication, spawning, and the
-proxy (explained below), the issue is usually with JupyterHub.  To
+proxy (explained below), the issue is usually with JupyterHub. To
 debug, JupyterHub has extensive logs which get printed to its console
 and can be used to discover most problems.
@@ -63,41 +60,40 @@ The main pieces of JupyterHub are:
 ### Authenticator
-JupyterHub itself doesn't actually manage your users.  It has a
+JupyterHub itself doesn't actually manage your users. It has a
 database of users, but it is usually connected with some other system
-that manages the usernames and passwords.  When someone tries to log
+that manages the usernames and passwords. When someone tries to log
 in to JupyterHub, it asks the
 **authenticator**([basics](authenticators-users-basics),
 [reference](../reference/authenticators)) if the
-username/password is valid(&).  The authenticator returns a username(&),
+username/password is valid(&). The authenticator returns a username(&),
 which is passed on to the spawner, which has to use it to start that
-user's environment.  The authenticator can also return user
+user's environment. The authenticator can also return user
 groups and admin status of users, so that JupyterHub can do some
 higher-level management.
 The following authenticators are included with JupyterHub:
 - **PAMAuthenticator** uses the standard Unix/Linux operating system
-  functions to check users.  Roughly, if someone already has access to
+  functions to check users. Roughly, if someone already has access to
  the machine (they can log in by ssh), they will be able to log in to
-  JupyterHub without any other setup.  Thus, JupyterHub fills the role
+  JupyterHub without any other setup. Thus, JupyterHub fills the role
  of an ssh server, but provides a web-browser-based way to access the
  machine.
 There are [plenty of others to choose from](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
 You can connect to almost any other existing service to manage your
-users.  You either use all users from this other service (e.g. your
+users. You either use all users from this other service (e.g. your
 company), or enable only the allowed users (e.g. your group's
-GitHub usernames).  Some other popular authenticators include:
+GitHub usernames). Some other popular authenticators include:
 - **OAuthenticator** uses the standard OAuth protocol to verify users.
  For example, you can easily use GitHub to authenticate your users -
-  people have a "click to login with GitHub" button.  This is often
+  people have a "click to login with GitHub" button. This is often
  done with an allowlist to only allow certain users.
 - **NativeAuthenticator** actually stores and validates its own
-  usernames and passwords, unlike most other authenticators.  Thus,
+  usernames and passwords, unlike most other authenticators. Thus,
  you can manage all your users within JupyterHub only.
 - There are authenticators for LTI (learning management systems),
@@ -111,7 +107,7 @@ The authenticator runs internally to the Hub process but communicates
 with outside services.
 If you have trouble logging in, this is usually a problem of the
-authenticator.  The authenticator logs are part of the JupyterHub
+authenticator. The authenticator logs are part of the JupyterHub
 logs, but there may also be relevant information in whatever external
 services you are using.
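As a minimal sketch of how an authenticator is selected, here is a `jupyterhub_config.py` fragment. It assumes the built-in PAMAuthenticator and two hypothetical usernames (`alice`, `bob`); it is a config fragment that JupyterHub loads, not a standalone script.

```python
# jupyterhub_config.py (fragment, loaded by JupyterHub rather than run directly)
c = get_config()  # noqa: F821 (get_config is provided by JupyterHub at load time)

# Choose the authenticator. "pam" is the short name for PAMAuthenticator.
c.JupyterHub.authenticator_class = "pam"

# Only let these (hypothetical) usernames log in.
c.Authenticator.allowed_users = {"alice", "bob"}

# Users who get admin status in the hub (hypothetical).
c.Authenticator.admin_users = {"alice"}
```

Swapping in another authenticator is mostly a matter of changing `authenticator_class` and adding the options that authenticator documents.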
@@ -120,10 +116,10 @@ services you are using.
 The **spawner** ([basics](spawners-basics),
 [reference](../reference/spawners)) is the real core of
 JupyterHub: when someone wants a notebook server, the spawner allocates
-resources and starts the server.  The notebook server could run on the
+resources and starts the server. The notebook server could run on the
 same machine as JupyterHub, on another machine, on some cloud service,
-or more.  Administrators can limit resources (CPU, memory) or isolate users
+or more. Administrators can limit resources (CPU, memory) or isolate users
-from each other - if the spawner supports it.  They can also do no
+from each other - if the spawner supports it. They can also do no
 limiting and allow any user to access any other user's files if they
 are not configured properly.
@@ -131,10 +127,10 @@ Some basic spawners included in JupyterHub are:
 - **LocalProcessSpawner** is built into JupyterHub. Upon launch it tries
  to switch users to the given username (`su` (&)) and start the
-  notebook server.  It requires that the hub be run as root (because
+  notebook server. It requires that the hub be run as root (because
  only root has permission to start processes as other user IDs).
  LocalProcessSpawner is no different than a user logging in with
-  something like `ssh` and running `jupyter notebook`.  PAMAuthenticator and
+  something like `ssh` and running `jupyter notebook`. PAMAuthenticator and
  LocalProcessSpawner is the most basic way of using JupyterHub (and
  what it does out of the box) and makes the hub not too dissimilar to
  an advanced ssh server.
@@ -143,10 +139,10 @@ There are [many more advanced spawners](/reference/spawners), and to
 show the diversity of spawning strategies, some are listed below:
 - **SudoSpawner** is like LocalProcessSpawner but lets you run
-  JupyterHub without root.  `sudo` has to be configured to allow the
+  JupyterHub without root. `sudo` has to be configured to allow the
  hub's user to run processes under other user IDs.
- **SystemdSpawner** uses Systemd to start other processes.  It can
+- **SystemdSpawner** uses Systemd to start other processes. It can
  isolate users from each other and provide resource limiting.
 - **DockerSpawner** runs stuff in Docker, a containerization system.
@@ -154,15 +150,15 @@ show the diversity of spawning strategys some are listed below:
  other container images to fully customize the environment.
 - **KubeSpawner** runs on Kubernetes, a cloud orchestration
-  system.  The spawner can easily limit users and provide cloud
+  system. The spawner can easily limit users and provide cloud
  scaling - but the spawner doesn't actually do that, Kubernetes
-  does.  The spawner just tells Kubernetes what to do.  If you want to
+  does. The spawner just tells Kubernetes what to do. If you want to
  get KubeSpawner to do something, first you would figure out how to
  do it in Kubernetes, then figure out how to tell KubeSpawner to tell
-  Kubernetes that.  Actually... this is true for most spawners.
+  Kubernetes that. Actually... this is true for most spawners.
 - **BatchSpawner** runs on computer clusters with batch job scheduling
-  systems (e.g. Slurm, HTCondor, PBS, etc.).  The user processes are run
+  systems (e.g. Slurm, HTCondor, PBS, etc.). The user processes are run
  as batch jobs, having access to all the data and software that the
  users normally will.
@@ -171,62 +167,61 @@ system, and to configure them right you need to know a bit about how
 the corresponding operating system service works.
 The spawner is responsible for the environment of the single-user
-notebook servers (described in the next section).  In the end, it just
+notebook servers (described in the next section). In the end, it just
 makes a choice about how to start these processes: for example, the
 Docker spawner starts a normal Docker container and runs the right
-command inside of it.  Thus, the spawner is responsible for setting
+command inside of it. Thus, the spawner is responsible for setting
 what kind of software and data is available to the user.
 The spawner runs internally to the Hub process but communicates with
-outside services.  It is configured by `c.JupyterHub.spawner_class` in
+outside services. It is configured by `c.JupyterHub.spawner_class` in
 `jupyterhub_config.py`.
 If a user tries to launch a notebook server and it doesn't work, the
 error is usually with the spawner or the notebook server (as described
-in the next section).  Each spawner outputs some logs to the main
+in the next section). Each spawner outputs some logs to the main
 JupyterHub logs, but may also have logs in other places depending on
 what services it interacts with (for example, the Docker spawner's
 logs go to the Docker system services, and Kubernetes logs are
 accessed through `kubectl`).
 ### Proxy
 The JupyterHub **proxy** relays connections between the users
-and their single-user notebook servers.  What this basically means is
+and their single-user notebook servers. What this basically means is
 that the hub itself can shut down and the proxy can continue to
-allow users to communicate with their notebook servers.  (This
+allow users to communicate with their notebook servers. (This
 further emphasizes that the hub is responsible for starting, not
-running, the notebooks).  By default, the hub starts the proxy
+running, the notebooks). By default, the hub starts the proxy
 automatically
 and stops the proxy when the hub stops (so that connections get
-interrupted).  But when you [configure the proxy to run
+interrupted). But when you [configure the proxy to run
 separately](../reference/separate-proxy),
 users' connections will continue to work even without the hub.
 The default proxy is **ConfigurableHttpProxy** which is simple but
-effective.  A more advanced option is the [**Traefik Proxy**](https://blog.jupyter.org/introducing-traefikproxy-a-new-jupyterhub-proxy-based-on-traefik-4839e972faf6),
+effective. A more advanced option is the [**Traefik Proxy**](https://blog.jupyter.org/introducing-traefikproxy-a-new-jupyterhub-proxy-based-on-traefik-4839e972faf6),
 which gives you redundancy and high-availability.
-When users "connect to JupyterHub", they *always* first connect to the
+When users "connect to JupyterHub", they _always_ first connect to the
-proxy and the proxy relays the connection to the hub.  Thus, the proxy
+proxy and the proxy relays the connection to the hub. Thus, the proxy
 is responsible for SSL and accepting connections from the rest of the
-internet.  The user uses the hub to authenticate and start the server,
+internet. The user uses the hub to authenticate and start the server,
 and then the hub connects back to the proxy to adjust the proxy routes
 for the user's server (e.g. the web path `/user/someone` redirects to
-the server of someone at a certain internal address).  The proxy has
+the server of someone at a certain internal address). The proxy has
 to be able to internally connect to both the hub and all the
 single-user servers.
 The proxy always runs as a separate process to JupyterHub (even though
-JupyterHub can start it for you).  JupyterHub has one set of
+JupyterHub can start it for you). JupyterHub has one set of
 configuration options for the proxy addresses (`bind_url`) and one for
-the hub (`hub_bind_url`).  If `bind_url` is given, it is just passed to
+the hub (`hub_bind_url`). If `bind_url` is given, it is just passed to
 the automatic proxy to tell it what to do.
 If you have problems after users are redirected to their single-user
 notebook servers, or making the first connection to the hub, it is
-usually caused by the proxy.  The ConfigurableHttpProxy's logs are
+usually caused by the proxy. The ConfigurableHttpProxy's logs are
 mixed with JupyterHub's logs if it's started through the hub (the
 default case), otherwise from whatever system runs the proxy (if you
 do configure it, you'll know).
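To make the routing idea above concrete, here is a small illustrative model of a proxy routing table. It is not ConfigurableHttpProxy's actual code: the longest-prefix matching rule, the function name `resolve_route`, and all addresses are assumptions made for this sketch.

```python
from typing import Dict, Optional


def resolve_route(routes: Dict[str, str], path: str) -> Optional[str]:
    """Return the target of the longest route prefix that matches `path`."""
    best = None
    for prefix in routes:
        base = prefix.rstrip("/")
        # A route matches its own path or anything below it.
        if path == base or path.startswith(base + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None


# Hypothetical routing table: everything goes to the hub by default,
# except paths belonging to a running single-user server.
routes = {
    "/": "http://127.0.0.1:8081",             # the hub (assumed address)
    "/user/someone": "http://10.0.0.5:8888",  # someone's server (assumed address)
}

print(resolve_route(routes, "/user/someone/lab"))
print(resolve_route(routes, "/hub/login"))
```

In this model, the hub "adjusting the proxy routes" after a spawn amounts to adding a new `/user/<name>` entry to the table, and removing it again when the server stops.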
@@ -237,28 +232,28 @@ JupyterHub has the concept of **services**
 ([basics](services-basics),
 [reference](../reference/services)), which are other web services
 started by the hub, but otherwise are not necessarily related to the
-hub itself.  They are often used to do things related to Jupyter
+hub itself. They are often used to do things related to Jupyter
 (things that user interacts with, usually not the hub), but could
-always be run some other way.  Running from the hub provides an easy
+always be run some other way. Running from the hub provides an easy
-way to get Hub API tokens and authenticate users against the hub.  It
+way to get Hub API tokens and authenticate users against the hub. It
 can also automatically add a proxy route to forward web requests to
 that service.
 A common example of a service is the [cull idle
 servers](https://jupyterhub.readthedocs.io/en/stable/getting-started/services-basics.html#real-world-example-to-cull-idle-servers)
-script.  When started by the hub, it automatically gets admin API
+script. When started by the hub, it automatically gets admin API
-tokens.  It uses the API to list all running servers, compare against
+tokens. It uses the API to list all running servers, compare against
-activity timeouts, and shut down servers exceeding the limits.  Even
+activity timeouts, and shut down servers exceeding the limits. Even
 though this is an intrinsic part of JupyterHub, it is only loosely
 coupled and running as a service provides convenience of
 authentication - it could be just as well run some other way, with a
 manually provided API token.
-Another example is *sharing files using hubshare*.  Hubshare would work as an external service
+Another example is _sharing files using hubshare_. Hubshare would work as an external service
 which user notebooks talk to and use Hub authentication, but otherwise
-it isn't directly a matter of the hub.  You could equally well share
+it isn't directly a matter of the hub. You could equally well share
 files by other extensions to the single-user notebook servers or
-configure the spawners to access shared storage spaces.  In order to
+configure the spawners to access shared storage spaces. In order to
 use something such as hubshare, the difficulty is not modifying
 JupyterHub: it is modifying the notebook servers to speak to some
 service, and making that service.
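The decision logic of the idle-culling service described above can be sketched in a few lines. This is only a model of the "compare against activity timeouts" step: the record layout (`name`, `last_activity`, `server_running`) and the function name `servers_to_cull` are hypothetical stand-ins for what the Hub API returns, and the real service also makes the API calls to list and stop servers.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def servers_to_cull(users: List[Dict], now: datetime, timeout: timedelta) -> List[str]:
    """Return names of users whose running server has been idle longer than `timeout`."""
    idle = []
    for user in users:
        if not user.get("server_running"):
            continue  # nothing to shut down
        last_activity = datetime.fromisoformat(user["last_activity"])
        if now - last_activity > timeout:
            idle.append(user["name"])
    return idle


# Hypothetical data, shaped loosely like a user listing.
users = [
    {"name": "alice", "last_activity": "2023-01-09T20:00:00+00:00", "server_running": True},
    {"name": "bob", "last_activity": "2023-01-09T20:45:00+00:00", "server_running": True},
    {"name": "carol", "last_activity": "2023-01-09T18:00:00+00:00", "server_running": False},
]
now = datetime(2023, 1, 9, 21, 0, tzinfo=timezone.utc)

print(servers_to_cull(users, now, timedelta(minutes=30)))
```

Nothing here depends on being started by the hub, which is the point made above: the same logic works from any process that holds a valid API token.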
@@ -269,8 +264,6 @@ services from the hub.
 When a service is started from JupyterHub automatically, its logs are
 included in the JupyterHub logs.
 ## Single-user notebook server
 The **single-user notebook server** is the same thing you get by
@@ -278,12 +271,12 @@ running `jupyter notebook` or `jupyter lab` from the command line -
 the actual Jupyter user interface for a single person.
 The role of the spawner is to start this server - basically, running
-the command `jupyter notebook`.  Actually it doesn't run that, it runs
+the command `jupyter notebook`. Actually it doesn't run that, it runs
 `jupyterhub-singleuser` which first communicates with the hub to say
-"I'm alive" before running a completely normal Jupyter server.  The
+"I'm alive" before running a completely normal Jupyter server. The
-single-user server can be JupyterLab or classic notebooks.  By this
+single-user server can be JupyterLab or classic notebooks. By this
 point, the hub is almost completely out of the picture (the web
-traffic is going through proxy unchanged).  Also by this time, the
+traffic is going through proxy unchanged). Also by this time, the
 spawner has already decided the environment which this single-user
 server will have and the single-user server has to deal with that.
@@ -293,9 +286,9 @@ environment variables like `JUPYTERHUB_API_TOKEN` and
 back to the hub in order to say that it's ready.
 The single-user server options are **JupyterLab** and **classic
-Jupyter Notebook**.  They both run through the same backend server process--the web
+Jupyter Notebook**. They both run through the same backend server process--the web
-frontend is an option when it is starting.  The spawner can choose the
+frontend is an option when it is starting. The spawner can choose the
-command line when it starts the single-user server.  Extensions are a
+command line when it starts the single-user server. Extensions are a
 property of the single-user server (in two parts: there can be a part
 that runs in the Python server process, and parts that run in
 javascript in lab or notebook).
@@ -303,21 +296,21 @@ javascript in lab or notebook).
 If one wants to install software for users, it is not a matter of
 "installing it for JupyterHub" - it's a matter of installing it for the
 single-user server, which might be the same environment as the hub,
-but not necessarily.  (see below - it's a matter of the kernels!)
+but not necessarily. (see below - it's a matter of the kernels!)
 After the single-user notebook server is started, any errors are only
-an issue of the single-user notebook server.  Sometimes, it seems like
+an issue of the single-user notebook server. Sometimes, it seems like
 the spawner is failing, but really the spawner is working but the
 single-user notebook server dies right away (in this case, you need to
 find the problem with the single-user server and adjust the spawner to
-start it correctly or fix the environment).  This can happen, for
+start it correctly or fix the environment). This can happen, for
 example, if the spawner doesn't set an environment variable or doesn't
 provide storage.
 The single-user server's logs are printed to stdout/stderr, and the
 spawner decides where those streams are directed, so if you
 notice problems at this phase you need to check your spawner for
-instructions for accessing the single-user logs.  For example, the
+instructions for accessing the single-user logs. For example, the
 LocalProcessSpawner logs are just outputted to the same JupyterHub
 output logs (TODO is this correct?), the SystemdSpawner logs are
 written to the Systemd journal, Docker and Kubernetes logs are written
@@ -325,50 +318,47 @@ to Docker and Kubernetes respectively, and batchspawner output goes to
 the normal output places of batch jobs and is an explicit
 configuration option of the spawner.
 **(Jupyter) Notebook** is the classic interface, where each notebook
-opens in a separate tab.  It is traditionally started by `jupyter
+opens in a separate tab. It is traditionally started by `jupyter
-notebook`.  Does anything need to be said here?
+notebook`. Does anything need to be said here?
 **JupyterLab** is the new interface, where multiple notebooks are
-openable in the same tab in an IDE-like environment.  It is
+openable in the same tab in an IDE-like environment. It is
-traditionally started with `jupyter lab`.  Both Notebook and Lab use
+traditionally started with `jupyter lab`. Both Notebook and Lab use
 the same `.ipynb` file format.
 JupyterLab is run through the same server file, but at a path `/lab`
-instead of `/tree`.  Thus, they can be active at the same time in the
+instead of `/tree`. Thus, they can be active at the same time in the
 backend and you can switch between them at runtime by changing your
 URL path.
 Extensions need to be re-written for JupyterLab (if moving from
-classic notebooks).  But, the server-side of the extensions can be
+classic notebooks). But, the server-side of the extensions can be
 shared by both.
 ## Kernel
 The commands you run in the notebook session are not executed in the same process as
-the notebook itself, but in a separate **Jupyter kernel**.  There are [many
+the notebook itself, but in a separate **Jupyter kernel**. There are [many
 kernels
 available](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels).
 As a basic approximation, a **Jupyter kernel** is a process which
 accepts commands (cells that are run) and returns the output to
-Jupyter to display.  One example is the **IPython Jupyter kernel**,
+Jupyter to display. One example is the **IPython Jupyter kernel**,
-which runs Python.  There is nothing special about it, it can be
+which runs Python. There is nothing special about it, it can be
-considered a *normal Python process.  The kernel process can be
+considered a \*normal Python process. The kernel process can be
 approximated in UNIX terms as a process that takes commands on stdin
-and returns stuff on stdout(&).  Obviously, it's more because it has
+and returns stuff on stdout(&). Obviously, it's more because it has
 to be able to disentangle all the possible outputs, such as figures,
 and present it to the user in a web browser.
 Kernel communication is via the ZeroMQ protocol on the local
-computer.  Kernels are separate processes from the main single-user
+computer. Kernels are separate processes from the main single-user
 notebook server (and thus obviously, different from the JupyterHub
-process and everything else).  By default (and unless you do something
+process and everything else). By default (and unless you do something
 special), kernels share the same environment as the notebook server
-(data, resource limits, permissions, user id, etc.).  But they *can*
+(data, resource limits, permissions, user id, etc.). But they _can_
 run in a separate Python environment from the single-user server
 (search `--prefix` in the [ipykernel installation
 instructions](https://ipython.readthedocs.io/en/stable/install/kernel_install.html))
@@ -383,21 +373,21 @@ A kernel doesn't just execute its language - cell magics such as `%`,
 IPython kernel commands and don't necessarily work in any other
 kernel unless they specifically support them.
-Kernels are yet *another* layer of configurability.
+Kernels are yet _another_ layer of configurability.
 Each kernel can run a different programming language, with different
-software, and so on.  By default, they would run in the same
+software, and so on. By default, they would run in the same
 environment as the single-user notebook server, and the most common
 other way they are configured is by
 running in different Python virtual environments or conda
-environments.  They can be started and killed independently (there is
+environments. They can be started and killed independently (there is
-normally one per notebook you have open).  The kernel uses
+normally one per notebook you have open). The kernel uses
 most of your memory and CPU when running Jupyter - the rest of the web
 interface has a small footprint.
 You can list your installed kernels with `jupyter kernelspec list`.
 If you look at one of the `kernel.json` files in those directories, you
-will see exactly what command is run.  These are normally
+will see exactly what command is run. These are normally
-automatically made by the kernels, but can be edited as needed.  [The
+automatically made by the kernels, but can be edited as needed. [The
 spec](https://jupyter-client.readthedocs.io/en/stable/kernels.html)
 tells you even more.
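As a rough illustration of what a kernelspec contains, the snippet below parses a typical `kernel.json` for the IPython kernel and prints the command it describes. The JSON is embedded as a string so the example is self-contained; the exact contents on your system may differ (for instance, `argv[0]` is often a full path to a specific Python interpreter).

```python
import json

# A typical kernel.json for the IPython kernel (contents vary per installation).
kernel_json = """
{
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3",
  "language": "python"
}
"""

spec = json.loads(kernel_json)

# "argv" is the command that starts the kernel. Jupyter replaces
# {connection_file} with the path of a connection file at launch time.
command = " ".join(spec["argv"])
print(spec["display_name"], "->", command)
```

Pointing `argv[0]` at the Python executable of another virtual environment or conda environment is the usual way a kernel ends up running in a different environment from the single-user server.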
@@ -406,39 +396,37 @@ but the gateways mentioned above can get around that limitation.
 If you get problems with "Kernel died" or some other error in a single
 notebook but the single-user notebook server stays working, it is
-usually a problem with the kernel.  It could be that you are trying to
+usually a problem with the kernel. It could be that you are trying to
 use more resources than you are allowed and the symptom is the kernel
-getting killed.  It could be that it crashes for some other reason.
+getting killed. It could be that it crashes for some other reason.
 In these cases, you need to find the kernel logs and investigate.
 The debug logs for the kernel are normally mixed in with the
 single-user notebook server logs.
 ## JupyterHub distributions
 There are several "distributions" which automatically install all of
-the things above and configure them for a certain purpose.  They are
+the things above and configure them for a certain purpose. They are
 good ways to get started, but if you have custom needs, eventually it
 may become hard to adapt them to your requirements.
-* [**Zero to JupyterHub with
+- [**Zero to JupyterHub with
  Kubernetes**](https://zero-to-jupyterhub.readthedocs.io/) installs
-  an entire scalable system using Kubernetes.  Uses KubeSpawner,
+  an entire scalable system using Kubernetes. Uses KubeSpawner,
  ....Authenticator, ....
-* [**The Littlest JupyterHub**](https://tljh.jupyter.org/) installs JupyterHub on a single system
+- [**The Littlest JupyterHub**](https://tljh.jupyter.org/) installs JupyterHub on a single system
  using SystemdSpawner and NativeAuthenticator (which manages users
  itself).
-* [**JupyterHub the hard way**](https://github.com/jupyterhub/jupyterhub-the-hard-way/blob/master/docs/installation-guide-hard.md)
+- [**JupyterHub the hard way**](https://github.com/jupyterhub/jupyterhub-the-hard-way/blob/master/docs/installation-guide-hard.md)
-  takes you through everything yourself.  It is a natural companion to
+  takes you through everything yourself. It is a natural companion to
  this guide, since you get to experience every little bit.
 ## What's next?
-Now you know everything.  Well, you know how everything relates, but
+Now you know everything. Well, you know how everything relates, but
 there are still plenty of details, implementations, and exceptions.
 When setting up JupyterHub, the first step is to consider the above
 layers, decide the right option for each of them, then begin putting