Merge branch 'main' into copyediting

2025-10-14 05:23:01 +00:00 · 2022-11-24 09:40:21 +01:00
parent e243812745 1cf13bea66
commit de5fb1e7ce
328 changed files with 35662 additions and 7475 deletions
--- a/docs/source/getting-started/authenticators-users-basics.md
+++ b/docs/source/getting-started/authenticators-users-basics.md
@@ -1,40 +1,51 @@
 # Authentication and User Basics

-The default Authenticator uses [PAM][] to authenticate system users with
+The default Authenticator uses [PAM][] (Pluggable Authentication Module) to authenticate system users with
 their username and password. With the default Authenticator, any user
 with an account and password on the system will be allowed to login.

-## Create a whitelist of users
-
-You can restrict which users are allowed to login with a whitelist,
-`Authenticator.whitelist`:
+## Create a set of allowed users (`allowed_users`)

+You can restrict which users are allowed to log in with a set,
+`Authenticator.allowed_users`:

 ```python
-c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'}
+c.Authenticator.allowed_users = {'mal', 'zoe', 'inara', 'kaylee'}
 ```

-Users in the whitelist are added to the Hub database when the Hub is
+Users in the `allowed_users` set are added to the Hub database when the Hub is
 started.

+```{warning}
+If this configuration value is not set, then **all authenticated users will be allowed into your hub**.
+```
+
 ## Configure admins (`admin_users`)

+```{note}
+As of JupyterHub 2.0, the full permissions of `admin_users`
+should not be required.
+Instead, you can assign [roles](define-role-target) to users or groups
+with only the scopes they require.
+```
+
 Admin users of JupyterHub, `admin_users`, can add and remove users from
-the user `whitelist`. `admin_users` can take actions on other users'
+the user `allowed_users` set. `admin_users` can take actions on other users'
 behalf, such as stopping and restarting their servers.

-A set of initial admin users, `admin_users` can configured be as follows:
+A set of initial admin users, `admin_users`, can be configured as follows:

 ```python
 c.Authenticator.admin_users = {'mal', 'zoe'}
 ```
-Users in the admin list are automatically added to the user `whitelist`,
+
+Users in the admin set are automatically added to the user `allowed_users` set,
 if they are not already present.

-Each authenticator may have different ways of determining whether a user is an
-administrator. By default JupyterHub use the PAMAuthenticator which provide the
-`admin_groups` option and can determine administrator status base on a user
-groups. For example we can let any users in the `wheel` group be admin:
+Each Authenticator may have different ways of determining whether a user is an
+administrator. By default, JupyterHub uses the PAMAuthenticator, which provides the
+`admin_groups` option and can set administrator status based on a user's
+group membership. For example, we can let any user in the `wheel` group be an admin:

 ```python
 c.PAMAuthenticator.admin_groups = {'wheel'}
@@ -42,35 +53,35 @@ c.PAMAuthenticator.admin_groups = {'wheel'}

 ## Give admin access to other users' notebook servers (`admin_access`)

-Since the default `JupyterHub.admin_access` setting is False, the admins
+Since the default `JupyterHub.admin_access` setting is `False`, the admins
 do not have permission to log in to the single user notebook servers
-owned by *other users*. If `JupyterHub.admin_access` is set to True,
-then admins have permission to log in *as other users* on their
-respective machines, for debugging. **As a courtesy, you should make
+owned by _other users_. If `JupyterHub.admin_access` is set to `True`,
+then admins have permission to log in _as other users_ on their
+respective machines for debugging. **As a courtesy, you should make
 sure your users know if admin_access is enabled.**

 ## Add or remove users from the Hub

-Users can be added to and removed from the Hub via either the admin
+Users can be added to and removed from the Hub via the admin
 panel or the REST API. When a user is **added**, the user will be
-automatically added to the whitelist and database. Restarting the Hub
-will not require manually updating the whitelist in your config file,
+automatically added to the `allowed_users` set and database. Restarting the Hub
+will not require manually updating the `allowed_users` set in your config file,
 as the users will be loaded from the database.

 After starting the Hub once, it is not sufficient to **remove** a user
-from the whitelist in your config file. You must also remove the user
+from the `allowed_users` set in your config file. You must also remove the user
 from the Hub's database, either by deleting the user from JupyterHub's
 admin page, or you can clear the `jupyterhub.sqlite` database and start
 fresh.

 ## Use LocalAuthenticator to create system users

-The `LocalAuthenticator` is a special kind of authenticator that has
+The `LocalAuthenticator` is a special kind of Authenticator that has
 the ability to manage users on the local system. When you try to add a
 new user to the Hub, a `LocalAuthenticator` will check if the user
 already exists. If you set the configuration value, `create_system_users`,
 to `True` in the configuration file, the `LocalAuthenticator` has
-the privileges to add users to the system. The setting in the config
+the ability to add users to the system. The setting in the config
 file is:

 ```python
@@ -80,7 +91,7 @@ c.LocalAuthenticator.create_system_users = True
 Adding a user to the Hub that doesn't already exist on the system will
 result in the Hub creating that user via the system `adduser` command
 line tool. This option is typically used on hosted deployments of
-JupyterHub, to avoid the need to manually create all your users before
+JupyterHub to avoid the need to manually create all your users before
 launching the service. This approach is not recommended when running
 JupyterHub in situations where JupyterHub users map directly onto the
 system's UNIX users.
@@ -90,27 +101,25 @@ system's UNIX users.
 JupyterHub's [OAuthenticator][] currently supports the following
 popular services:

- Auth0
- Bitbucket
- CILogon
- GitHub
- GitLab
- Globus
- Google
- MediaWiki
- Okpy
- OpenShift
+- [Auth0](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.auth0.html#module-oauthenticator.auth0)
+- [Azure AD](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.azuread.html#module-oauthenticator.azuread)
+- [Bitbucket](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.bitbucket.html#module-oauthenticator.bitbucket)
+- [CILogon](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.cilogon.html#module-oauthenticator.cilogon)
+- [GitHub](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.github.html#module-oauthenticator.github)
+- [GitLab](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.gitlab.html#module-oauthenticator.gitlab)
+- [Globus](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.globus.html#module-oauthenticator.globus)
+- [Google](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.google.html#module-oauthenticator.google)
+- [MediaWiki](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.mediawiki.html#module-oauthenticator.mediawiki)
+- [Okpy](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.okpy.html#module-oauthenticator.okpy)
+- [OpenShift](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.openshift.html#module-oauthenticator.openshift)

-NOTE: Open issue asking for more details on this generic implementation.
-It's not clear if this is a different implementation or if the JupyterHub OAuth
-_is_ the generic implementation.
-A generic implementation, which you can use for OAuth authentication
+A [generic implementation](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.generic.html#module-oauthenticator.generic), which you can use for OAuth authentication
 with any provider, is also available.

 ## Use DummyAuthenticator for testing

-The :class:`~jupyterhub.auth.DummyAuthenticator` is a simple authenticator that
-allows for any username/password unless if a global password has been set. If
+The `DummyAuthenticator` is a simple Authenticator that
+allows for any username or password unless a global password has been set. If
 set, it will allow for any username as long as the correct password is provided.
 To set a global password, add this to the config file:

@@ -118,5 +127,5 @@ To set a global password, add this to the config file:
 c.DummyAuthenticator.password = "some_password"
 ```

-[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
-[OAuthenticator]: https://github.com/jupyterhub/oauthenticator
+[pam]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
+[oauthenticator]: https://github.com/jupyterhub/oauthenticator
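
The interplay between `allowed_users` and `admin_users` described in this file can be sketched in plain Python. This is only an illustration of the documented rules (an unset `allowed_users` admits every authenticated user, and admins are added to the allowed set), not JupyterHub's actual implementation. The helper names `effective_allowed_users` and `is_allowed` are hypothetical, and the treatment of an empty allowed set as "no restriction" even when admins are configured is an assumption.

```python
def effective_allowed_users(allowed_users, admin_users):
    """Return the allowed set with admins folded in (hypothetical helper).

    An empty result means "no restriction", mirroring the warning that an
    unset `allowed_users` lets every authenticated user into the hub.
    """
    allowed = set(allowed_users)
    if not allowed:
        # Assumption: with no explicit allow set, admins do not restrict access.
        return set()
    return allowed | set(admin_users)


def is_allowed(username, allowed_users, admin_users):
    """Check one authenticated username against the effective allowed set."""
    allowed = effective_allowed_users(allowed_users, admin_users)
    return not allowed or username in allowed


if __name__ == "__main__":
    allowed = {"mal", "zoe", "inara", "kaylee"}
    admins = {"mal", "zoe", "wash"}
    for name in ("wash", "jayne"):
        print(name, is_allowed(name, allowed, admins))
```

In a real deployment these decisions are made by the Authenticator itself; the sketch only makes the two documented rules easier to reason about.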
--- a/docs/source/getting-started/config-basics.md
+++ b/docs/source/getting-started/config-basics.md
@@ -1,6 +1,6 @@
 # Configuration Basics

-The section contains basic information about configuring settings for a JupyterHub
+This section contains basic information about configuring settings for a JupyterHub
 deployment. The [Technical Reference](../reference/index)
 documentation provides additional details.

@@ -44,30 +44,30 @@ jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
 ```

 The IPython documentation provides additional information on the
-[config system](http://ipython.readthedocs.io/en/stable/development/config)
+[config system](http://ipython.readthedocs.io/en/stable/development/config.html)
 that Jupyter uses.

 ## Configure using command line options

-To display all command line options that are available for configuration:
+To display all command line options that are available for configuration, run the following command:

 ```bash
    jupyterhub --help-all
 ```

 Configuration using the command line options is done when launching JupyterHub.
-For example, to start JupyterHub on ``10.0.1.2:443`` with https, you
+For example, to start JupyterHub on `10.0.1.2:443` with https, you
 would enter:

 ```bash
    jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert
 ```

-All configurable options may technically be set on the command-line,
+All configurable options may technically be set on the command line,
 though some are inconvenient to type. To set a particular configuration
 parameter, `c.Class.trait`, you would use the command line option,
 `--Class.trait`, when starting JupyterHub. For example, to configure the
-`c.Spawner.notebook_dir` trait from the command-line, use the
+`c.Spawner.notebook_dir` trait from the command line, use the
 `--Spawner.notebook_dir` option:

 ```bash
@@ -77,24 +77,24 @@ jupyterhub --Spawner.notebook_dir='~/assignments'
 ## Configure for various deployment environments

 The default authentication and process spawning mechanisms can be replaced, and
-specific [authenticators](./authenticators-users-basics) and
-[spawners](./spawners-basics) can be set in the configuration file.
+specific [authenticators](authenticators-users-basics) and
+[spawners](spawners-basics) can be set in the configuration file.
 This enables JupyterHub to be used with a variety of authentication methods or
 process control and deployment environments. [Some examples](../reference/config-examples),
-meant as illustration, are:
+meant as illustrations, are:

 - Using GitHub OAuth instead of PAM with [OAuthenticator](https://github.com/jupyterhub/oauthenticator)
 - Spawning single-user servers with Docker, using the [DockerSpawner](https://github.com/jupyterhub/dockerspawner)

 ## Run the proxy separately

-This is *not* strictly necessary, but useful in many cases.  If you
-use a custom proxy (e.g. Traefik), this also not needed.
+This is _not_ strictly necessary, but useful in many cases. If you
+use a custom proxy (e.g. Traefik), this is also not needed.

 Connections to user servers go through the proxy, and *not* the hub
 itself.  If the proxy stays running when the hub restarts (for
 maintenance, re-configuration, etc.), then user connections are not
 interrupted.  For simplicity, by default the hub starts the proxy
 automatically, so if the hub restarts, the proxy restarts, and user
-connections are interrupted.  It is easy to run the proxy separately,
+connections are interrupted. It is easy to run the proxy separately;
 for information see [the separate proxy page](../reference/separate-proxy).
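
The `c.Class.trait` to `--Class.trait` mapping described in config-basics.md can be sketched as a small string-building helper. This is an illustration only: `trait_to_flag` is a hypothetical name, not part of JupyterHub or traitlets, and it assumes the value should be shell-quoted the way the `~/assignments` example in the docs is.

```python
import shlex


def trait_to_flag(class_name, trait, value):
    """Build a `--Class.trait=value` command line option (hypothetical helper).

    The value is shell-quoted so characters such as `~` or spaces reach
    JupyterHub unchanged instead of being expanded by the shell.
    """
    return f"--{class_name}.{trait}={shlex.quote(str(value))}"


if __name__ == "__main__":
    flag = trait_to_flag("Spawner", "notebook_dir", "~/assignments")
    print("jupyterhub", flag)
```

Quoting matters here because an unquoted `~` may be expanded by the shell to the home directory of whoever launches the Hub, rather than being passed through literally.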
--- a/docs/source/getting-started/faq.md
+++ b/docs/source/getting-started/faq.md
@@ -0,0 +1,35 @@
+# Frequently asked questions
+
+## How do I share links to notebooks?
+
+In short, where you see `/user/name/notebooks/foo.ipynb` use `/hub/user-redirect/notebooks/foo.ipynb` (replace `/user/name` with `/hub/user-redirect`).
+
+Sharing links to notebooks is a common activity,
+and can look different based on what you mean.
+Your first instinct might be to copy the URL you see in the browser,
+e.g. `hub.jupyter.org/user/yourname/notebooks/coolthing.ipynb`.
+However, let's break down what this URL means:
+
+`hub.jupyter.org/user/yourname/` is the URL prefix handled by _your server_,
+which means that sharing this URL is asking the person you share the link with
+to come to _your server_ and look at the exact same file.
+In most circumstances, this is forbidden by permissions because the person you share with does not have access to your server.
+What actually happens when someone visits this URL will depend on whether your server is running and other factors.
+
+But what is our actual goal?
+A typical situation is that you have some shared or common filesystem,
+such that the same path corresponds to the same document
+(either the exact same document or another copy of it).
+Typically, what folks want when they do sharing like this
+is for each visitor to open the same file _on their own server_,
+so Breq would open `/user/breq/notebooks/foo.ipynb` and
+Seivarden would open `/user/seivarden/notebooks/foo.ipynb`, etc.
+
+JupyterHub has a special URL that does exactly this!
+It's called `/hub/user-redirect/...`.
+So if you replace `/user/yourname` in your URL bar
+with `/hub/user-redirect` any visitor should get the same
+URL on their own server, rather than visiting yours.
+
+In JupyterLab 2.0, this should also be the result of the "Copy Shareable Link"
+action in the file browser.
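
The rewrite described in faq.md (replace `/user/name` with `/hub/user-redirect`) can be sketched as a small path transformation. This is a minimal illustration, assuming the Hub is served at the root of the host with no custom `base_url`; the helper name `to_shareable_path` is hypothetical.

```python
import re

# Matches a leading "/user/<name>" segment, where <name> contains no slash.
_USER_PREFIX = re.compile(r"^/user/[^/]+(?=/|$)")


def to_shareable_path(path):
    """Replace a leading /user/<name> with /hub/user-redirect.

    Paths that do not start with /user/<name> are returned unchanged.
    """
    return _USER_PREFIX.sub("/hub/user-redirect", path, count=1)


if __name__ == "__main__":
    print(to_shareable_path("/user/breq/notebooks/foo.ipynb"))
```

Each visitor who opens the rewritten path is then sent to the same file path on their own server, which only helps when the file actually exists there.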
--- a/docs/source/getting-started/index.rst
+++ b/docs/source/getting-started/index.rst
@@ -1,5 +1,10 @@
-Getting Started
-===============
+Get Started
+===========
+
+This section covers how to configure and customize JupyterHub for your
+needs. It contains information about authentication, networking, security, and
+other topics that are relevant to individuals or organizations deploying their
+own JupyterHub.

 .. toctree::
   :maxdepth: 2
@@ -10,3 +15,5 @@ Getting Started
   authenticators-users-basics
   spawners-basics
   services-basics
+   faq
+   institutional-faq
--- a/docs/source/getting-started/institutional-faq.md
+++ b/docs/source/getting-started/institutional-faq.md
@@ -0,0 +1,260 @@
+# Institutional FAQ
+
+This page contains common questions from users of JupyterHub,
+broken down by their roles within organizations.
+
+## For all
+
+### Is it appropriate for adoption within a larger institutional context?
+
+Yes! JupyterHub has been used at scale for large pools of users, as well
+as complex and high-performance computing. For example, UC Berkeley uses
+JupyterHub for its Data Science Education Program courses (serving over
+3,000 students). The Pangeo project uses JupyterHub to provide access
+to scalable cloud computing with Dask. JupyterHub is stable and customizable
+to the use cases of large organizations.
+
+### I keep hearing about Jupyter Notebook, JupyterLab, and now JupyterHub. What’s the difference?
+
+Here is a quick breakdown of these three tools:
+
+- **The Jupyter Notebook** is a document specification (the `.ipynb` file) that interweaves
+  narrative text with code cells and their outputs. It is also a graphical interface
+  that allows users to edit these documents. There are also several other graphical interfaces
+  that allow users to edit the `.ipynb` format (nteract, JupyterLab, Google Colab, Kaggle, etc.).
+- **JupyterLab** is a flexible and extensible user interface for interactive computing. It
+  has several extensions that are tailored for using Jupyter Notebooks, as well as extensions
+  for other parts of the data science stack.
+- **JupyterHub** is an application that manages interactive computing sessions for **multiple users**.
+  It also connects them with infrastructure those users wish to access. It can provide
+  remote access to Jupyter Notebooks and JupyterLab for many people.
+
+## For management
+
+### Briefly, what problem does JupyterHub solve for us?
+
+JupyterHub provides a shared platform for data science and collaboration.
+It allows users to utilize familiar data science workflows (such as the scientific Python stack,
+the R tidyverse, and Jupyter Notebooks) on institutional infrastructure. It also allows administrators
+some control over access to resources, security, environments, and authentication.
+
+### Is JupyterHub mature? Why should we trust it?
+
+Yes - the core JupyterHub application recently
+reached 1.0 status, and is considered stable and performant for most institutions.
+JupyterHub has also been deployed (along with other tools) to work on
+scalable infrastructure, large datasets, and high-performance computing.
+
+### Who else uses JupyterHub?
+
+JupyterHub is used at a variety of institutions in academia,
+industry, and government research labs. It is most commonly used by two kinds of groups:
+
+- Small teams (e.g., data science teams, research labs, or collaborative projects) to provide a
+  shared resource for interactive computing, collaboration, and analytics.
+- Large teams (e.g., a department, a large class, or a large group of remote users) to provide
+  access to organizational hardware, data, and analytics environments at scale.
+
+Here is a sample of organizations that use JupyterHub:
+
+- **Universities and colleges**: UC Berkeley, UC San Diego, Cal Poly SLO, Harvard University, University of Chicago,
+  University of Oslo, University of Sheffield, Université Paris Sud, University of Versailles
+- **Research laboratories**: NASA, NCAR, NOAA, the Large Synoptic Survey Telescope, Brookhaven National Lab,
+  Minnesota Supercomputing Institute, ALCF, CERN, Lawrence Livermore National Laboratory
+- **Online communities**: Pangeo, Quantopian, mybinder.org, MathHub, Open Humans
+- **Computing infrastructure providers**: NERSC, San Diego Supercomputer Center, Compute Canada
+- **Companies**: Capital One, SANDVIK code, Globus
+
+See the [Gallery of JupyterHub deployments](../gallery-jhub-deployments.md) for
+a more complete list of JupyterHub deployments at institutions.
+
+### How does JupyterHub compare with hosted products, like Google Colaboratory, RStudio.cloud, or Anaconda Enterprise?
+
+JupyterHub puts you in control of your data, infrastructure, and coding environment.
+In addition, it is vendor neutral, which reduces lock-in to a particular vendor or service.
+JupyterHub provides access to interactive computing environments in the cloud (similar to each of these services).
+Compared with the tools above, it is more flexible, more customizable, free, and
+gives administrators more control over their setup and hardware.
+
+Because JupyterHub is an open-source, community-driven tool, it can be extended and
+modified to fit an institution's needs. It plays nicely with the open source data science
+stack, and can serve a variety of computing environments, user interfaces, and
+computational hardware. It can also be deployed anywhere - on enterprise cloud infrastructure, on
+High-Performance-Computing machines, on local hardware, or even on a single laptop, which
+is not possible with most other tools for shared interactive computing.
+
+## For IT
+
+### How would I set up JupyterHub on institutional hardware?
+
+That depends on what kind of hardware you've got. JupyterHub is flexible enough to be deployed
+on a variety of hardware, including in-room hardware, on-prem clusters, cloud infrastructure,
+etc.
+
+The most common way to set up a JupyterHub is to use a JupyterHub distribution. Distributions are pre-configured
+and opinionated ways to set up a JupyterHub on particular kinds of infrastructure. The two distributions
+that we currently suggest are:
+
+- [Zero to JupyterHub for Kubernetes](https://z2jh.jupyter.org) is a scalable JupyterHub deployment and
+  guide that runs on Kubernetes. Better for larger or dynamic user groups (50-10,000) or more complex
+  compute/data needs.
+- [The Littlest JupyterHub](https://tljh.jupyter.org) is a lightweight JupyterHub that runs on a single
+  machine (in the cloud or under your desk). Better for smaller user groups (4-80) or more
+  lightweight computational resources.
+
+### Does JupyterHub run well in the cloud?
+
+Yes - most deployments of JupyterHub are run via cloud infrastructure and on a variety of cloud providers.
+Depending on the distribution of JupyterHub that you'd like to use, you can also connect your JupyterHub
+deployment with a number of other cloud-native services so that users have access to other resources from
+their interactive computing sessions.
+
+For example, if you use the [Zero to JupyterHub for Kubernetes](https://z2jh.jupyter.org) distribution,
+you'll be able to utilize container-based workflows of other technologies such as the [dask-kubernetes](https://kubernetes.dask.org/en/latest/)
+project for distributed computing.
+
+The Z2JH Helm Chart also has some functionality built in for auto-scaling your cluster up and down
+as more resources are needed - allowing you to utilize the benefits of a flexible cloud-based deployment.
+
+### Is JupyterHub secure?
+
+The short answer: yes. JupyterHub as a standalone application has been battle-tested at an institutional
+level for several years, and makes a number of "default" security decisions that are reasonable for most
+users.
+
+- For security considerations in the base JupyterHub application,
+  [see the JupyterHub security page](https://jupyterhub.readthedocs.io/en/stable/reference/websecurity.html).
+- For security considerations when deploying JupyterHub on Kubernetes, see the
+  [JupyterHub on Kubernetes security page](https://zero-to-jupyterhub.readthedocs.io/en/latest/security.html).
+
+The longer answer: it depends on your deployment. Because JupyterHub is very flexible, it can be used
+in a variety of deployment setups. This often entails connecting your JupyterHub to **other** infrastructure
+(such as a [Dask Gateway service](https://gateway.dask.org/)). There are many security decisions to be made
+in these cases, and the security of your JupyterHub deployment will often depend on these decisions.
+
+If you are worried about security, don't hesitate to reach out to the JupyterHub community in the
+[Jupyter Community Forum](https://discourse.jupyter.org/c/jupyterhub). This community of practice has many
+individuals with experience running secure JupyterHub deployments.
+
+### Does JupyterHub provide computing or data infrastructure?
+
+No - JupyterHub manages user sessions and can _control_ computing infrastructure, but it does not provide these
+things itself. You are expected to run JupyterHub on your own infrastructure (local or in the cloud). Moreover,
+JupyterHub has no internal concept of "data", but is designed to be able to communicate with data repositories
+(again, either locally or remotely) for use within interactive computing sessions.
+
+### How do I manage users?
+
+JupyterHub offers a few options for managing your users. Upon setting up a JupyterHub, you can choose what
+kind of **authentication** you'd like to use. For example, you can have users sign up with an institutional
+email address, or choose a username / password when they first log in, or offload authentication onto
+another service such as an organization's OAuth.
+
+The users of a JupyterHub are stored locally, and can be modified manually by an administrator of the JupyterHub.
+Moreover, the _active_ users on a JupyterHub can be found on the administrator's page. This page
+gives you the ability to stop or restart kernels, inspect user filesystems, and even take over user
+sessions to assist them with debugging.
+
+### How do I manage software environments?
+
+A key benefit of JupyterHub is the ability for an administrator to define the environment(s) that users
+have access to. There are many ways to do this, depending on what kind of infrastructure you're using for
+your JupyterHub.
+
+For example, **The Littlest JupyterHub** runs on a single VM. In this case, the administrator defines
+an environment by installing packages to a shared folder that exists on the path of all users. The
+**JupyterHub for Kubernetes** deployment uses Docker images to define environments. You can create your
+own list of Docker images that users can select from, and can also control things like the amount of
+RAM available to users, or the types of machines that their sessions will use in the cloud.
+
+### How does JupyterHub manage computational resources?
+
+For interactive computing sessions, JupyterHub controls computational resources via a **spawner**.
+Spawners define how a new user session is created, and are customized for particular kinds of
+infrastructure. For example, the KubeSpawner knows how to control a Kubernetes deployment
+to create new pods when users log in.
+
+For more sophisticated computational resources (like distributed computing), JupyterHub can
+connect with other infrastructure tools (like Dask or Spark). This allows users to control
+scalable or high-performance resources from within their JupyterHub sessions. The logic of
+how those resources are controlled is taken care of by the non-JupyterHub application.
+
+### Can JupyterHub be used with my high-performance computing resources?
+
+Yes - JupyterHub can provide access to many kinds of computing infrastructure.
+Especially when combined with other open-source schedulers such as Dask, you can manage fairly
+complex computing infrastructures from the interactive sessions of a JupyterHub. For example,
+[see the Dask HPC page](https://docs.dask.org/en/latest/setup/hpc.html).
+
+### How many resources do user sessions take?
+
+This is highly configurable by the administrator. If you wish for your users to have simple
+data analytics environments for prototyping and light data exploring, you can restrict their
+memory and CPU based on the resources that you have available. If you'd like your JupyterHub
+to serve as a gateway to high-performance compute or data resources, you may increase the
+resources available on user machines, or connect them with computing infrastructures elsewhere.
+
+### Can I customize the look and feel of a JupyterHub?
+
+JupyterHub provides some customization of the graphics displayed to users. The most common
+modification is to add custom branding to the JupyterHub login page, loading pages, and
+various elements that persist across all pages (such as headers).
+
+## For Technical Leads
+
+### Will JupyterHub “just work” with our team's interactive computing setup?
+
+Depending on the complexity of your setup, you'll have different experiences with "out of the box"
+distributions of JupyterHub. If all of the resources you need will fit on a single VM, then
+[The Littlest JupyterHub](https://tljh.jupyter.org) should get you up-and-running within
+a half day or so. For more complex setups, such as scalable Kubernetes clusters or access
+to high-performance computing and data, it will require more time and expertise with
+the technologies your JupyterHub will use (e.g., dev-ops knowledge with cloud computing).
+
+In general, the base JupyterHub deployment is not the bottleneck for setup; the bottleneck is connecting
+your JupyterHub with the various services and tools that you wish to provide to your users.
+
+### How well does JupyterHub scale? What are JupyterHub's limitations?
+
+JupyterHub works well at both a small scale (e.g., a single VM or machine) and a
+large scale (e.g., a scalable Kubernetes cluster). It can be used for teams as small as 2, and
+for user bases as large as 10,000. The scalability of JupyterHub largely depends on the
+infrastructure on which it is deployed. JupyterHub has been designed to be lightweight and
+flexible, so you can tailor your JupyterHub deployment to your needs.
+
+### Is JupyterHub resilient? What happens when a machine goes down?
+
+For JupyterHubs that are deployed in a containerized environment (e.g., Kubernetes), it is
+possible to configure the JupyterHub to be fairly resistant to failures in the system.
+For example, if JupyterHub fails, then user sessions will not be affected (though new
+users will not be able to log in). When a JupyterHub process is restarted, it should
+seamlessly connect with the user database and the system will return to normal.
+Again, the details of your JupyterHub deployment (e.g., whether it's deployed on a scalable cluster)
+will affect the resiliency of the deployment.
+
+### What interfaces does JupyterHub support?
+
+Out of the box, JupyterHub supports a variety of popular data science interfaces for user sessions,
+such as JupyterLab, Jupyter Notebooks, and RStudio. Any interface that can be served
+via a web address can be served with a JupyterHub (with the right setup).
+
+### Does JupyterHub make it easier for our team to collaborate?
+
+JupyterHub provides a standardized environment and access to shared resources for your teams.
+This greatly reduces the cost associated with sharing analyses and content with other team
+members, and makes it easier to collaborate and build off of one another's ideas. Combined with
+access to high-performance computing and data, JupyterHub provides a common resource to
+amplify your team's ability to prototype their analyses, scale them to larger data, and then
+share their results with one another.
+
+JupyterHub also provides a computational framework to share computational narratives between
+different levels of an organization. For example, data scientists can share Jupyter Notebooks
+rendered as [Voilà dashboards](https://voila.readthedocs.io/en/stable/) with those who are not
+familiar with programming, or create publicly available interactive analyses to allow others to
+interact with their work.
+
+### Can I use JupyterHub with R/RStudio or other languages and environments?
+
+Yes, Jupyter is a polyglot project, and there are over 40 community-provided kernels for a variety
+of languages (the most common being Python, Julia, and R). You can also use a JupyterHub to provide
+access to other interfaces, such as RStudio, that provide their own access to a language kernel.
--- a/docs/source/getting-started/networking-basics.md
+++ b/docs/source/getting-started/networking-basics.md
@@ -11,8 +11,8 @@ This section will help you with basic proxy and network configuration to:

 The Proxy's main IP address setting determines where JupyterHub is available to users.
 By default, JupyterHub is configured to be available on all network interfaces
-(`''`) on port 8000. *Note*: Use of `'*'` is discouraged for IP configuration;
-instead, use of `'0.0.0.0'` is preferred. 
+(`''`) on port 8000. _Note_: Use of `'*'` is discouraged for IP configuration;
+instead, use of `'0.0.0.0'` is preferred.

 Changing the Proxy's main IP address and port can be done with the following
 JupyterHub **command line options**:
@@ -74,7 +74,7 @@ The Hub service listens only on `localhost` (port 8081) by default.
 The Hub needs to be accessible from both the proxy and all Spawners.
 When spawning local servers, an IP address setting of `localhost` is fine.

-If *either* the Proxy *or* (more likely) the Spawners will be remote or
+If _either_ the Proxy _or_ (more likely) the Spawners will be remote or
 isolated in containers, the Hub must listen on an IP that is accessible.

 ```python
@@ -82,20 +82,20 @@ c.JupyterHub.hub_ip = '10.0.1.4'
 c.JupyterHub.hub_port = 54321
 ```

-**Added in 0.8:** The `c.JupyterHub.hub_connect_ip` setting is the ip address or
+**Added in 0.8:** The `c.JupyterHub.hub_connect_ip` setting is the IP address or
 hostname that other services should use to connect to the Hub. A common
 configuration for, e.g. docker, is:

 ```python
 c.JupyterHub.hub_ip = '0.0.0.0'  # listen on all interfaces
-c.JupyterHub.hub_connect_ip = '10.0.1.4'  # ip as seen on the docker network. Can also be a hostname.
+c.JupyterHub.hub_connect_ip = '10.0.1.4'  # IP as seen on the docker network. Can also be a hostname.
 ```

 ## Adjusting the hub's URL

-The hub will most commonly be running on a hostname of its own.  If it
+The hub will most commonly be running on a hostname of its own. If it
 is not – for example, if the hub is being reverse-proxied and being
 exposed at a URL such as `https://proxy.example.org/jupyter/` – then
-you will need to tell JupyterHub the base URL of the service.  In such
+you will need to tell JupyterHub the base URL of the service. In such
 a case, it is both necessary and sufficient to set
 `c.JupyterHub.base_url = '/jupyter/'` in the configuration.
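Note on the networking-basics.md hunks above: the settings `hub_connect_ip`, `hub_port`, and `base_url` together determine the address at which remote Spawners and services reach the Hub's REST API. The following is a minimal sketch of that composition. The helper `hub_api_url` is hypothetical (it is not part of JupyterHub), and it assumes plain HTTP and the `hub/api` path suffix seen in the `--url` example elsewhere in this diff.

```python
def hub_api_url(connect_ip: str, port: int, base_url: str = "/") -> str:
    """Build the URL a remote Spawner or service would use for the Hub REST API.

    Hypothetical helper for illustration only; not a JupyterHub function.
    """
    # Normalize the prefix so it starts and ends with exactly one slash.
    stripped = base_url.strip("/")
    prefix = "/" + stripped + "/" if stripped else "/"
    return f"http://{connect_ip}:{port}{prefix}hub/api"


# Values taken from the configuration examples in the hunks above.
print(hub_api_url("10.0.1.4", 54321, "/jupyter/"))
```

With the default `base_url` of `/`, the same helper yields the address form used by the standalone culler example, for instance `hub_api_url("127.0.0.1", 8081)`.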
--- a/docs/source/getting-started/security-basics.rst
+++ b/docs/source/getting-started/security-basics.rst
@@ -5,17 +5,17 @@ Security settings

   You should not run JupyterHub without SSL encryption on a public network.

-Security is the most important aspect of configuring Jupyter. Three
-configuration settings are the main aspects of security configuration:
+Security is the most important aspect of configuring Jupyter.
+Three configuration settings are the main aspects of security configuration:

 1. :ref:`SSL encryption <ssl-encryption>` (to enable HTTPS)
 2. :ref:`Cookie secret <cookie-secret>` (a key for encrypting browser cookies)
 3. Proxy :ref:`authentication token <authentication-token>` (used for the Hub and
   other services to authenticate to the Proxy)

-The Hub hashes all secrets (e.g., auth tokens) before storing them in its
+The Hub hashes all secrets (e.g. auth tokens) before storing them in its
 database. A loss of control over read-access to the database should have
-minimal impact on your deployment; if your database has been compromised, it
+minimal impact on your deployment. If your database has been compromised, it
 is still a good idea to revoke existing tokens.

 .. _ssl-encryption:
@@ -31,7 +31,7 @@ Using an SSL certificate

 This will require you to obtain an official, trusted SSL certificate or create a
 self-signed certificate. Once you have obtained and installed a key and
-certificate you need to specify their locations in the ``jupyterhub_config.py``
+certificate, you need to specify their locations in the ``jupyterhub_config.py``
 configuration file as follows:

 .. code-block:: python
@@ -72,7 +72,7 @@ would be the needed configuration:
 If SSL termination happens outside of the Hub
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-In certain cases, for example if the hub is running behind a reverse proxy, and
+In certain cases, for example, if the hub is running behind a reverse proxy, and
 `SSL termination is being provided by NGINX <https://www.nginx.com/resources/admin-guide/nginx-ssl-termination/>`_,
 it is reasonable to run the hub without SSL.

@@ -80,12 +80,55 @@ To achieve this, remove ``c.JupyterHub.ssl_key`` and ``c.JupyterHub.ssl_cert``
 from your configuration (setting them to ``None`` or an empty string does not
 have the same effect, and will result in an error).

+.. _authentication-token:
+
+Proxy authentication token
+--------------------------
+
+The Hub authenticates its requests to the Proxy using a secret token that
+the Hub and Proxy agree upon. Note that this applies to the default
+``ConfigurableHTTPProxy`` implementation. Not all proxy implementations
+use an auth token.
+
+The value of this token should be a random string (for example, generated by
+``openssl rand -hex 32``). You can store it in the configuration file or an
+environment variable.
+
+Generating and storing token in the configuration file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can set the value in the configuration file, ``jupyterhub_config.py``:
+
+.. code-block:: python
+
+    c.ConfigurableHTTPProxy.api_token = 'abc123...' # any random string
+
+Generating and storing as an environment variable
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can pass the value of the proxy authentication token to the Hub and Proxy
+using the ``CONFIGPROXY_AUTH_TOKEN`` environment variable:
+
+.. code-block:: bash
+
+    export CONFIGPROXY_AUTH_TOKEN=$(openssl rand -hex 32)
+
+This environment variable needs to be visible to the Hub and Proxy.
+
+Default if token is not set
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you do not set the Proxy authentication token, the Hub will generate a random
+key itself. This means that any time you restart the Hub, you **must also
+restart the Proxy**. If the proxy is a subprocess of the Hub, this should happen
+automatically (this is the default configuration).
+
 .. _cookie-secret:

 Cookie secret
 -------------

-The cookie secret is an encryption key, used to encrypt the browser cookies
+The cookie secret is an encryption key, used to encrypt the browser cookies,
 which are used for authentication. Three common methods are described for
 generating and configuring the cookie secret.

@@ -93,8 +136,8 @@ Generating and storing as a cookie secret file
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 The cookie secret should be 32 random bytes, encoded as hex, and is typically
-stored in a ``jupyterhub_cookie_secret`` file. An example command to generate the
-``jupyterhub_cookie_secret`` file is:
+stored in a ``jupyterhub_cookie_secret`` file. Below is an example command to generate the
+``jupyterhub_cookie_secret`` file:

 .. code-block:: bash

@@ -112,7 +155,7 @@ The location of the ``jupyterhub_cookie_secret`` file can be specified in the

 If the cookie secret file doesn't exist when the Hub starts, a new cookie
 secret is generated and stored in the file. The file must not be readable by
-``group`` or ``other`` or the server won't start. The recommended permissions
+``group`` or ``other``, otherwise the server won't start. The recommended permissions
 for the cookie secret file are ``600`` (owner-only rw).

 Generating and storing as an environment variable
@@ -133,54 +176,79 @@ the Hub starts.
 Generating and storing as a binary string
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-You can also set the cookie secret in the configuration file
-itself, ``jupyterhub_config.py``, as a binary string:
+You can also set the cookie secret, as a binary string,
+in the configuration file (``jupyterhub_config.py``) itself:

 .. code-block:: python

    c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')

+.. _cookies:

-.. important::
+Cookies used by JupyterHub authentication
+-----------------------------------------

-   If the cookie secret value changes for the Hub, all single-user notebook
-   servers must also be restarted.
+The following cookies are used by the Hub for handling user authentication.

+This section was created based on this post_ from Discourse.

-.. _authentication-token:
+.. _post: https://discourse.jupyter.org/t/how-to-force-re-login-for-users/1998/6

-Proxy authentication token
--------------------------
+jupyterhub-hub-login
+~~~~~~~~~~~~~~~~~~~~

-The Hub authenticates its requests to the Proxy using a secret token that
-the Hub and Proxy agree upon. The value of this string should be a random
-string (for example, generated by ``openssl rand -hex 32``).
+This is the login token used when visiting Hub-served pages that are
+protected by authentication, such as the main home, the spawn form, etc.
+If this cookie is set, then the user is logged in.

-Generating and storing token in the configuration file
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Resetting the Hub cookie secret effectively revokes this cookie.

-Or you can set the value in the configuration file, ``jupyterhub_config.py``:
+This cookie is restricted to the path ``/hub/``.

-.. code-block:: python
+jupyterhub-user-<username>
+~~~~~~~~~~~~~~~~~~~~~~~~~~

-    c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
+This is the cookie used for authenticating with a single-user server.
+It is set by the single-user server, after OAuth with the Hub.

-Generating and storing as an environment variable
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Effectively the same as ``jupyterhub-hub-login``, but for the
+single-user server instead of the Hub. It contains an OAuth access token,
+which is checked with the Hub to authenticate the browser.

-You can pass this value of the proxy authentication token to the Hub and Proxy
-using the ``CONFIGPROXY_AUTH_TOKEN`` environment variable:
+Each OAuth access token is associated with a session id (see ``jupyterhub-session-id`` section
+below).

-.. code-block:: bash
+To avoid hitting the Hub on every request, the authentication response is cached.
+The cache key consists of both the token and the session id, to avoid a stale cache.

-    export CONFIGPROXY_AUTH_TOKEN=$(openssl rand -hex 32)
+Resetting the Hub cookie secret effectively revokes this cookie.

-This environment variable needs to be visible to the Hub and Proxy.
+This cookie is restricted to the path ``/user/<username>``,
+to ensure that only the user’s server receives it.

-Default if token is not set
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+jupyterhub-session-id
+~~~~~~~~~~~~~~~~~~~~~

-If you don't set the Proxy authentication token, the Hub will generate a random
-key itself, which means that any time you restart the Hub you **must also
-restart the Proxy**. If the proxy is a subprocess of the Hub, this should happen
-automatically (this is the default configuration).
+This is a random string, meaningless in itself, and the only cookie
+shared by the Hub and single-user servers.
+
+Its sole purpose is to coordinate logout of the multiple OAuth cookies.
+
+This cookie's path is set to ``/`` so all endpoints can receive it, clear it, etc.
+
+jupyterhub-user-<username>-oauth-state
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A short-lived cookie, used solely to store and validate OAuth state.
+It is only set while OAuth between the single-user server and the Hub
+is processing.
+
+If you use your browser development tools, you should see this cookie
+for a very brief moment before you are logged in,
+with an expiration date shorter than ``jupyterhub-hub-login`` or
+``jupyterhub-user-<username>``.
+
+This cookie should not exist after you have successfully logged in.
+
+This cookie is restricted to the path ``/user/<username>``, so that only
+the user’s server receives it.
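Note on the security-basics.rst hunks above: the cookie secret is described as 32 random bytes, hex-encoded, stored in a file that must not be readable by ``group`` or ``other``. The following is a minimal sketch of producing such a file from Python instead of the shell. The function name `write_cookie_secret` is hypothetical and not part of JupyterHub; it assumes a POSIX file system where mode ``600`` is meaningful.

```python
import os
import secrets


def write_cookie_secret(path: str) -> str:
    """Write a hex-encoded 32-byte secret to `path` with owner-only permissions.

    Hypothetical helper for illustration; JupyterHub can also generate this file itself.
    """
    # 32 random bytes, encoded as 64 hex characters.
    secret_hex = secrets.token_hex(32)
    # Create the file with mode 600 so it is never group- or world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secret_hex)
    # Enforce the mode even if the file already existed with looser permissions.
    os.chmod(path, 0o600)
    return secret_hex
```

The returned string can be passed through `bytes.fromhex(...)` to obtain the 32-byte value, which is the form the `c.JupyterHub.cookie_secret` example in the diff expects.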
--- a/docs/source/getting-started/services-basics.md
+++ b/docs/source/getting-started/services-basics.md
@@ -2,10 +2,10 @@

 When working with JupyterHub, a **Service** is defined as a process
 that interacts with the Hub's REST API. A Service may perform a specific
-or action or task. For example, shutting down individuals' single user
+action or task. For example, shutting down individuals' single user
 notebook servers that have been idle for some time is a good example of
 a task that could be automated by a Service. Let's look at how the
-[cull_idle_servers][] script can be used as a Service.
+[jupyterhub_idle_culler][] script can be used as a Service.

 ## Real-world example to cull idle servers

@@ -15,16 +15,16 @@ document will:
 - explain some basic information about API tokens
 - clarify that API tokens can be used to authenticate to
  single-user servers as of [version 0.8.0](../changelog)
- show how the [cull_idle_servers][] script can be:
-    - used in a Hub-managed service
-    - run as a standalone script
+- show how the [jupyterhub_idle_culler][] script can be:
+  - used in a Hub-managed service
+  - run as a standalone script

-Both examples for `cull_idle_servers` will communicate tasks to the
+Both examples for `jupyterhub_idle_culler` will communicate tasks to the
 Hub via the REST API.

 ## API Token basics

-### Create an API token
+### Step 1: Generate an API token

 To run such an external service, an API token must be created and
 provided to the service.
@@ -43,12 +43,12 @@ generating an API token is available from the JupyterHub user interface:

 ![API TOKEN success page](../images/token-request-success.png)

-### Pass environment variable with token to the Hub
+### Step 2: Pass environment variable with token to the Hub

 In the case of `cull_idle_servers`, it is passed as the environment
 variable called `JUPYTERHUB_API_TOKEN`.

-### Use API tokens for services and tasks that require external access
+### Step 3: Use API tokens for services and tasks that require external access

 While API tokens are often associated with a specific user, API tokens
 can be used by services that require external access for activities
@@ -62,7 +62,7 @@ c.JupyterHub.services = [
 ]
 ```

-### Restart JupyterHub
+### Step 4: Restart JupyterHub

 Upon restarting JupyterHub, you should see a message like below in the
 logs:
@@ -78,44 +78,72 @@ single-user servers, and only cookies can be used for authentication.
 0.8 supports using JupyterHub API tokens to authenticate to single-user
 servers.

-## Configure `cull-idle` to run as a Hub-Managed Service
+## How to configure the idle culler to run as a Hub-Managed Service

-In `jupyterhub_config.py`, add the following dictionary for the
-`cull-idle` Service to the `c.JupyterHub.services` list:
+### Step 1: Install the idle culler
+
+```bash
+pip install jupyterhub-idle-culler
+```
+
+### Step 2: In `jupyterhub_config.py`, add the following dictionary for the `idle-culler` Service to the `c.JupyterHub.services` list:

 ```python
 c.JupyterHub.services = [
    {
-        'name': 'cull-idle',
-        'admin': True,
-        'command': [sys.executable, 'cull_idle_servers.py', '--timeout=3600'],
+        'name': 'idle-culler',
+        'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600'],
+    }
+]
+
+c.JupyterHub.load_roles = [
+    {
+        "name": "list-and-cull", # name the role
+        "services": [
+            "idle-culler", # assign the service to this role
+        ],
+        "scopes": [
+            # declare what permissions the service should have
+            "list:users", # list users
+            "read:users:activity", # read user last-activity
+            "admin:servers", # start/stop servers
+        ],
    }
 ]
 ```

 where:

- `'admin': True` indicates that the Service has 'admin' permissions, and
- `'command'` indicates that the Service will be launched as a
+- `command` indicates that the Service will be launched as a
  subprocess, managed by the Hub.

-## Run `cull-idle` manually as a standalone script
+```{versionchanged} 2.0
+Prior to 2.0, the idle-culler required 'admin' permissions.
+It now needs the scopes:

-Now you can run your script, i.e. `cull_idle_servers`, by providing it
+- `list:users` to access the user list endpoint
+- `read:users:activity` to read activity info
+- `admin:servers` to start/stop servers
+```
+
+## How to run the idle culler manually as a standalone script
+
+Now you can run your script by providing it
 the API token and it will authenticate through the REST API to
 interact with it.

-This will run `cull-idle` manually. `cull-idle` can be run as a standalone
+This will run the idle culler service manually. It can be run as a standalone
 script anywhere with access to the Hub, and will periodically check for idle
 servers and shut them down via the Hub's REST API. In order to shutdown the
-servers, the token given to cull-idle must have admin privileges.
+servers, the token given to the idle culler must have permission to list users
+and admin their servers.

 Generate an API token and store it in the `JUPYTERHUB_API_TOKEN` environment
-variable. Run `cull_idle_servers.py` manually.
+variable. Run `jupyterhub_idle_culler` manually.

 ```bash
    export JUPYTERHUB_API_TOKEN='token'
-    python3 cull_idle_servers.py [--timeout=900] [--url=http://127.0.0.1:8081/hub/api]
+    python -m jupyterhub_idle_culler [--timeout=900] [--url=http://127.0.0.1:8081/hub/api]
 ```

-[cull_idle_servers]: https://github.com/jupyterhub/jupyterhub/blob/master/examples/cull-idle/cull_idle_servers.py
+[jupyterhub_idle_culler]: https://github.com/jupyterhub/jupyterhub-idle-culler
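Note on the services-basics.md hunks above: the idle culler periodically checks for idle servers and shuts them down through the Hub's REST API, which is why it needs the `list:users` and `read:users:activity` scopes. The following is a rough sketch of the selection step only. It is not the `jupyterhub_idle_culler` implementation; the function `idle_users` is hypothetical, and it assumes each user record carries a `name` and an ISO 8601 `last_activity` field.

```python
from datetime import datetime, timedelta, timezone


def idle_users(users, timeout_seconds, now=None):
    """Return names of users whose last activity is older than the timeout.

    Hypothetical helper; `users` is assumed to be a list of dicts with
    `name` and `last_activity` (ISO 8601 string or None) keys.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=timeout_seconds)
    idle = []
    for user in users:
        last = user.get("last_activity")
        if last is None:
            # No recorded activity: nothing to compare against, so skip.
            continue
        # Older Python versions do not parse a trailing "Z", so normalize it.
        last_dt = datetime.fromisoformat(last.replace("Z", "+00:00"))
        if last_dt < cutoff:
            idle.append(user["name"])
    return idle
```

A real service would then ask the Hub to stop each selected user's server, which is the part that requires the `admin:servers` scope.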
--- a/docs/source/getting-started/spawners-basics.md
+++ b/docs/source/getting-started/spawners-basics.md
@@ -1,12 +1,12 @@
 # Spawners and single-user notebook servers

-Since the single-user server is an instance of `jupyter notebook`, an entire separate
-multi-process application, there are many aspect of that server can configure, and a lot of ways
-to express that configuration.
+A Spawner starts each single-user notebook server. Since the single-user server is an instance of `jupyter notebook`, an entirely separate
+multi-process application, many aspects of that server can be configured, and there are many
+ways to express that configuration.

 At the JupyterHub level, you can set some values on the Spawner. The simplest of these is
 `Spawner.notebook_dir`, which lets you set the root directory for a user's server. This root
-notebook directory is the highest level directory users will be able to access in the notebook
+notebook directory is the highest-level directory users will be able to access in the notebook
 dashboard. In this example, the root notebook directory is set to `~/notebooks`, where `~` is
 expanded to the user's home directory.

@@ -14,13 +14,13 @@ expanded to the user's home directory.
 c.Spawner.notebook_dir = '~/notebooks'
 ```

-You can also specify extra command-line arguments to the notebook server with:
+You can also specify extra command line arguments to the notebook server with:

 ```python
 c.Spawner.args = ['--debug', '--profile=PHYS131']
 ```

-This could be used to set the users default page for the single user server:
+This could be used to set the user's default page for the single-user server:

 ```python
 c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']