Merge branch 'main' into copyediting

This commit is contained in:
Min RK
2022-11-24 09:40:21 +01:00
committed by GitHub
328 changed files with 35662 additions and 7475 deletions

View File

@@ -0,0 +1,128 @@
(api-only)=
# Deploying JupyterHub in "API only mode"
As a service for deploying and managing Jupyter servers for users, JupyterHub
exposes this functionality _primarily_ via a [REST API](rest).
For convenience, JupyterHub also ships with a _basic_ web UI built using that REST API.
The basic web UI enables users to click a button to quickly start and stop their servers,
and it lets admins perform some basic user and server management tasks.
The REST API has always provided additional functionality beyond what is available in the basic web UI.
Similarly, we avoid implementing UI functionality that is also not available via the API.
With JupyterHub 2.0, the basic web UI will **always** be composed using the REST API.
In other words, no UI pages should rely on information not available via the REST API.
Previously, some admin UI functionality could only be achieved via admin pages,
such as paginated requests.
## Limited UI customization via templates
The JupyterHub UI is customizable via extensible HTML [templates](templates),
but this has some limited scope to what can be customized.
Adding some content and messages to existing pages is well supported,
but changing the page flow and what pages are available are beyond the scope of what is customizable.
## Rich UI customization with REST API based apps
Increasingly, JupyterHub is used purely as an API for managing Jupyter servers
for other Jupyter-based applications that might want to present a different user experience.
If you want a fully customized user experience,
you can now disable the Hub UI and use your own pages together with the JupyterHub REST API
to build your own web application to serve your users,
relying on the Hub only as an API for managing users and servers.
One example of such an application is [BinderHub][], which powers https://mybinder.org,
and motivates many of these changes.
BinderHub is distinct from a traditional JupyterHub deployment
because it uses temporary users created for each launch.
Instead of presenting a login page,
users are presented with a form to specify what environment they would like to launch:
![Binder launch form](../images/binderhub-form.png)
When a launch is requested:
1. an image is built, if necessary
2. a temporary user is created,
3. a server is launched for that user, and
4. when running, users are redirected to an already running server with an auth token in the URL
5. after the session is over, the user is deleted
This means that a lot of JupyterHub's UI flow doesn't make sense:
- there is no way for users to login
- the human user doesn't map onto a JupyterHub `User` in a meaningful way
- when a server isn't running, there isn't a 'restart your server' action available because the user has been deleted
- users do not have any access to any Hub functionality, so presenting pages for those features would be confusing
BinderHub is one of the motivating use cases for JupyterHub supporting being used _only_ via its API.
We'll use BinderHub here as an example of various configuration options.
[binderhub]: https://binderhub.readthedocs.io
## Disabling Hub UI
`c.JupyterHub.hub_routespec` is a configuration option to specify which URL prefix should be routed to the Hub.
The default is `/` which means that the Hub will receive all requests not already specified to be routed somewhere else.
There are three values that are most logical for `hub_routespec`:
- `/` - this is the default, and used in most deployments.
It is also the only option prior to JupyterHub 1.4.
- `/hub/` - this serves only Hub pages, both UI and API
- `/hub/api` - this serves _only the Hub API_, so all Hub UI is disabled,
aside from the OAuth confirmation page, if used.
If you choose a hub routespec other than `/`,
the main JupyterHub feature you will lose is the automatic handling of requests for `/user/:username`
when the requested server is not running.
JupyterHub's handling of this request shows this page,
telling you that the server is not running,
with a button to launch it again:
![screenshot of hub page for server not running](../images/server-not-running.png)
If you set `hub_routespec` to something other than `/`,
it is likely that you also want to register another destination for `/` to handle requests to not-running servers.
If you don't, you will see a default 404 page from the proxy:
![screenshot of CHP default 404](../images/chp-404.png)
For mybinder.org, the default "start my server" page doesn't make sense,
because when a server is gone, there is no restart action.
Instead, we provide hints about how to get back to a link to start a _new_ server:
![screenshot of mybinder.org 404](../images/binder-404.png)
To achieve this, mybinder.org registers a route for `/` that goes to a custom endpoint
that runs nginx and only serves this static HTML error page.
This is set with
```python
c.Proxy.extra_routes = {
"/": "http://custom-404-entpoint/",
}
```
You may want to use an alternate behavior, such as redirecting to a landing page,
or taking some other action based on the requested page.
If you use `c.JupyterHub.hub_routespec = "/hub/"`,
then all the Hub pages will be available,
and only this default-page-404 issue will come up.
If you use `c.JupyterHub.hub_routespec = "/hub/api/"`,
then only the Hub _API_ will be available,
and all UI will be up to you.
mybinder.org takes this last option,
because none of the Hub UI pages really make sense.
Binder users don't have any reason to know or care that JupyterHub happens
to be an implementation detail of how their environment is managed.
Seeing Hub error pages and messages in that situation is more likely to be confusing than helpful.
:::{versionadded} 1.4
`c.JupyterHub.hub_routespec` and `c.Proxy.extra_routes` are new in JupyterHub 1.4.
:::

View File

@@ -1,6 +1,6 @@
# Authenticators
The [Authenticator][] is the mechanism for authorizing users to use the
The {class}`.Authenticator` is the mechanism for authorizing users to use the
Hub and single user notebook servers.
## The default PAM Authenticator
@@ -36,7 +36,7 @@ A [generic implementation](https://github.com/jupyterhub/oauthenticator/blob/mas
## The Dummy Authenticator
When testing, it may be helpful to use the
:class:`~jupyterhub.auth.DummyAuthenticator`. This allows for any username and
{class}`jupyterhub.auth.DummyAuthenticator`. This allows for any username and
password unless if a global password has been set. Once set, any username will
still be accepted but the correct password will need to be provided.
@@ -88,7 +88,6 @@ class DictionaryAuthenticator(Authenticator):
return data['username']
```
#### Normalize usernames
Since the Authenticator and Spawner both use the same username,
@@ -111,11 +110,8 @@ normalize usernames using PAM (basically round-tripping them: username
to uid to username), which is useful in case you use some external
service that allows multiple usernames mapping to the same user (such
as ActiveDirectory, yes, this really happens). When
`pam_normalize_username` is on, usernames are *not* normalized to
`pam_normalize_username` is on, usernames are _not_ normalized to
lowercase.
NOTE: Earlier it says that usernames are normalized using PAM.
I guess that doesn't normalize them?
#### Validate usernames
@@ -133,7 +129,6 @@ To only allow usernames that start with 'w':
c.Authenticator.username_pattern = r'w.*'
```
### How to write a custom authenticator
You can use custom Authenticator subclasses to enable authentication
@@ -141,12 +136,11 @@ via other mechanisms. One such example is using [GitHub OAuth][].
Because the username is passed from the Authenticator to the Spawner,
a custom Authenticator and Spawner are often used together.
For example, the Authenticator methods, [pre_spawn_start(user, spawner)][]
and [post_spawn_stop(user, spawner)][], are hooks that can be used to do
For example, the Authenticator methods, {meth}`.Authenticator.pre_spawn_start`
and {meth}`.Authenticator.post_spawn_stop`, are hooks that can be used to do
auth-related startup (e.g. opening PAM sessions) and cleanup
(e.g. closing PAM sessions).
See a list of custom Authenticators [on the wiki](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
If you are interested in writing a custom authenticator, you can read
@@ -187,7 +181,6 @@ Additionally, configurable attributes for your authenticator will
appear in jupyterhub help output and auto-generated configuration files
via `jupyterhub --generate-config`.
### Authentication state
JupyterHub 0.8 adds the ability to persist state related to authentication,
@@ -221,25 +214,22 @@ To store auth_state, two conditions must be met:
export JUPYTERHUB_CRYPT_KEY=$(openssl rand -hex 32)
```
JupyterHub uses [Fernet](https://cryptography.io/en/latest/fernet/) to encrypt auth_state.
To facilitate key-rotation, `JUPYTERHUB_CRYPT_KEY` may be a semicolon-separated list of encryption keys.
If there are multiple keys present, the **first** key is always used to persist any new auth_state.
#### Using auth_state
Typically, if `auth_state` is persisted it is desirable to affect the Spawner environment in some way.
This may mean defining environment variables, placing certificate in the user's home directory, etc.
The `Authenticator.pre_spawn_start` method can be used to pass information from authenticator state
The {meth}`Authenticator.pre_spawn_start` method can be used to pass information from authenticator state
to Spawner environment:
```python
class MyAuthenticator(Authenticator):
@gen.coroutine
def authenticate(self, handler, data=None):
username = yield identify_user(handler, data)
upstream_token = yield token_for_user(username)
async def authenticate(self, handler, data=None):
username = await identify_user(handler, data)
upstream_token = await token_for_user(username)
return {
'name': username,
'auth_state': {
@@ -247,21 +237,69 @@ class MyAuthenticator(Authenticator):
},
}
@gen.coroutine
def pre_spawn_start(self, user, spawner):
async def pre_spawn_start(self, user, spawner):
"""Pass upstream_token to spawner via environment variable"""
auth_state = yield user.get_auth_state()
auth_state = await user.get_auth_state()
if not auth_state:
# auth_state not enabled
return
spawner.environment['UPSTREAM_TOKEN'] = auth_state['upstream_token']
```
Note that environment variable names and values are always strings, so passing multiple values means setting multiple environment variables or serializing more complex data into a single variable, e.g. as a JSON string.
auth state can also be used to configure the spawner via _config_ without subclassing
by setting `c.Spawner.auth_state_hook`. This function will be called with `(spawner, auth_state)`,
only when auth_state is defined.
For example:
(for KubeSpawner)
```python
def auth_state_hook(spawner, auth_state):
spawner.volumes = auth_state['user_volumes']
spawner.mounts = auth_state['user_mounts']
c.Spawner.auth_state_hook = auth_state_hook
```
(authenticator-groups)=
## Authenticator-managed group membership
:::{versionadded} 2.2
:::
Some identity providers may have their own concept of group membership that you would like to preserve in JupyterHub.
This is now possible with `Authenticator.managed_groups`.
You can set the config:
```python
c.Authenticator.manage_groups = True
```
to enable this behavior.
The default is False for Authenticators that ship with JupyterHub,
but may be True for custom Authenticators.
Check your Authenticator's documentation for manage_groups support.
If True, {meth}`.Authenticator.authenticate` and {meth}`.Authenticator.refresh_user` may include a field `groups`
which is a list of group names the user should be a member of:
- Membership will be added for any group in the list
- Membership in any groups not in the list will be revoked
- Any groups not already present in the database will be created
- If `None` is returned, no changes are made to the user's group membership
If authenticator-managed groups are enabled,
all group-management via the API is disabled.
## pre_spawn_start and post_spawn_stop hooks
Authenticators uses two hooks, [pre_spawn_start(user, spawner)][] and
[post_spawn_stop(user, spawner)][] to pass additional state information
between the authenticator and a spawner. These hooks are typically used for auth-related
Authenticators use two hooks, {meth}`.Authenticator.pre_spawn_start` and
{meth}`.Authenticator.post_spawn_stop(user, spawner)` to add pass additional state information
between the authenticator and a spawner. These hooks are typically used auth-related
startup, i.e. opening a PAM session, and auth-related cleanup, i.e. closing a
PAM session.
@@ -269,11 +307,7 @@ PAM session.
Beginning with version 0.8, JupyterHub is an OAuth provider.
[Authenticator]: https://github.com/jupyterhub/jupyterhub/blob/master/jupyterhub/auth.py
[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
[OAuth]: https://en.wikipedia.org/wiki/OAuth
[GitHub OAuth]: https://developer.github.com/v3/oauth/
[OAuthenticator]: https://github.com/jupyterhub/oauthenticator
[pre_spawn_start(user, spawner)]: https://jupyterhub.readthedocs.io/en/latest/api/auth.html#jupyterhub.auth.Authenticator.pre_spawn_start
[post_spawn_stop(user, spawner)]: https://jupyterhub.readthedocs.io/en/latest/api/auth.html#jupyterhub.auth.Authenticator.post_spawn_stop
[pam]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
[oauth]: https://en.wikipedia.org/wiki/OAuth
[github oauth]: https://developer.github.com/v3/oauth/
[oauthenticator]: https://github.com/jupyterhub/oauthenticator

View File

@@ -3,18 +3,17 @@
In this example, we show a configuration file for a fairly standard JupyterHub
deployment with the following assumptions:
* Running JupyterHub on a single cloud server
* Using SSL on the standard HTTPS port 443
* Using GitHub OAuth (using oauthenticator) for login
* Using the default spawner (to configure other spawners, uncomment and edit
- Running JupyterHub on a single cloud server
- Using SSL on the standard HTTPS port 443
- Using GitHub OAuth (using [OAuthenticator](https://oauthenticator.readthedocs.io/en/latest)) for login
- Using the default spawner (to configure other spawners, uncomment and edit
`spawner_class` as well as follow the instructions for your desired spawner)
* Users exist locally on the server
* Users' notebooks to be served from `~/assignments` to allow users to browse
- Users exist locally on the server
- Users' notebooks to be served from `~/assignments` to allow users to browse
for notebooks within other users' home directories
* You want the landing page for each user to be a `Welcome.ipynb` notebook in
their assignments directory.
* All runtime files are put into `/srv/jupyterhub` and log files in `/var/log`.
- You want the landing page for each user to be a `Welcome.ipynb` notebook in
their assignments directory
- All runtime files are put into `/srv/jupyterhub` and log files in `/var/log`
The `jupyterhub_config.py` file would have these settings:
@@ -52,7 +51,7 @@ c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
c.LocalAuthenticator.create_system_users = True
# specify users and admin
c.Authenticator.whitelist = {'rgbkrk', 'minrk', 'jhamrick'}
c.Authenticator.allowed_users = {'rgbkrk', 'minrk', 'jhamrick'}
c.Authenticator.admin_users = {'jhamrick', 'rgbkrk'}
# uses the default spawner
@@ -70,7 +69,7 @@ c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
```
Using the GitHub Authenticator requires a few additional
environment variable to be set prior to launching JupyterHub:
environment variables to be set prior to launching JupyterHub:
```bash
export GITHUB_CLIENT_ID=github_id
@@ -80,3 +79,5 @@ export CONFIGPROXY_AUTH_TOKEN=super-secret
# append log output to log file /var/log/jupyterhub.log
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py &>> /var/log/jupyterhub.log
```
Visit the [Github OAuthenticator reference](https://oauthenticator.readthedocs.io/en/latest/api/gen/oauthenticator.github.html) to see the full list of options for configuring Github OAuth with JupyterHub.

View File

@@ -6,15 +6,15 @@ SSL port `443`. This could be useful if the JupyterHub server machine is also
hosting other domains or content on `443`. The goal in this example is to
satisfy the following:
* JupyterHub is running on a server, accessed *only* via `HUB.DOMAIN.TLD:443`
* On the same machine, `NO_HUB.DOMAIN.TLD` strictly serves different content,
- JupyterHub is running on a server, accessed _only_ via `HUB.DOMAIN.TLD:443`
- On the same machine, `NO_HUB.DOMAIN.TLD` strictly serves different content,
also on port `443`
* `nginx` or `apache` is used as the public access point (which means that
only nginx/apache will bind to `443`)
* After testing, the server in question should be able to score at least an A on the
- `nginx` or `apache` is used as the public access point (which means that
only nginx/apache will bind to `443`)
- After testing, the server in question should be able to score at least an A on the
Qualys SSL Labs [SSL Server Test](https://www.ssllabs.com/ssltest/)
Let's start out with needed JupyterHub configuration in `jupyterhub_config.py`:
Let's start out with the needed JupyterHub configuration in `jupyterhub_config.py`:
```python
# Force the proxy to only listen to connections to 127.0.0.1 (on port 8000)
@@ -30,15 +30,15 @@ This can take a few minutes:
openssl dhparam -out /etc/ssl/certs/dhparam.pem 4096
```
## nginx
## Nginx
This **`nginx` config file** is fairly standard fare except for the two
`location` blocks within the main section for HUB.DOMAIN.tld.
To create a new site for jupyterhub in your nginx config, make a new file
To create a new site for jupyterhub in your Nginx config, make a new file
in `sites.enabled`, e.g. `/etc/nginx/sites.enabled/jupyterhub.conf`:
```bash
# top-level http config for websocket headers
# Top-level HTTP config for WebSocket headers
# If Upgrade is defined, Connection = upgrade
# If Upgrade is empty, Connection = close
map $http_upgrade $connection_upgrade {
@@ -51,7 +51,7 @@ server {
listen 80;
server_name HUB.DOMAIN.TLD;
# Tell all requests to port 80 to be 302 redirected to HTTPS
# Redirect the request to HTTPS
return 302 https://$host$request_uri;
}
@@ -75,7 +75,7 @@ server {
ssl_stapling_verify on;
add_header Strict-Transport-Security max-age=15768000;
# Managing literal requests to the JupyterHub front end
# Managing literal requests to the JupyterHub frontend
location / {
proxy_pass http://127.0.0.1:8000;
proxy_set_header X-Real-IP $remote_addr;
@@ -83,8 +83,12 @@ server {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# websocket headers
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_set_header X-Scheme $scheme;
proxy_buffering off;
}
# Managing requests to verify letsencrypt host
@@ -97,10 +101,10 @@ server {
If `nginx` is not running on port 443, substitute `$http_host` for `$host` on
the lines setting the `Host` header.
`nginx` will now be the front facing element of JupyterHub on `443` which means
`nginx` will now be the front-facing element of JupyterHub on `443` which means
it is also free to bind other servers, like `NO_HUB.DOMAIN.TLD` to the same port
on the same machine and network interface. In fact, one can simply use the same
server blocks as above for `NO_HUB` and simply add line for the root directory
server blocks as above for `NO_HUB` and simply add a line for the root directory
of the site as well as the applicable location call:
```bash
@@ -108,7 +112,7 @@ server {
listen 80;
server_name NO_HUB.DOMAIN.TLD;
# Tell all requests to port 80 to be 302 redirected to HTTPS
# Redirect the request to HTTPS
return 302 https://$host$request_uri;
}
@@ -139,25 +143,40 @@ Now restart `nginx`, restart the JupyterHub, and enjoy accessing
`https://HUB.DOMAIN.TLD` while serving other content securely on
`https://NO_HUB.DOMAIN.TLD`.
### SELinux permissions for Nginx
On distributions with SELinux enabled (e.g. Fedora), one may encounter permission errors
when the Nginx service is started.
We need to allow Nginx to perform network relay and connect to the JupyterHub port. The
following commands do that:
```bash
semanage port -a -t http_port_t -p tcp 8000
setsebool -P httpd_can_network_relay 1
setsebool -P httpd_can_network_connect 1
```
Replace 8000 with the port the JupyterHub server is running from.
## Apache
As with nginx above, you can use [Apache](https://httpd.apache.org) as the reverse proxy.
First, we will need to enable the apache modules that we are going to need:
As with Nginx above, you can use [Apache](https://httpd.apache.org) as the reverse proxy.
First, we will need to enable the Apache modules that we are going to need:
```bash
a2enmod ssl rewrite proxy proxy_http proxy_wstunnel
a2enmod ssl rewrite proxy headers proxy_http proxy_wstunnel
```
Our Apache configuration is equivalent to the nginx configuration above:
Our Apache configuration is equivalent to the Nginx configuration above:
- Redirect HTTP to HTTPS
- Good SSL Configuration
- Support for websockets on any proxied URL
- Support for WebSocket on any proxied URL
- JupyterHub is running locally at http://127.0.0.1:8000
```bash
# redirect HTTP to HTTPS
# Redirect HTTP to HTTPS
Listen 80
<VirtualHost HUB.DOMAIN.TLD:80>
ServerName HUB.DOMAIN.TLD
@@ -169,15 +188,26 @@ Listen 443
ServerName HUB.DOMAIN.TLD
# configure SSL
# Enable HTTP/2, if available
Protocols h2 http/1.1
# HTTP Strict Transport Security (mod_headers is required) (63072000 seconds)
Header always set Strict-Transport-Security "max-age=63072000"
# Configure SSL
SSLEngine on
SSLCertificateFile /etc/letsencrypt/live/HUB.DOMAIN.TLD/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/HUB.DOMAIN.TLD/privkey.pem
SSLProtocol All -SSLv2 -SSLv3
SSLOpenSSLConfCmd DHParameters /etc/ssl/certs/dhparam.pem
SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
# Use RewriteEngine to handle websocket connection upgrades
# Intermediate configuration from SSL-config.mozilla.org (2022-03-03)
# Please note, that this configuration might be outdated - please update it accordingly using https://ssl-config.mozilla.org/
SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1
SSLCipherSuite ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
SSLHonorCipherOrder off
SSLSessionTickets off
# Use RewriteEngine to handle WebSocket connection upgrades
RewriteEngine On
RewriteCond %{HTTP:Connection} Upgrade [NC]
RewriteCond %{HTTP:Upgrade} websocket [NC]
@@ -189,26 +219,29 @@ Listen 443
# proxy to JupyterHub
ProxyPass http://127.0.0.1:8000/
ProxyPassReverse http://127.0.0.1:8000/
RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
</Location>
</VirtualHost>
```
In case of the need to run the jupyterhub under /jhub/ or other location please use the below configurations:
In case of the need to run JupyterHub under /jhub/ or another location please use the below configurations:
- JupyterHub running locally at http://127.0.0.1:8000/jhub/ or other location
httpd.conf amendments:
```bash
RewriteRule /jhub/(.*) ws://127.0.0.1:8000/jhub/$1 [P,L]
RewriteRule /jhub/(.*) http://127.0.0.1:8000/jhub/$1 [P,L]
ProxyPass /jhub/ http://127.0.0.1:8000/jhub/
ProxyPassReverse /jhub/ http://127.0.0.1:8000/jhub/
```
```
jupyterhub_config.py amendments:
```python
# The public facing URL of the whole JupyterHub application.
# This is the address on which the proxy will bind. Sets protocol, ip, base_url
# This is the address on which the proxy will bind. Sets protocol, IP, base_url
c.JupyterHub.bind_url = 'http://127.0.0.1:8000/jhub/'
```

View File

@@ -0,0 +1,30 @@
==============================
Configuration Reference
==============================
.. important::
Make sure the version of JupyterHub for this documentation matches your
installation version, as the output of this command may change between versions.
JupyterHub configuration
------------------------
As explained in the `Configuration Basics <../getting-started/config-basics.html#generate-a-default-config-file>`_
section, the ``jupyterhub_config.py`` can be automatically generated via
.. code-block:: bash
jupyterhub --generate-config
The following contains the output of that command for reference.
.. jupyterhub-generate-config::
JupyterHub help command output
------------------------------
This section contains the output of the command ``jupyterhub --help-all``.
.. jupyterhub-help-all::

View File

@@ -6,10 +6,10 @@ Only do this if you are very sure you must.
## Overview
There are many Authenticators and Spawners available for JupyterHub. Some, such
as DockerSpawner or OAuthenticator, do not need any elevated permissions. This
There are many [Authenticators](../getting-started/authenticators-users-basics) and [Spawners](../getting-started/spawners-basics) available for JupyterHub. Some, such
as [DockerSpawner](https://github.com/jupyterhub/dockerspawner) or [OAuthenticator](https://github.com/jupyterhub/oauthenticator), do not need any elevated permissions. This
document describes how to get the full default behavior of JupyterHub while
running notebook servers as real system users on a shared system without
running notebook servers as real system users on a shared system, without
running the Hub itself as root.
Since JupyterHub needs to spawn processes as other users, the simplest way
@@ -50,14 +50,13 @@ To do this we add to `/etc/sudoers` (use `visudo` for safe editing of sudoers):
- specify the list of users `JUPYTER_USERS` for whom `rhea` can spawn servers
- set the command `JUPYTER_CMD` that `rhea` can execute on behalf of users
- give `rhea` permission to run `JUPYTER_CMD` on behalf of `JUPYTER_USERS`
- give `rhea` permission to run `JUPYTER_CMD` on behalf of `JUPYTER_USERS`
without entering a password
For example:
```bash
# comma-separated whitelist of users that can spawn single-user servers
# comma-separated list of users that can spawn single-user servers
# this should include all of your Hub users
Runas_Alias JUPYTER_USERS = rhea, zoe, wash
@@ -92,16 +91,16 @@ $ adduser -G jupyterhub newuser
Test that the new user doesn't need to enter a password to run the sudospawner
command.
This should prompt for your password to switch to rhea, but *not* prompt for
This should prompt for your password to switch to `rhea`, but _not_ prompt for
any password for the second switch. It should show some help output about
logging options:
```bash
$ sudo -u rhea sudo -n -u $USER /usr/local/bin/sudospawner --help
Usage: /usr/local/bin/sudospawner [OPTIONS]
Options:
--help show this help information
...
```
@@ -121,6 +120,11 @@ the shadow password database.
### Shadow group (Linux)
**Note:** On [Fedora based distributions](https://fedoraproject.org/wiki/List_of_Fedora_remixes) there is no clear way to configure
the PAM database to allow sufficient access for authenticating with the target user's password
from JupyterHub. As a workaround we recommend use an
[alternative authentication method](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
```bash
$ ls -l /etc/shadow
-rw-r----- 1 root shadow 2197 Jul 21 13:41 shadow
@@ -147,12 +151,13 @@ We want our new user to be able to read the shadow passwords, so add it to the s
$ sudo usermod -a -G shadow rhea
```
If you want jupyterhub to serve pages on a restricted port (such as port 80 for http),
If you want jupyterhub to serve pages on a restricted port (such as port 80 for HTTP),
then you will need to give `node` permission to do so:
```bash
sudo setcap 'cap_net_bind_service=+ep' /usr/bin/node
```
However, you may want to further understand the consequences of this.
([Further reading](http://man7.org/linux/man-pages/man7/capabilities.7.html))
@@ -162,7 +167,6 @@ distributions' packaging system. This can be used to keep any user's process
from using too much CPU cycles. You can configure it accoring to [these
instructions](http://ubuntuforums.org/showthread.php?t=992706).
### Shadow group (FreeBSD)
**NOTE:** This has not been tested on FreeBSD and may not work as expected on
@@ -184,7 +188,7 @@ $ sudo chgrp shadow /etc/master.passwd
$ sudo chmod g+r /etc/master.passwd
```
We want our new user to be able to read the shadow passwords, so add it to the
We want our new user to be able to read the shadow passwords, so add it to the
shadow group:
```bash
@@ -218,15 +222,15 @@ Finally, start the server as our newly configured user, `rhea`:
```bash
$ cd /etc/jupyterhub
$ sudo -u rhea jupyterhub --JupyterHub.spawner_class=sudospawner.SudoSpawner
```
```
And try logging in.
## Troubleshooting: SELinux
If you still get a generic `Permission denied` `PermissionError`, it's possible SELinux is blocking you.
Here's how you can make a module to allow this.
First, put this in a file named `sudo_exec_selinux.te`:
Here's how you can make a module to resolve this.
First, put this in a file named `sudo_exec_selinux.te`:
```bash
module sudo_exec_selinux 1.1;

View File

@@ -1,8 +1,8 @@
# Configuring user environments
Deploying JupyterHub means you are providing Jupyter notebook environments for
To deploy JupyterHub means you are providing Jupyter notebook environments for
multiple users. Often, this includes a desire to configure the user
environment in some way.
environment in a custom way.
Since the `jupyterhub-singleuser` server extends the standard Jupyter notebook
server, most configuration and documentation that applies to Jupyter Notebook
@@ -10,71 +10,56 @@ applies to the single-user environments. Configuration of user environments
typically does not occur through JupyterHub itself, but rather through system-wide
configuration of Jupyter, which is inherited by `jupyterhub-singleuser`.
**Tip:** When searching for configuration tips for JupyterHub user
environments, try removing JupyterHub from your search because there are a lot
more people out there configuring Jupyter than JupyterHub and the
configuration is the same.
**Tip:** When searching for configuration tips for JupyterHub user environments, you might want to remove JupyterHub from your search because there are a lot more people out there configuring Jupyter than JupyterHub and the configuration is the same.
This section will focus on user environments, including:
- Installing packages
- Configuring Jupyter and IPython
- Installing kernelspecs
- Using containers vs. multi-user hosts
This section will focus on user environments, which includes the following:
- [Installing packages](#installing-packages)
- [Configuring Jupyter and IPython](#configuring-jupyter-and-ipython)
- [Installing kernelspecs](#installing-kernelspecs)
- [Using containers vs. multi-user hosts](#multi-user-hosts-vs-containers)
## Installing packages
To make packages available to users, you generally will install packages
system-wide or in a shared environment.
To make packages available to users, you will typically install packages system-wide or in a shared environment.
This installation location should always be in the same environment that
`jupyterhub-singleuser` itself is installed in, and must be *readable and
executable* by your users. If you want users to be able to install additional
packages, it must also be *writable* by your users.
If you are using a standard system Python install, you would use:
This installation location should always be in the same environment where
`jupyterhub-singleuser` itself is installed in, and must be _readable and
executable_ by your users. If you want your users to be able to install additional
packages, the installation location must also be _writable_ by your users.
If you are using a standard Python installation on your system, use the following command:
```bash
sudo python3 -m pip install numpy
```
to install the numpy package in the default system Python 3 environment
to install the numpy package in the default Python 3 environment on your system
(typically `/usr/local`).
TODO: Get a link from the conda team for a description of what "appropriate permissions for users" is
You may also use conda to install packages. If you do, you should make sure
that the conda environment has appropriate permissions for users to be able to
run Python code in the env. The env must be *readable and executable* by all
users. Additionally it must be *writeable* if you want users to install
additional packages.
## Configuring Jupyter and IPython
[Jupyter](https://jupyter-notebook.readthedocs.io/en/stable/config_overview.html)
and [IPython](https://ipython.readthedocs.io/en/stable/development/config.html)
have their own configuration systems.
As a JupyterHub administrator, you will typically want to install and configure
environments for all JupyterHub users. For example, you wish for each student in
a class to have the same user environment configuration.
Jupyter and IPython support **"system-wide"** locations for configuration, which
is the logical place to put global configuration that you want to affect all
users. It's generally more efficient to configure user environments "system-wide",
and it's a good idea to avoid creating files in users' home directories.
As a JupyterHub administrator, you will typically want to install and configure environments for all JupyterHub users. For example, let's say you wish for each student in a class to have the same user environment configuration.
Jupyter and IPython support **"system-wide"** locations for configuration, which is the logical place to put global configuration that you want to affect all users. It's generally more efficient to configure user environments "system-wide", and it's a good practice to avoid creating files in the users' home directories.
The typical locations for these config files are:
- **system-wide** in `/etc/{jupyter|ipython}`
- **env-wide** (environment wide) in `{sys.prefix}/etc/{jupyter|ipython}`.
### Example: Enable an extension system-wide
For example, to enable the `cython` IPython extension for all of your users,
create the file `/etc/ipython/ipython_config.py`:
For example, to enable the `cython` IPython extension for all of your users, create the file `/etc/ipython/ipython_config.py`:
```python
c.InteractiveShellApp.extensions.append("cython")
@@ -82,32 +67,39 @@ c.InteractiveShellApp.extensions.append("cython")
### Example: Enable a Jupyter notebook configuration setting for all users
To enable Jupyter notebook's internal idle-shutdown behavior (requires
notebook ≥ 5.4), set the following in the `/etc/jupyter/jupyter_notebook_config.py`
file:
:::{note}
These examples configure the Jupyter ServerApp, which is used by JupyterLab, the default in JupyterHub 2.0.
If you are using the classing Jupyter Notebook server,
the same things should work,
with the following substitutions:
- Search for `jupyter_server_config`, and replace with `jupyter_notebook_config`
- Search for `NotebookApp`, and replace with `ServerApp`
:::
To enable Jupyter notebook's internal idle-shutdown behavior (requires notebook ≥ 5.4), set the following in the `/etc/jupyter/jupyter_server_config.py` file:
```python
# shutdown the server after no activity for an hour
c.NotebookApp.shutdown_no_activity_timeout = 60 * 60
c.ServerApp.shutdown_no_activity_timeout = 60 * 60
# shutdown kernels after no activity for 20 minutes
c.MappingKernelManager.cull_idle_timeout = 20 * 60
# check for idle kernels every two minutes
c.MappingKernelManager.cull_interval = 2 * 60
```
## Installing kernelspecs
You may have multiple Jupyter kernels installed and want to make sure that
they are available to all of your users. This means installing kernelspecs
either system-wide (e.g. in /usr/local/) or in the `sys.prefix` of JupyterHub
You may have multiple Jupyter kernels installed and want to make sure that they are available to all of your users. This means installing kernelspecs either system-wide (e.g. in /usr/local/) or in the `sys.prefix` of JupyterHub
itself.
Jupyter kernelspec installation is system wide by default, but some kernels
Jupyter kernelspec installation is system-wide by default, but some kernels
may default to installing kernelspecs in your home directory. These will need
to be moved system-wide to ensure that they are accessible.
You can see where your kernelspecs are with:
To see where your kernelspecs are, you can use the following command:
```bash
jupyter kernelspec list
@@ -115,15 +107,13 @@ jupyter kernelspec list
### Example: Installing kernels system-wide
Assuming I have a Python 2 and Python 3 environment that I want to make
sure are available, I can install their specs system-wide (in /usr/local) with:
Let's assume that I have a Python 2 and Python 3 environment that I want to make sure are available, I can install their specs **system-wide** (in /usr/local) using the following command:
```bash
/path/to/python3 -m IPython kernel install --prefix=/usr/local
/path/to/python2 -m IPython kernel install --prefix=/usr/local
/path/to/python3 -m ipykernel install --prefix=/usr/local
/path/to/python2 -m ipykernel install --prefix=/usr/local
```
## Multi-user hosts vs. Containers
There are two broad categories of user environments that depend on what
@@ -136,31 +126,25 @@ How you configure user environments for each category can differ a bit
depending on what Spawner you are using.
The first category is a **shared system (multi-user host)** where
each user has a JupyterHub account and a home directory as well as being
each user has a JupyterHub account, a home directory as well as being
a real system user. In this example, shared configuration and installation
must be in a 'system-wide' location, such as `/etc/` or `/usr/local`
must be in a 'system-wide' location, such as `/etc/`, or `/usr/local`
or a custom prefix such as `/opt/conda`.
When JupyterHub uses **container-based** Spawners (e.g. KubeSpawner or
DockerSpawner), the 'system-wide' environment is really the container image
which you are using for users.
DockerSpawner), the 'system-wide' environment is really the container image used for users.
In both cases, you want to *avoid putting configuration in user home
directories* because users can change those configuration settings. Also,
home directories typically persist once they are created, so they are
difficult for admins to update later.
In both cases, you want to _avoid putting configuration in user home
directories_ because users can change those configuration settings. Also, home directories typically persist once they are created, thereby making it difficult for admins to update later.
## Named servers
By default, in a JupyterHub deployment each user has exactly one server.
By default, in a JupyterHub deployment, each user has one server only.
JupyterHub can, however, have multiple servers per user.
This is most useful in deployments where users can configure the environment
in which their server will start (e.g. resource requests on an HPC cluster),
so that a given user can have multiple configurations running at the same time,
without having to stop and restart their one server.
This is mostly useful in deployments where users can configure the environment in which their server will start (e.g. resource requests on an HPC cluster), so that a given user can have multiple configurations running at the same time, without having to stop and restart their own server.
To allow named servers:
To allow named servers, include this code snippet in your config file:
```python
c.JupyterHub.allow_named_servers = True
@@ -176,10 +160,66 @@ as well as the admin page:
![named servers on the admin page](../images/named-servers-admin.png)
Named servers can be accessed, created, started, stopped, and deleted
from these pages. Activity tracking is now per-server as well.
from these pages. Activity tracking is now per server as well.
The number of named servers per user can be limited by setting
To limit the number of **named server** per user by setting a constant value, include this code snippet in your config file:
```python
c.JupyterHub.named_server_limit_per_user = 5
```
Alternatively, to use a callable/awaitable based on the handler object, include this code snippet in your config file:
```python
def named_server_limit_per_user_fn(handler):
user = handler.current_user
if user and user.admin:
return 0
return 5
c.JupyterHub.named_server_limit_per_user = named_server_limit_per_user_fn
```
This can be useful for quota service implementations. The example above limits the number of named servers for non-admin users only.
If `named_server_limit_per_user` is set to `0`, no limit is enforced.
(classic-notebook-ui)=
## Switching back to the classic notebook
By default, the single-user server launches JupyterLab,
which is based on [Jupyter Server][].
This is the default server when running JupyterHub ≥ 2.0.
To switch to using the legacy Jupyter Notebook server, you can set the `JUPYTERHUB_SINGLEUSER_APP` environment variable
(in the single-user environment) to:
```bash
export JUPYTERHUB_SINGLEUSER_APP='notebook.notebookapp.NotebookApp'
```
[jupyter server]: https://jupyter-server.readthedocs.io
[jupyter notebook]: https://jupyter-notebook.readthedocs.io
:::{versionchanged} 2.0
JupyterLab is now the default single-user UI, if available,
which is based on the [Jupyter Server][],
no longer the legacy [Jupyter Notebook][] server.
JupyterHub prior to 2.0 launched the legacy notebook server (`jupyter notebook`),
and the Jupyter server could be selected by specifying the following:
```python
# jupyterhub_config.py
c.Spawner.cmd = ["jupyter-labhub"]
```
Alternatively, for an otherwise customized Jupyter Server app,
set the environment variable using the following command:
```bash
export JUPYTERHUB_SINGLEUSER_APP='jupyter_server.serverapp.ServerApp'
```
:::

View File

@@ -46,8 +46,8 @@ additional configuration required for MySQL that is not needed for PostgreSQL.
- You should use the `pymysql` sqlalchemy provider (the other one, MySQLdb,
isn't available for py3).
- You also need to set `pool_recycle` to some value (typically 60 - 300)
which depends on your MySQL setup. This is necessary since MySQL kills
- You also need to set `pool_recycle` to some value (typically 60 - 300)
which depends on your MySQL setup. This is necessary since MySQL kills
connections serverside if they've been idle for a while, and the connection
from the hub will be idle for longer than most connections. This behavior
will lead to frustrating 'the connection has gone away' errors from

View File

@@ -1,6 +1,9 @@
Technical Reference
===================
This section covers more of the details of the JupyterHub architecture, as well as
what happens under-the-hood when you deploy and configure your JupyterHub.
.. toctree::
:maxdepth: 2
@@ -13,10 +16,17 @@ Technical Reference
proxy
separate-proxy
rest
rest-api
server-api
monitoring
database
templates
api-only
../events/index
config-user-env
config-examples
config-ghoauth
config-proxy
config-sudo
config-reference
oauth

View File

@@ -0,0 +1,20 @@
Monitoring
==========
This section covers details on monitoring the state of your JupyterHub installation.
JupyterHub expose the ``/metrics`` endpoint that returns text describing its current
operational state formatted in a way `Prometheus <https://prometheus.io/docs/introduction/overview/>`_ understands.
Prometheus is a separate open source tool that can be configured to repeatedly poll
JupyterHub's ``/metrics`` endpoint to parse and save its current state.
By doing so, Prometheus can describe JupyterHub's evolving state over time.
This evolving state can then be accessed through Prometheus that expose its underlying
storage to those allowed to access it, and be presented with dashboards by a
tool like `Grafana <https://grafana.com/docs/grafana/latest/getting-started/what-is-grafana/>`_.
.. toctree::
:maxdepth: 2
metrics

View File

@@ -0,0 +1,373 @@
# JupyterHub and OAuth
JupyterHub uses [OAuth 2](https://oauth.net/2/) as an internal mechanism for authenticating users.
As such, JupyterHub itself always functions as an OAuth **provider**.
You can find out more about what that means [below](oauth-terms).
Additionally, JupyterHub is _often_ deployed with [OAuthenticator](https://oauthenticator.readthedocs.io),
where an external identity provider, such as GitHub or KeyCloak, is used to authenticate users.
When this is the case, there are _two_ nested OAuth flows:
an _internal_ OAuth flow where JupyterHub is the **provider**,
and an _external_ OAuth flow, where JupyterHub is the **client**.
This means that when you are using JupyterHub, there is always _at least one_ and often two layers of OAuth involved in a user logging in and accessing their server.
The following points are noteworthy:
- Single-user servers _never_ need to communicate with or be aware of the upstream provider configured in your Authenticator.
As far as the servers are concerned, only JupyterHub is an OAuth provider,
and how users authenticate with the Hub itself is irrelevant.
- When interacting with a single-user server,
there are ~always two tokens:
first, a token issued to the server itself to communicate with the Hub API,
and second, a per-user token in the browser to represent the completed login process and authorized permissions.
More on this [later](two-tokens).
(oauth-terms)=
## Key OAuth terms
Here are some key definitions to keep in mind when we are talking about OAuth.
You can also read more in detail [here](https://www.oauth.com/oauth2-servers/definitions/).
- **provider**: The entity responsible for managing identity and authorization;
always a web server.
JupyterHub is _always_ an OAuth provider for JupyterHub's components.
When OAuthenticator is used, an external service, such as GitHub or KeyCloak, is also an OAuth provider.
- **client**: An entity that requests OAuth **tokens** on a user's behalf;
generally a web server of some kind.
OAuth **clients** are services that _delegate_ authentication and/or authorization
to an OAuth **provider**.
JupyterHub _services_ or single-user _servers_ are OAuth **clients** of the JupyterHub **provider**.
When OAuthenticator is used, JupyterHub is itself _also_ an OAuth **client** for the external OAuth **provider**, e.g. GitHub.
- **browser**: A user's web browser, which makes requests and stores things like cookies.
- **token**: The secret value used to represent a user's authorization. This is the final product of the OAuth process.
- **code**: A short-lived temporary secret that the **client** exchanges
for a **token** at the conclusion of OAuth,
in what's generally called the "OAuth callback handler."
## One oauth flow
OAuth **flow** is what we call the sequence of HTTP requests involved in authenticating a user and issuing a token, ultimately used for authorizing access to a service or single-user server.
A single OAuth flow typically goes like this:
### OAuth request and redirect
1. A **browser** makes an HTTP request to an OAuth **client**.
2. There are no credentials, so the client _redirects_ the browser to an "authorize" page on the OAuth **provider** with some extra information:
- the OAuth **client ID** of the client itself.
- the **redirect URI** to be redirected back to after completion.
- the **scopes** requested, which the user should be presented with to confirm.
This is the "X would like to be able to Y on your behalf. Allow this?" page you see on all the "Login with ..." pages around the Internet.
3. During this authorize step,
the browser must be _authenticated_ with the provider.
This is often already stored in a cookie,
but if not the provider webapp must begin its _own_ authentication process before serving the authorization page.
This _may_ even begin another OAuth flow!
4. After the user tells the provider that they want to proceed with the authorization,
the provider records this authorization in a short-lived record called an **OAuth code**.
5. Finally, the oauth provider redirects the browser _back_ to the oauth client's "redirect URI"
(or "OAuth callback URI"),
with the OAuth code in a URL parameter.
That marks the end of the requests made between the **browser** and the **provider**.
### State after redirect
At this point:
- The browser is authenticated with the _provider_.
- The user's authorized permissions are recorded in an _OAuth code_.
- The _provider_ knows that the permissions requested by the OAuth client have been granted, but the client doesn't know this yet.
- All the requests so far have been made directly by the browser.
No requests have originated from the client or provider.
### OAuth Client Handles Callback Request
At this stage, we get to finish the OAuth process.
Let's dig into what the OAuth client does when it handles
the OAuth callback request.
- The OAuth client receives the _code_ and makes an API request to the _provider_ to exchange the code for a real _token_.
This is the first direct request between the OAuth _client_ and the _provider_.
- Once the token is retrieved, the client _usually_
makes a second API request to the _provider_
to retrieve information about the owner of the token (the user).
This is the step where behavior diverges for different OAuth providers.
Up to this point, all OAuth providers are the same, following the OAuth specification.
However, OAuth does not define a standard for issuing tokens in exchange for information about their owner or permissions ([OpenID Connect](https://openid.net/connect/) does that),
so this step may be different for each OAuth provider.
- Finally, the OAuth client stores its own record that the user is authorized in a cookie.
This could be the token itself, or any other appropriate representation of successful authentication.
- Now that credentials have been established,
the browser can be redirected to the _original_ URL where it started,
to try the request again.
If the client wasn't able to keep track of the original URL all this time
(not always easy!),
you might end up back at a default landing page instead of where you started the login process. This is frustrating!
😮‍💨 _phew_.
So that's _one_ OAuth process.
## Full sequence of OAuth in JupyterHub
Let's go through the above OAuth process in JupyterHub,
with specific examples of each HTTP request and what information it contains.
For bonus points, we are using the double-OAuth example of JupyterHub configured with GitHubOAuthenticator.
To disambiguate, we will call the OAuth process where JupyterHub is the **provider** "internal OAuth,"
and the one with JupyterHub as a **client** "external OAuth."
Our starting point:
- a user's single-user server is running. Let's call them `danez`
- Jupyterhub is running with GitHub as an OAuth provider (this means two full instances of OAuth),
- Danez has a fresh browser session with no cookies yet.
First request:
- browser->single-user server running JupyterLab or Jupyter Classic
- `GET /user/danez/notebooks/mynotebook.ipynb`
- no credentials, so single-user server (as an OAuth **client**) starts internal OAuth process with JupyterHub (the **provider**)
- response: 302 redirect -> `/hub/api/oauth2/authorize`
with:
- client-id=`jupyterhub-user-danez`
- redirect-uri=`/user/danez/oauth_callback` (we'll come back later!)
Second request, following redirect:
- browser->JupyterHub
- `GET /hub/api/oauth2/authorize`
- no credentials, so JupyterHub starts external OAuth process _with GitHub_
- response: 302 redirect -> `https://github.com/login/oauth/authorize`
with:
- client-id=`jupyterhub-client-uuid`
- redirect-uri=`/hub/oauth_callback` (we'll come back later!)
_pause_ This is where JupyterHub configuration comes into play.
Recall, in this case JupyterHub is using:
```python
c.JupyterHub.authenticator_class = 'github'
```
That means authenticating a request to the Hub itself starts
a _second_, external OAuth process with GitHub as a provider.
This external OAuth process is optional, though.
If you were using the default username+password PAMAuthenticator,
this redirect would have been to `/hub/login` instead, to present the user
with a login form.
Third request, following redirect:
- browser->GitHub
- `GET https://github.com/login/oauth/authorize`
Here, GitHub prompts for login and asks for confirmation of authorization
(more redirects if you aren't logged in to GitHub yet, but ultimately back to this `/authorize` URL).
After successful authorization
(either by looking up a pre-existing authorization,
or recording it via form submission)
GitHub issues an **OAuth code** and redirects to `/hub/oauth_callback?code=github-code`
Next request:
- browser->JupyterHub
- `GET /hub/oauth_callback?code=github-code`
Inside the callback handler, JupyterHub makes two API requests:
The first:
- JupyterHub->GitHub
- `POST https://github.com/login/oauth/access_token`
- request made with OAuth **code** from URL parameter
- response includes an access **token**
The second:
- JupyterHub->GitHub
- `GET https://api.github.com/user`
- request made with access **token** in the `Authorization` header
- response is the user model, including username, email, etc.
Now the external OAuth callback request completes with:
- set cookie on `/hub/` path, recording jupyterhub authentication so we don't need to do external OAuth with GitHub again for a while
- redirect -> `/hub/api/oauth2/authorize`
🎉 At this point, we have completed our first OAuth flow! 🎉
Now, we get our first repeated request:
- browser->jupyterhub
- `GET /hub/api/oauth2/authorize`
- this time with credentials,
so jupyterhub either
1. serves the internal authorization confirmation page, or
2. automatically accepts authorization (shortcut taken when a user is visiting their own server)
- redirect -> `/user/danez/oauth_callback?code=jupyterhub-code`
Here, we start the same OAuth callback process as before, but at Danez's single-user server for the _internal_ OAuth.
- browser->single-user server
- `GET /user/danez/oauth_callback`
(in handler)
Inside the internal OAuth callback handler,
Danez's server makes two API requests to JupyterHub:
The first:
- single-user server->JupyterHub
- `POST /hub/api/oauth2/token`
- request made with oauth code from url parameter
- response includes an API token
The second:
- single-user server->JupyterHub
- `GET /hub/api/user`
- request made with token in the `Authorization` header
- response is the user model, including username, groups, etc.
Finally completing `GET /user/danez/oauth_callback`:
- response sets cookie, storing encrypted access token
- _finally_ redirects back to the original `/user/danez/notebooks/mynotebook.ipynb`
Final request:
- browser -> single-user server
- `GET /user/danez/notebooks/mynotebook.ipynb`
- encrypted jupyterhub token in cookie
To authenticate this request, the single token stored in the encrypted cookie is passed to the Hub for verification:
- single-user server -> Hub
- `GET /hub/api/user`
- browser's token in Authorization header
- response: user model with name, groups, etc.
If the user model matches who should be allowed (e.g. Danez),
then the request is allowed.
See {doc}`../rbac/scopes` for how JupyterHub uses scopes to determine authorized access to servers and services.
_the end_
## Token caches and expiry
Because tokens represent information from an external source,
they can become 'stale,'
or the information they represent may no longer be accurate.
For example: a user's GitHub account may no longer be authorized to use JupyterHub,
that should ultimately propagate to revoking access and force logging in again.
To handle this, OAuth tokens and the various places they are stored can _expire_,
which should have the same effect as no credentials,
and trigger the authorization process again.
In JupyterHub's internal OAuth, we have these layers of information that can go stale:
- The OAuth client has a **cache** of Hub responses for tokens,
so it doesn't need to make API requests to the Hub for every request it receives.
This cache has an expiry of five minutes by default,
and is governed by the configuration `HubAuth.cache_max_age` in the single-user server.
- The internal OAuth token is stored in a cookie, which has its own expiry (default: 14 days),
governed by `JupyterHub.cookie_max_age_days`.
- The internal OAuth token itself can also expire,
which is by default the same as the cookie expiry,
since it makes sense for the token itself and the place it is stored to expire at the same time.
This is governed by `JupyterHub.cookie_max_age_days` first,
or can overridden by `JupyterHub.oauth_token_expires_in`.
That's all for _internal_ auth storage,
but the information from the _external_ authentication provider
(could be PAM or GitHub OAuth, etc.) can also expire.
Authenticator configuration governs when JupyterHub needs to ask again,
triggering the external login process anew before letting a user proceed.
- `jupyterhub-hub-login` cookie stores that a browser is authenticated with the Hub.
This expires according to `JupyterHub.cookie_max_age_days` configuration,
with a default of 14 days.
The `jupyterhub-hub-login` cookie is encrypted with `JupyterHub.cookie_secret`
configuration.
- {meth}`.Authenticator.refresh_user` is a method to refresh a user's auth info.
By default, it does nothing, but it can return an updated user model if a user's information has changed,
or force a full login process again if needed.
- {attr}`.Authenticator.auth_refresh_age` configuration governs how often
`refresh_user()` will be called to check if a user must login again (default: 300 seconds).
- {attr}`.Authenticator.refresh_pre_spawn` configuration governs whether
`refresh_user()` should be called prior to spawning a server,
to force fresh auth info when a server is launched (default: False).
This can be useful when Authenticators pass access tokens to spawner environments, to ensure they aren't getting a stale token that's about to expire.
**So what happens when these things expire or get stale?**
- If the HubAuth **token response cache** expires,
when a request is made with a token,
the Hub is asked for the latest information about the token.
This usually has no visible effect, since it is just refreshing a cache.
If it turns out that the token itself has expired or been revoked,
the request will be denied.
- If the token has expired, but is still in the cookie:
when the token response cache expires,
the next time the server asks the hub about the token,
no user will be identified and the internal OAuth process begins again.
- If the token _cookie_ expires, the next browser request will be made with no credentials,
and the internal OAuth process will begin again.
This will usually have the form of a transparent redirect browsers won't notice.
However, if this occurs on an API request in a long-lived page visit
such as a JupyterLab session, the API request may fail and require
a page refresh to get renewed credentials.
- If the _JupyterHub_ cookie expires, the next time the browser makes a request to the Hub,
the Hub's authorization process must begin again (e.g. login with GitHub).
Hub cookie expiry on its own **does not** mean that a user can no longer access their single-user server!
- If credentials from the upstream provider (e.g. GitHub) become stale or outdated,
these will not be refreshed until/unless `refresh_user` is called
_and_ `refresh_user()` on the given Authenticator is implemented to perform such a check.
At this point, few Authenticators implement `refresh_user` to support this feature.
If your Authenticator does not or cannot implement `refresh_user`,
the only way to force a check is to reset the `JupyterHub.cookie_secret` encryption key,
which invalidates the `jupyterhub-hub-login` cookie for all users.
### Logging out
Logging out of JupyterHub means clearing and revoking many of these credentials:
- The `jupyterhub-hub-login` cookie is revoked, meaning the next request to the Hub itself will require a new login.
- The token stored in the `jupyterhub-user-username` cookie for the single-user server
will be revoked, based on its associaton with `jupyterhub-session-id`, but the _cookie itself cannot be cleared at this point_
- The shared `jupyterhub-session-id` is cleared, which ensures that the HubAuth **token response cache** will not be used,
and the next request with the expired token will ask the Hub, which will inform the single-user server that the token has expired
## Extra bits
(two-tokens)=
### A tale of two tokens
**TODO**: discuss API token issued to server at startup ($JUPYTERHUB_API_TOKEN)
and OAuth-issued token in the cookie,
and some details of how JupyterLab currently deals with that.
They are different, and JupyterLab should be making requests using the token from the cookie,
not the token from the server,
but that is not currently the case.
### Redirect loops
In general, an authenticated web endpoint has this behavior,
based on the authentication/authorization state of the browser:
- If authorized, allow the request to happen
- If authenticated (I know who you are) but not authorized (you are not allowed), fail with a 403 permission denied error
- If not authenticated, start a redirect process to establish authorization,
which should end in a redirect back to the original URL to try again.
**This is why problems in authentication result in redirect loops!**
If the second request fails to detect the authentication that should have been established during the redirect,
it will start the authentication redirect process over again,
and keep redirecting in a loop until the browser balks.

View File

@@ -7,9 +7,12 @@ Hub manages by default as a subprocess (it can be run externally, as well, and
typically is in production deployments).
The upside to CHP, and why we use it by default, is that it's easy to install
and run (if you have nodejs, you are set!). The downsides are that it's a
single process and does not support any persistence of the routing table. So
if the proxy process dies, your whole JupyterHub instance is inaccessible
and run (if you have nodejs, you are set!). The downsides are that
- it's a single process and
- does not support any persistence of the routing table.
So if the proxy process dies, your whole JupyterHub instance is inaccessible
until the Hub notices, restarts the proxy, and restores the routing table. For
deployments that want to avoid such a single point of failure, or leverage
existing proxy infrastructure in their chosen deployment (such as Kubernetes
@@ -54,7 +57,7 @@ class MyProxy(Proxy):
"""Stop the proxy"""
```
These methods **may** be coroutines.
These methods **may** be coroutines.
`c.Proxy.should_start` is a configurable flag that determines whether the
Hub should call these methods when the Hub itself starts and stops.
@@ -103,7 +106,7 @@ route to be proxied, such as `/user/name/`. A routespec will:
When adding a route, JupyterHub may pass a JSON-serializable dict as a `data`
argument that should be attached to the proxy route. When that route is
retrieved, the `data` argument should be returned as well. If your proxy
retrieved, the `data` argument should be returned as well. If your proxy
implementation doesn't support storing data attached to routes, then your
Python wrapper may have to handle storing the `data` piece itself, e.g in a
simple file or database.
@@ -136,7 +139,7 @@ async def delete_route(self, routespec):
### Retrieving routes
For retrieval, you only *need* to implement a single method that retrieves all
For retrieval, you only _need_ to implement a single method that retrieves all
routes. The return value for this function should be a dictionary, keyed by
`routespec`, of dicts whose keys are the same three arguments passed to
`add_route` (`routespec`, `target`, `data`)
@@ -220,3 +223,11 @@ as previously required.
Additionally, configurable attributes for your proxy will
appear in jupyterhub help output and auto-generated configuration files
via `jupyterhub --generate-config`.
### Index of proxies
A list of the proxies that are currently available for JupyterHub (that we know about).
1. [`jupyterhub/configurable-http-proxy`](https://github.com/jupyterhub/configurable-http-proxy) The default proxy which uses node-http-proxy
2. [`jupyterhub/traefik-proxy`](https://github.com/jupyterhub/traefik-proxy) The proxy which configures traefik proxy server for jupyterhub
3. [`AbdealiJK/configurable-http-proxy`](https://github.com/AbdealiJK/configurable-http-proxy) A pure python implementation of the configurable-http-proxy

View File

@@ -0,0 +1,27 @@
# JupyterHub REST API
Below is an interactive view of JupyterHub's OpenAPI specification.
<!-- client-rendered openapi UI copied from FastAPI -->
<link type="text/css" rel="stylesheet" href="https://cdn.jsdelivr.net/npm/swagger-ui-dist@3/swagger-ui.css">
<script src="https://cdn.jsdelivr.net/npm/swagger-ui-dist@4.1/swagger-ui-bundle.js"></script>
<!-- `SwaggerUIBundle` is now available on the page -->
<!-- render the ui here -->
<div id="openapi-ui"></div>
<script>
const ui = SwaggerUIBundle({
url: '../_static/rest-api.yml',
dom_id: '#openapi-ui',
presets: [
SwaggerUIBundle.presets.apis,
SwaggerUIBundle.SwaggerUIStandalonePreset
],
layout: "BaseLayout",
deepLinking: true,
showExtensions: true,
showCommonExtensions: true,
});
</script>

View File

@@ -1,14 +0,0 @@
:orphan:
===================
JupyterHub REST API
===================
.. this doc exists as a resolvable link target
.. which _static files are not
.. meta::
:http-equiv=refresh: 0;url=../_static/rest-api/index.html
The rest API docs are `here <../_static/rest-api/index.html>`_
if you are not redirected automatically.

View File

@@ -1,34 +1,39 @@
(rest-api)=
# Using JupyterHub's REST API
This section will give you information on:
- what you can do with the API
- create an API token
- add API tokens to the config files
- make an API request programmatically using the requests library
- learn more about JupyterHub's API
- What you can do with the API
- How to create an API token
- Assigning permissions to a token
- Updating to admin services
- Making an API request programmatically using the requests library
- Paginating API requests
- Enabling users to spawn multiple named-servers via the API
- Learn more about JupyterHub's API
Before we discuss about JupyterHub's REST API, you can learn about [REST APIs here](https://en.wikipedia.org/wiki/Representational_state_transfer). A REST
API provides a standard way for users to get and send information to the
Hub.
## What you can do with the API
Using the [JupyterHub REST API][], you can perform actions on the Hub,
such as:
- checking which users are active
- adding or removing users
- stopping or starting single user notebook servers
- authenticating services
A [REST](https://en.wikipedia.org/wiki/Representational_state_transfer)
API provides a standard way for users to get and send information to the
Hub.
- Checking which users are active
- Adding or removing users
- Stopping or starting single user notebook servers
- Authenticating services
- Communicating with an individual Jupyter server's REST API
## Create an API token
To send requests using JupyterHub API, you must pass an API token with
To send requests using the JupyterHub API, you must pass an API token with
the request.
As of [version 0.6.0](../changelog.md), the preferred way of
generating an API token is:
The preferred way of generating an API token is by running:
```bash
openssl rand -hex 32
@@ -38,8 +43,12 @@ This `openssl` command generates a potential token that can then be
added to JupyterHub using `.api_tokens` configuration setting in
`jupyterhub_config.py`.
Alternatively, use the `jupyterhub token` command to generate a token
for a specific hub user by passing the 'username':
```{note}
The api_tokens configuration has been softly deprecated since the introduction of services.
```
Alternatively, you can use the `jupyterhub token` command to generate a token
for a specific hub user by passing the **username**:
```bash
jupyterhub token <username>
@@ -48,25 +57,94 @@ jupyterhub token <username>
This command generates a random string to use as a token and registers
it for the given user with the Hub's database.
In [version 0.8.0](../changelog.md), a TOKEN request page for
In [version 0.8.0](../changelog.md), a token request page for
generating an API token is available from the JupyterHub user interface:
![Request API TOKEN page](../images/token-request.png)
:::{figure-md}
![API TOKEN success page](../images/token-request-success.png)
![token request page](../images/token-request.png)
## Add API tokens to the config file
JupyterHub's API token page
:::
You may also add a dictionary of API tokens and usernames to the hub's
configuration file, `jupyterhub_config.py` (note that
the **key** is the 'secret-token' while the **value** is the 'username'):
:::{figure-md}
![token-request-success](../images/token-request-success.png)
JupyterHub's token page after successfully requesting a token.
:::
## Assigning permissions to a token
Prior to JupyterHub 2.0, there were two levels of permissions:
1. user, and
2. admin
where a token would always have full permissions to do whatever its owner could do.
In JupyterHub 2.0,
specific permissions are now defined as '**scopes**',
and can be assigned both at the user/service level,
and at the individual token level.
This allows e.g. a user with full admin permissions to request a token with limited permissions.
## Updating to admin services
```{note}
The `api_tokens` configuration has been softly deprecated since the introduction of services.
We have no plans to remove it,
but deployments are encouraged to use service configuration instead.
```
If you have been using `api_tokens` to create an admin user
and the token for that user to perform some automations, then
the services' mechanism may be a better fit if you have the following configuration:
```python
c.JupyterHub.admin_users = {"service-admin"}
c.JupyterHub.api_tokens = {
'secret-token': 'username',
"secret-token": "service-admin",
}
```
This can be updated to create a service, with the following configuration:
```python
c.JupyterHub.services = [
{
# give the token a name
"name": "service-admin",
"api_token": "secret-token",
# "admin": True, # if using JupyterHub 1.x
},
]
# roles were introduced in JupyterHub 2.0
# prior to 2.0, only "admin": True or False was available
c.JupyterHub.load_roles = [
{
"name": "service-role",
"scopes": [
# specify the permissions the token should have
"admin:users",
],
"services": [
# assign the service the above permissions
"service-admin",
],
}
]
```
The token will have the permissions listed in the role
(see [scopes][] for a list of available permissions),
but there will no longer be a user account created to house it.
The main noticeable difference between a user and a service is that there will be no notebook server associated with the account
and the service will not show up in the various user list pages and APIs.
## Make an API request
To authenticate your requests, pass the API token in the request's
@@ -74,10 +152,9 @@ Authorization header.
### Use requests
Using the popular Python [requests](http://docs.python-requests.org/en/master/)
library, here's example code to make an API request for the users of a JupyterHub
deployment. An API GET request is made, and the request sends an API token for
authorization. The response contains information about the users:
Using the popular Python [requests](https://docs.python-requests.org)
library, an API GET request is made, and the request sends an API token for
authorization. The response contains information about the users, here's example code to make an API request for the users of a JupyterHub deployment
```python
import requests
@@ -86,9 +163,9 @@ api_url = 'http://127.0.0.1:8081/hub/api'
r = requests.get(api_url + '/users',
headers={
'Authorization': 'token %s' % token,
}
)
'Authorization': f'token {token}',
}
)
r.raise_for_status()
users = r.json()
@@ -106,23 +183,100 @@ data = {'name': 'mygroup', 'users': ['user1', 'user2']}
r = requests.post(api_url + '/groups/formgrade-data301/users',
headers={
'Authorization': 'token %s' % token,
},
json=data
'Authorization': f'token {token}',
},
json=data,
)
r.raise_for_status()
r.json()
```
The same API token can also authorize access to the [Jupyter Notebook REST API][]
provided by notebook servers managed by JupyterHub if one of the following is true:
1. The token is for the same user as the owner of the notebook
2. The token is tied to an admin user or service **and** `c.JupyterHub.admin_access` is set to `True`
provided by notebook servers managed by JupyterHub if it has the necessary `access:servers` scope.
(api-pagination)=
## Paginating API requests
```{versionadded} 2.0
```
Pagination is available through the `offset` and `limit` query parameters on
list endpoints, which can be used to return ideally sized windows of results.
Here's example code demonstrating pagination on the `GET /users`
endpoint to fetch the first 20 records.
```python
import os
import requests
api_url = 'http://127.0.0.1:8081/hub/api'
r = requests.get(
api_url + '/users?offset=0&limit=20',
headers={
"Accept": "application/jupyterhub-pagination+json",
"Authorization": f"token {token}",
},
)
r.raise_for_status()
r.json()
```
For backward-compatibility, the default structure of list responses is unchanged.
However, this lacks pagination information (e.g. is there a next page),
so if you have enough users that they won't fit in the first response,
it is a good idea to opt-in to the new paginated list format.
There is a new schema for list responses which include pagination information.
You can request this by including the header:
```
Accept: application/jupyterhub-pagination+json
```
with your request, in which case a response will look like:
```python
{
"items": [
{
"name": "username",
"kind": "user",
...
},
],
"_pagination": {
"offset": 0,
"limit": 20,
"total": 50,
"next": {
"offset": 20,
"limit": 20,
"url": "http://127.0.0.1:8081/hub/api/users?limit=20&offset=20"
}
}
}
```
where the list results (same as pre-2.0) will be in `items`,
and pagination info will be in `_pagination`.
The `next` field will include the `offset`, `limit`, and `url` for requesting the next page.
`next` will be `null` if there is no next page.
Pagination is governed by two configuration options:
- `JupyterHub.api_page_default_limit` - the page size, if `limit` is unspecified in the request
and the new pagination API is requested
(default: 50)
- `JupyterHub.api_page_max_limit` - the maximum page size a request can ask for (default: 200)
Pagination is enabled on the `GET /users`, `GET /groups`, and `GET /proxy` REST endpoints.
## Enabling users to spawn multiple named-servers via the API
With JupyterHub version 0.8, support for multiple servers per user has landed.
Support for multiple servers per user was introduced in JupyterHub [version 0.8.](../changelog.md)
Prior to that, each user could only launch a single default server via the API
like this:
@@ -131,14 +285,14 @@ curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/us
```
With the named-server functionality, it's now possible to launch more than one
specifically named servers against a given user. This could be used, for instance,
specifically named servers against a given user. This could be used, for instance,
to launch each server based on a different image.
First you must enable named-servers by including the following setting in the `jupyterhub_config.py` file.
`c.JupyterHub.allow_named_servers = True`
If using the [zero-to-jupyterhub-k8s](https://github.com/jupyterhub/zero-to-jupyterhub-k8s) set-up to run JupyterHub,
If you are using the [zero-to-jupyterhub-k8s](https://github.com/jupyterhub/zero-to-jupyterhub-k8s) set-up to run JupyterHub,
then instead of editing the `jupyterhub_config.py` file directly, you could pass
the following as part of the `config.yaml` file, as per the [tutorial](https://zero-to-jupyterhub.readthedocs.io/en/latest/):
@@ -149,6 +303,7 @@ hub:
```
With that setting in place, a new named-server is activated like this:
```bash
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/servers/<serverA>"
curl -X POST -H "Authorization: token <token>" "http://127.0.0.1:8081/hub/api/users/<user>/servers/<serverB>"
@@ -163,15 +318,11 @@ will need to be able to handle the case of multiple servers per user and ensure
uniqueness of names, particularly if servers are spawned via docker containers
or kubernetes pods.
## Learn more about the API
You can see the full [JupyterHub REST API][] for details. This REST API Spec can
be viewed in a more [interactive style on swagger's petstore][].
Both resources contain the same information and differ only in its display.
Note: The Swagger specification is being renamed the [OpenAPI Initiative][].
You can see the full [JupyterHub REST API][] for more details.
[interactive style on swagger's petstore]: http://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/master/docs/rest-api.yml#!/default
[OpenAPI Initiative]: https://www.openapis.org/
[JupyterHub REST API]: ./rest-api
[Jupyter Notebook REST API]: http://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyter/notebook/master/notebook/services/api/api.yaml
[openapi initiative]: https://www.openapis.org/
[jupyterhub rest api]: ./rest-api
[scopes]: ../rbac/scopes.md
[jupyter notebook rest api]: https://petstore3.swagger.io/?url=https://raw.githubusercontent.com/jupyter/notebook/HEAD/notebook/services/api/api.yaml

View File

@@ -1,27 +1,24 @@
# Running proxy separately from the hub
## Background
The thing which users directly connect to is the proxy, by default
`configurable-http-proxy`. The proxy either redirects users to the
The thing which users directly connect to is the proxy, which by default is
`configurable-http-proxy`. The proxy either redirects users to the
hub (for login and managing servers), or to their own single-user
servers. Thus, as long as the proxy stays running, access to existing
servers. Thus, as long as the proxy stays running, access to existing
servers continues, even if the hub itself restarts or goes down.
When you first configure the hub, you may not even realize this
because the proxy is automatically managed by the hub. This is great
for getting started and even most use, but everytime you restart the
hub, all user connections also get restarted. But it's also simple to
because the proxy is automatically managed by the hub. This is great
for getting started and even most use-cases, although, everytime you restart the
hub, all user connections are also restarted. However, it is also simple to
run the proxy as a service separate from the hub, so that you are free
to reconfigure the hub while only interrupting users who are waiting for their notebook server to start.
starting their notebook server.
The default JupyterHub proxy is
[configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy),
and that page has some docs. If you are using a different proxy, such
as Traefik, these instructions are probably not relevant to you.
[configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy). If you are using a different proxy, such
as [Traefik](https://github.com/traefik/traefik), these instructions are probably not relevant to you.
## Configuration options
@@ -37,24 +34,25 @@ it yourself).
token for authenticating communication with the proxy.
`c.ConfigurableHTTPProxy.api_url = 'http://localhost:8001'` should be
set to the URL which the hub uses to connect *to the proxy's API*.
set to the URL which the hub uses to connect _to the proxy's API_.
## Proxy configuration
You need to configure a service to start the proxy. An example
command line for this is `configurable-http-proxy --ip=127.0.0.1
--port=8000 --api-ip=127.0.0.1 --api-port=8001
--default-target=http://localhost:8081
--error-target=http://localhost:8081/hub/error`. (Details for how to
do this is out of scope for this tutorial - for example it might be a
systemd service on within another docker cotainer). The proxy has no
You need to configure a service to start the proxy. An example
command line argument for this is:
```bash
$ configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error
```
(Details on how to do this is out of the scope of this tutorial. For example, it might be a
systemd service configured within another docker container). The proxy has no
configuration files, all configuration is via the command line and
environment variables.
`--api-ip` and `--api-port` (which tells the proxy where to listen) should match the hub's `ConfigurableHTTPProxy.api_url`.
`--ip`, `-port`, and other options configure the *user* connections to the proxy.
`--ip`, `-port`, and other options configure the _user_ connections to the proxy.
`--default-target` and `--error-target` should point to the hub, and used when users navigate to the proxy originally.
@@ -63,10 +61,9 @@ match the token given to `c.ConfigurableHTTPProxy.auth_token`.
You should check the [configurable-http-proxy
options](https://github.com/jupyterhub/configurable-http-proxy) to see
what other options are needed, for example SSL options. Note that
these are configured in the hub if the hub is starting the proxy - you
need to move the options to here.
what other options are needed, for example, SSL options. Note that
these options are configured in the hub if the hub is starting the proxy, so you
need to configure the options there.
## Docker image
@@ -74,7 +71,6 @@ You can use [jupyterhub configurable-http-proxy docker
image](https://hub.docker.com/r/jupyterhub/configurable-http-proxy/)
to run the proxy.
## See also
* [jupyterhub configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy)
- [jupyterhub configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy)

View File

@@ -0,0 +1,332 @@
# Starting servers with the JupyterHub API
Sometimes, when working with applications such as [BinderHub](https://binderhub.readthedocs.io), it may be necessary to launch Jupyter-based services on behalf of your users.
Doing so can be achieved through JupyterHub's [REST API](../reference/rest.md), which allows one to launch and manage servers on behalf of users through API calls instead of the JupyterHub UI.
This way, you can take advantage of other user/launch/lifecycle patterns that are not natively supported by the JupyterHub UI, all without the need to develop the server management features of JupyterHub Spawners and/or Authenticators.
This tutorial goes through working with the JupyterHub API to manage servers for users.
In particular, it covers how to:
1. [Check the status of servers](checking)
2. [Start servers](starting)
3. [Wait for servers to be ready](waiting)
4. [Communicate with servers](communicating)
5. [Stop servers](stopping)
At the end, we also provide sample Python code that can be used to implement these steps.
(checking)=
## Checking server status
First, request information about a particular user using a GET request:
```
GET /hub/api/users/:username
```
The response you get will include a `servers` field, which is a dictionary, as shown in this JSON-formatted response:
**Required scope: `read:servers`**
```json
{
"admin": false,
"groups": [],
"pending": null,
"server": null,
"name": "test-1",
"kind": "user",
"last_activity": "2021-08-03T18:12:46.026411Z",
"created": "2021-08-03T18:09:59.767600Z",
"roles": ["user"],
"servers": {}
}
```
Many JupyterHub deployments only use a 'default' server, represented as an empty string `''` for a name. An investigation of the `servers` field can yield one of two results. First, it can be empty as in the sample JSON response above. In such a case, the user has no running servers.
However, should the user have running servers, then the returned dict should contain various information, as shown in this response:
```json
"servers": {
"": {
"name": "",
"last_activity": "2021-08-03T18:48:35.934000Z",
"started": "2021-08-03T18:48:29.093885Z",
"pending": null,
"ready": true,
"url": "/user/test-1/",
"user_options": {},
"progress_url": "/hub/api/users/test-1/server/progress"
}
}
```
Key properties of a server:
name
: the server's name. Always the same as the key in `servers`.
ready
: boolean. If true, the server can be expected to respond to requests at `url`.
pending
: `null` or a string indicating a transitional state (such as `start` or `stop`).
Will always be `null` if `ready` is true or a string if false.
url
: The server's url path (e.g. `/users/:name/:servername/`) where the server can be accessed if `ready` is true.
progress_url
: The API URL path (starting with `/hub/api`) where the progress API can be used to wait for the server to be ready.
last_activity
: ISO8601 timestamp indicating when activity was last observed on the server.
started
: ISO801 timestamp indicating when the server was last started.
The two responses above are from a user with no servers and another with one `ready` server. The sample below is a response likely to be received when one requests a server launch while the server is not yet ready:
```json
"servers": {
"": {
"name": "",
"last_activity": "2021-08-03T18:48:29.093885Z",
"started": "2021-08-03T18:48:29.093885Z",
"pending": "spawn",
"ready": false,
"url": "/user/test-1/",
"user_options": {},
"progress_url": "/hub/api/users/test-1/server/progress"
}
}
```
Note that `ready` is `false` and `pending` has the value `spawn`, meaning that the server is not ready and attempting to access it may not work as it is still in the process of spawning. We'll get more into this below in [waiting for a server][].
[waiting for a server]: waiting
(starting)=
## Starting servers
To start a server, make this API request:
```
POST /hub/api/users/:username/servers/[:servername]
```
**Required scope: `servers`**
Assuming the request was valid, there are two possible responses:
201 Created
: This status code means the launch completed and the server is ready and is available at the server's URL immediately.
202 Accepted
: This is the more likely response, and means that the server has begun launching,
but is not immediately ready. As a result, the server shows `pending: 'spawn'` at this point and you should wait for it to start.
(waiting)=
## Waiting for a server to start
After receiving a `202 Accepted` response, you have to wait for the server to start.
Two approaches can be applied to establish when the server is ready:
1. {ref}`Polling the server model <polling>`
2. {ref}`Using the progress API <progress>`
(polling)=
### Polling the server model
The simplest way to check if a server is ready is to programmatically query the server model until two conditions are true:
1. The server name is contained in the `servers` response, and
2. `servers['servername']['ready']` is true.
The Python code snippet below can be used to check if a server is ready:
```python
def server_ready(hub_url, user, server_name="", token):
r = requests.get(
f"{hub_url}/hub/api/users/{user}/servers/{server_name}",
headers={"Authorization": f"token {token}"},
)
r.raise_for_status()
user_model = r.json()
servers = user_model.get("servers", {})
if server_name not in servers:
return False
server = servers[server_name]
if server['ready']:
print(f"Server {user}/{server_name} ready at {server['url']}")
return True
else:
print(f"Server {user}/{server_name} not ready, pending {server['pending']}")
return False
```
You can keep making this check until `ready` is true.
(progress)=
### Using the progress API
The most _efficient_ way to wait for a server to start is by using the progress API.
The progress URL is available in the server model under `progress_url` and has the form `/hub/api/users/:user/servers/:servername/progress`.
The default server progress can be accessed at `:user/servers//progress` or `:user/server/progress` as demonstrated in the following GET request:
```
GET /hub/api/users/:user/servers/:servername/progress
```
**Required scope: `read:servers`**
The progress API is an example of an [EventStream][] API.
Messages are _streamed_ and delivered in the form:
```
data: {"progress": 10, "message": "...", ...}
```
where the line after `data:` contains a JSON-serialized dictionary.
Lines that do not start with `data:` should be ignored.
[eventstream]: https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#examples
Progress events have the form:
```python
{
"progress": 0-100,
"message": "",
"ready": True, # or False
}
```
progress
: integer, 0-100
message
: string message describing progress stages
ready
: present and true only for the last event when the server is ready
url
: only present if `ready` is true; will be the server's URL
The progress API can be used even with fully ready servers.
If the server is ready, there will only be one event, which will look like:
```json
{
"progress": 100,
"ready": true,
"message": "Server ready at /user/test-1/",
"html_message": "Server ready at <a href=\"/user/test-1/\">/user/test-1/</a>",
"url": "/user/test-1/"
}
```
where `ready` and `url` are the same as in the server model, and `ready` will always be true.
A significant advantage of the progress API is that it shows the status of the server through a stream of messages.
Below is an example of a typical complete stream from the API:
```
data: {"progress": 0, "message": "Server requested"}
data: {"progress": 50, "message": "Spawning server..."}
data: {"progress": 100, "ready": true, "message": "Server ready at /user/test-user/", "html_message": "Server ready at <a href=\"/user/test-user/\">/user/test-user/</a>", "url": "/user/test-user/"}
```
Here is a Python example for consuming an event stream:
```{literalinclude} ../../../examples/server-api/start-stop-server.py
:language: python
:pyobject: event_stream
```
(stopping)=
## Stopping servers
Servers can be stopped with a DELETE request:
```
DELETE /hub/api/users/:user/servers/[:servername]
```
**Required scope: `servers`**
Similar to when starting a server, issuing the DELETE request above might not stop the server immediately.
Instead, the DELETE request has two possible response codes:
204 Deleted
: This status code means the delete completed and the server is fully stopped.
It will now be absent from the user `servers` model.
202 Accepted
: This code means your request was accepted but is not yet completely processed.
The server has `pending: 'stop'` at this point.
There is no progress API for checking when a server actually stops.
The only way to wait for a server to stop is to poll it and wait for the server to disappear from the user `servers` model.
This Python code snippet can be used to stop a server and the wait for the process to complete:
```{literalinclude} ../../../examples/server-api/start-stop-server.py
:language: python
:pyobject: stop_server
```
(communicating)=
## Communicating with servers
JupyterHub tokens with the `access:servers` scope can be used to communicate with servers themselves.
The tokens can be the same as those you used to launch your service.
```{note}
Access scopes are new in JupyterHub 2.0.
To access servers in JupyterHub 1.x,
a token must be owned by the same user as the server,
*or* be an admin token if admin_access is enabled.
```
The URL returned from a server model is the URL path suffix,
e.g. `/user/:name/` to append to the jupyterhub base URL.
The returned URL is of the form `{hub_url}{server_url}`,
where `hub_url` would be `http://127.0.0.1:8000` by default and `server_url` is `/user/myname`.
When combined, the two give a full URL of `http://127.0.0.1:8000/user/myname`.
## Python example
The JupyterHub repo includes a complete example in {file}`examples/server-api`
that ties all theses steps together.
In summary, the processes involved in managing servers on behalf of users are:
1. Get user information from `/user/:name`.
2. The server model includes a `ready` state to tell you if it's ready.
3. If it's not ready, you can follow up with `progress_url` to wait for it.
4. If it is ready, you can use the `url` field to link directly to the running server.
The example below demonstrates starting and stopping servers via the JupyterHub API,
including waiting for them to start via the progress API and waiting for them to stop by polling the user model.
```{literalinclude} ../../../examples/server-api/start-stop-server.py
:language: python
:start-at: def event_stream
:end-before: def main
```

View File

@@ -1,17 +1,5 @@
# Services
With version 0.7, JupyterHub adds support for **Services**.
This section provides the following information about Services:
- [Definition of a Service](#definition-of-a-service)
- [Properties of a Service](#properties-of-a-service)
- [Hub-Managed Services](#hub-managed-services)
- [Launching a Hub-Managed Service](#launching-a-hub-managed-service)
- [Externally-Managed Services](#externally-managed-services)
- [Writing your own Services](#writing-your-own-services)
- [Hub Authentication and Services](#hub-authentication-and-services)
## Definition of a Service
When working with JupyterHub, a **Service** is defined as a process that interacts
@@ -45,17 +33,25 @@ A Service may have the following properties:
- `url: str (default - None)` - The URL where the service is/should be. If a
url is specified for where the Service runs its own web server,
the service will be added to the proxy at `/services/:name`
- `api_token: str (default - None)` - For Externally-Managed Services you need to specify
- `api_token: str (default - None)` - For Externally-Managed Services you need to specify
an API token to perform API requests to the Hub
- `display: bool (default - True)` - When set to true, display a link to the
service's URL under the 'Services' dropdown in user's hub home page.
- `oauth_no_confirm: bool (default - False)` - When set to true,
skip the OAuth confirmation page when users access this service.
By default, when users authenticate with a service using JupyterHub,
they are prompted to confirm that they want to grant that service
access to their credentials.
Skipping the confirmation page is useful for admin-managed services that are considered part of the Hub
and shouldn't need extra prompts for login.
If a service is also to be managed by the Hub, it has a few extra options:
- `command: (str/Popen list`) - Command for JupyterHub to spawn the service.
- Only use this if the service should be a subprocess.
- If command is not specified, the Service is assumed to be managed
externally.
- If a command is specified for launching the Service, the Service will
be started and managed by the Hub.
- `command: (str/Popen list)` - Command for JupyterHub to spawn the service. - Only use this if the service should be a subprocess. - If command is not specified, the Service is assumed to be managed
externally. - If a command is specified for launching the Service, the Service will
be started and managed by the Hub.
- `environment: dict` - additional environment variables for the Service.
- `user: str` - the name of a system user to manage the Service. If
unspecified, run as the same user as the Hub.
@@ -89,11 +85,21 @@ Hub-Managed Service would include:
This example would be configured as follows in `jupyterhub_config.py`:
```python
c.JupyterHub.load_roles = [
{
"name": "idle-culler",
"scopes": [
"read:users:activity", # read user last_activity
"servers", # start and stop servers
# 'admin:users' # needed if culling idle users as well
]
}
]
c.JupyterHub.services = [
{
'name': 'cull-idle',
'admin': True,
'command': [sys.executable, '/path/to/cull-idle.py', '--timeout']
'name': 'idle-culler',
'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600']
}
]
```
@@ -103,12 +109,14 @@ parameters, which describe the environment needed to start the Service process:
- `environment: dict` - additional environment variables for the Service.
- `user: str` - name of the user to run the server if different from the Hub.
Requires Hub to be root.
Requires Hub to be root.
- `cwd: path` directory in which to run the Service, if different from the
Hub directory.
Hub directory.
The Hub will pass the following environment variables to launch the Service:
(service-env)=
```bash
JUPYTERHUB_SERVICE_NAME: The name of the service
JUPYTERHUB_API_TOKEN: API token assigned to the service
@@ -117,21 +125,24 @@ JUPYTERHUB_BASE_URL: Base URL of the Hub (https://mydomain[:port]/)
JUPYTERHUB_SERVICE_PREFIX: URL path prefix of this service (/services/:service-name/)
JUPYTERHUB_SERVICE_URL: Local URL where the service is expected to be listening.
Only for proxied web services.
JUPYTERHUB_OAUTH_SCOPES: JSON-serialized list of scopes to use for allowing access to the service
(deprecated in 3.0, use JUPYTERHUB_OAUTH_ACCESS_SCOPES).
JUPYTERHUB_OAUTH_ACCESS_SCOPES: JSON-serialized list of scopes to use for allowing access to the service (new in 3.0).
JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES: JSON-serialized list of scopes that can be requested by the oauth client on behalf of users (new in 3.0).
```
For the previous 'cull idle' Service example, these environment variables
would be passed to the Service when the Hub starts the 'cull idle' Service:
```bash
JUPYTERHUB_SERVICE_NAME: 'cull-idle'
JUPYTERHUB_SERVICE_NAME: 'idle-culler'
JUPYTERHUB_API_TOKEN: API token assigned to the service
JUPYTERHUB_API_URL: http://127.0.0.1:8080/hub/api
JUPYTERHUB_BASE_URL: https://mydomain[:port]
JUPYTERHUB_SERVICE_PREFIX: /services/cull-idle/
JUPYTERHUB_SERVICE_PREFIX: /services/idle-culler/
```
See the JupyterHub GitHub repo for additional information about the
[`cull-idle` example](https://github.com/jupyterhub/jupyterhub/tree/master/examples/cull-idle).
See the GitHub repo for additional information about the [jupyterhub_idle_culler][].
## Externally-Managed Services
@@ -151,6 +162,8 @@ c.JupyterHub.services = [
{
'name': 'my-web-service',
'url': 'https://10.0.1.1:1984',
# any secret >8 characters, you'll use api_token to
# authenticate api requests to the hub from your service
'api_token': 'super-secret',
}
]
@@ -173,7 +186,7 @@ information to the Service via the environment variables described above. A
flexible Service, whether managed by the Hub or not, can make use of these
same environment variables.
When you run a service that has a url, it will be accessible under a
When you run a service that has a URL, it will be accessible under a
`/services/` prefix, such as `https://myhub.horse/services/my-service/`. For
your service to route proxied requests properly, it must take
`JUPYTERHUB_SERVICE_PREFIX` into account when routing requests. For example, a
@@ -188,18 +201,38 @@ extra slash you might get unexpected behavior. For example if your service has a
## Hub Authentication and Services
JupyterHub 0.7 introduces some utilities for using the Hub's authentication
mechanism to govern access to your service. When a user logs into JupyterHub,
the Hub sets a **cookie (`jupyterhub-services`)**. The service can use this
cookie to authenticate requests.
JupyterHub provides some utilities for using the Hub's authentication
mechanism to govern access to your service.
JupyterHub ships with a reference implementation of Hub authentication that
Requests to all JupyterHub services are made with OAuth tokens.
These can either be requests with a token in the `Authorization` header,
or url parameter `?token=...`,
or browser requests which must complete the OAuth authorization code flow,
which results in a token that should be persisted for future requests
(persistence is up to the service,
but an encrypted cookie confined to the service path is appropriate,
and provided by default).
:::{versionchanged} 2.0
The shared `jupyterhub-services` cookie is removed.
OAuth must be used to authenticate browser requests with services.
:::
JupyterHub includes a reference implementation of Hub authentication that
can be used by services. You may go beyond this reference implementation and
create custom hub-authenticating clients and services. We describe the process
below.
The reference, or base, implementation is the [`HubAuth`][HubAuth] class,
which implements the requests to the Hub.
The reference, or base, implementation is the {class}`.HubAuth` class,
which implements the API requests to the Hub that resolve a token to a User model.
There are two levels of authentication with the Hub:
- {class}`.HubAuth` - the most basic authentication,
for services that should only accept API requests authorized with a token.
- {class}`.HubOAuth` - For services that should use oauth to authenticate with the Hub.
This should be used for any service that serves pages that should be visited with a browser.
To use HubAuth, you must set the `.api_token` instance variable. This can be
done either programmatically when constructing the class, or via the
@@ -214,11 +247,9 @@ and [service-whoiami](https://github.com/jupyterhub/jupyterhub/tree/master/examp
(TODO: Where is this API TOKen set?)
Most of the logic for authentication implementation is found in the
[`HubAuth.user_for_cookie`][HubAuth.user_for_cookie]
and in the
[`HubAuth.user_for_token`][HubAuth.user_for_token]
methods, which makes a request of the Hub, and returns:
Most of the logic for authentication implementation is found in the
{meth}`.HubAuth.user_for_token` methods,
which makes a request of the Hub, and returns:
- None, if no user could be identified, or
- a dict of the following form:
@@ -227,7 +258,9 @@ methods, which makes a request of the Hub, and returns:
{
"name": "username",
"groups": ["list", "of", "groups"],
"admin": False, # or True
"scopes": [
"access:servers!server=username/",
],
}
```
@@ -237,79 +270,45 @@ action.
HubAuth also caches the Hub's response for a number of seconds,
configurable by the `cookie_cache_max_age` setting (default: five minutes).
If your service would like to make further requests _on behalf of users_,
it should use the token issued by this OAuth process.
If you are using tornado,
you can access the token authenticating the current request with {meth}`.HubAuth.get_token`.
:::{versionchanged} 2.2
{meth}`.HubAuth.get_token` adds support for retrieving
tokens stored in tornado cookies after the completion of OAuth.
Previously, it only retrieved tokens from URL parameters or the Authorization header.
Passing `get_token(handler, in_cookie=False)` preserves this behavior.
:::
### Flask Example
For example, you have a Flask service that returns information about a user.
JupyterHub's HubAuth class can be used to authenticate requests to the Flask
service. See the `service-whoami-flask` example in the
[JupyterHub GitHub repo](https://github.com/jupyterhub/jupyterhub/tree/master/examples/service-whoami-flask)
[JupyterHub GitHub repo](https://github.com/jupyterhub/jupyterhub/tree/HEAD/examples/service-whoami-flask)
for more details.
```python
from functools import wraps
import json
import os
from urllib.parse import quote
from flask import Flask, redirect, request, Response
from jupyterhub.services.auth import HubAuth
prefix = os.environ.get('JUPYTERHUB_SERVICE_PREFIX', '/')
auth = HubAuth(
api_token=os.environ['JUPYTERHUB_API_TOKEN'],
cookie_cache_max_age=60,
)
app = Flask(__name__)
def authenticated(f):
"""Decorator for authenticating with the Hub"""
@wraps(f)
def decorated(*args, **kwargs):
cookie = request.cookies.get(auth.cookie_name)
token = request.headers.get(auth.auth_header_name)
if cookie:
user = auth.user_for_cookie(cookie)
elif token:
user = auth.user_for_token(token)
else:
user = None
if user:
return f(user, *args, **kwargs)
else:
# redirect to login url on failed auth
return redirect(auth.login_url + '?next=%s' % quote(request.path))
return decorated
@app.route(prefix)
@authenticated
def whoami(user):
return Response(
json.dumps(user, indent=1, sort_keys=True),
mimetype='application/json',
)
```{literalinclude} ../../../examples/service-whoami-flask/whoami-flask.py
:language: python
```
### Authenticating tornado services with JupyterHub
Since most Jupyter services are written with tornado,
we include a mixin class, [`HubAuthenticated`][HubAuthenticated],
we include a mixin class, [`HubOAuthenticated`][huboauthenticated],
for quickly authenticating your own tornado services with JupyterHub.
Tornado's `@web.authenticated` method calls a Handler's `.get_current_user`
method to identify the user. Mixing in `HubAuthenticated` defines
`get_current_user` to use HubAuth. If you want to configure the HubAuth
instance beyond the default, you'll want to define an `initialize` method,
Tornado's {py:func}`~.tornado.web.authenticated` decorator calls a Handler's {py:meth}`~.tornado.web.RequestHandler.get_current_user`
method to identify the user. Mixing in {class}`.HubAuthenticated` defines
{meth}`~.HubAuthenticated.get_current_user` to use HubAuth. If you want to configure the HubAuth
instance beyond the default, you'll want to define an {py:meth}`~.tornado.web.RequestHandler.initialize` method,
such as:
```python
class MyHandler(HubAuthenticated, web.RequestHandler):
hub_users = {'inara', 'mal'}
class MyHandler(HubOAuthenticated, web.RequestHandler):
def initialize(self, hub_auth):
self.hub_auth = hub_auth
@@ -319,66 +318,97 @@ class MyHandler(HubAuthenticated, web.RequestHandler):
...
```
The HubAuth class will automatically load the desired configuration from the Service
[environment variables](service-env).
The HubAuth will automatically load the desired configuration from the Service
environment variables.
:::{versionchanged} 2.0
If you want to limit user access, you can whitelist users through either the
`.hub_users` attribute or `.hub_groups`. These are sets that check against the
username and user group list, respectively. If a user matches neither the user
list nor the group list, they will not be allowed access. If both are left
undefined, then any user will be allowed.
Access scopes are used to govern access to services.
Prior to 2.0,
sets of users and groups could be used to grant access
by defining `.hub_groups` or `.hub_users` on the authenticated handler.
These are ignored if the 2.0 `.hub_scopes` is defined.
:::
:::{seealso}
{meth}`.HubAuth.check_scopes`
:::
### Implementing your own Authentication with JupyterHub
If you don't want to use the reference implementation
(e.g. you find the implementation a poor fit for your Flask app),
you can implement authentication via the Hub yourself.
We recommend looking at the [`HubAuth`][HubAuth] class implementation for reference,
JupyterHub is a standard OAuth2 provider,
so you can use any OAuth 2 client implementation appropriate for your toolkit.
See the [FastAPI example][] for an example of using JupyterHub as an OAuth provider with [FastAPI][],
without using any code imported from JupyterHub.
On completion of OAuth, you will have an access token for JupyterHub,
which can be used to identify the user and the permissions (scopes)
the user has authorized for your service.
You will only get to this stage if the user has the required `access:services!service=$service-name` scope.
To retrieve the user model for the token, make a request to `GET /hub/api/user` with the token in the Authorization header.
For example, using flask:
```{literalinclude} ../../../examples/service-whoami-flask/whoami-flask.py
:language: python
```
We recommend looking at the [`HubOAuth`][huboauth] class implementation for reference,
and taking note of the following process:
1. retrieve the cookie `jupyterhub-services` from the request.
2. Make an API request `GET /hub/api/authorizations/cookie/jupyterhub-services/cookie-value`,
where cookie-value is the url-encoded value of the `jupyterhub-services` cookie.
This request must be authenticated with a Hub API token in the `Authorization` header.
For example, with [requests][]:
1. retrieve the token from the request.
2. Make an API request `GET /hub/api/user`,
with the token in the `Authorization` header.
```python
r = requests.get(
'/'.join((["http://127.0.0.1:8081/hub/api",
"authorizations/cookie/jupyterhub-services",
quote(encrypted_cookie, safe=''),
]),
headers = {
'Authorization' : 'token %s' % api_token,
},
)
r.raise_for_status()
user = r.json()
```
For example, with [requests][]:
```python
r = requests.get(
"http://127.0.0.1:8081/hub/api/user",
headers = {
'Authorization' : f'token {api_token}',
},
)
r.raise_for_status()
user = r.json()
```
3. On success, the reply will be a JSON model describing the user:
```json
```python
{
"name": "inara",
"groups": ["serenity", "guild"]
# groups may be omitted, depending on permissions
"groups": ["serenity", "guild"],
# scopes is new in JupyterHub 2.0
"scopes": [
"access:services",
"read:users:name",
"read:users!user=inara",
"..."
]
}
```
The `scopes` field can be used to manage access.
Note: a user will have access to a service to complete oauth access to the service for the first time.
Individual permissions may be revoked at any later point without revoking the token,
in which case the `scopes` field in this model should be checked on each access.
The default required scopes for access are available from `hub_auth.oauth_scopes` or `$JUPYTERHUB_OAUTH_ACCESS_SCOPES`.
An example of using an Externally-Managed Service and authentication is
in the [nbviewer README][nbviewer example] section on securing the notebook viewer,
and an example of its configuration is found [here](https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/base.py#L94).
and an example of its configuration is found [here](https://github.com/jupyter/nbviewer/blob/ed942b10a52b6259099e2dd687930871dc8aac22/nbviewer/providers/base.py#L95).
nbviewer can also be run as a Hub-Managed Service as described [nbviewer README][nbviewer example]
section on securing the notebook viewer.
[requests]: http://docs.python-requests.org/en/master/
[services_auth]: ../api/services.auth.html
[HubAuth]: ../api/services.auth.html#jupyterhub.services.auth.HubAuth
[HubAuth.user_for_cookie]: ../api/services.auth.html#jupyterhub.services.auth.HubAuth.user_for_cookie
[HubAuth.user_for_token]: ../api/services.auth.html#jupyterhub.services.auth.HubAuth.user_for_token
[HubAuthenticated]: ../api/services.auth.html#jupyterhub.services.auth.HubAuthenticated
[nbviewer example]: https://github.com/jupyter/nbviewer#securing-the-notebook-viewer
[fastapi example]: https://github.com/jupyterhub/jupyterhub/tree/HEAD/examples/service-fastapi
[fastapi]: https://fastapi.tiangolo.com
[jupyterhub_idle_culler]: https://github.com/jupyterhub/jupyterhub-idle-culler

View File

@@ -4,10 +4,9 @@ A [Spawner][] starts each single-user notebook server.
The Spawner represents an abstract interface to a process,
and a custom Spawner needs to be able to take three actions:
- start the process
- poll whether the process is still running
- stop the process
- start a process
- poll whether a process is still running
- stop a process
## Examples
@@ -15,11 +14,11 @@ Custom Spawners for JupyterHub can be found on the [JupyterHub wiki](https://git
Some examples include:
- [DockerSpawner](https://github.com/jupyterhub/dockerspawner) for spawning user servers in Docker containers
* `dockerspawner.DockerSpawner` for spawning identical Docker containers for
- `dockerspawner.DockerSpawner` for spawning identical Docker containers for
each user
* `dockerspawner.SystemUserSpawner` for spawning Docker containers with an
- `dockerspawner.SystemUserSpawner` for spawning Docker containers with an
environment and home directory for each user
* both `DockerSpawner` and `SystemUserSpawner` also work with Docker Swarm for
- both `DockerSpawner` and `SystemUserSpawner` also work with Docker Swarm for
launching containers on remote machines
- [SudoSpawner](https://github.com/jupyterhub/sudospawner) enables JupyterHub to
run without being root, by spawning an intermediate process via `sudo`
@@ -27,26 +26,25 @@ Some examples include:
servers using batch systems
- [YarnSpawner](https://github.com/jupyterhub/yarnspawner) for spawning notebook
servers in YARN containers on a Hadoop cluster
- [RemoteSpawner](https://github.com/zonca/remotespawner) to spawn notebooks
and a remote server and tunnel the port via SSH
- [SSHSpawner](https://github.com/NERSC/sshspawner) to spawn notebooks
on a remote server using SSH
- [KubeSpawner](https://github.com/jupyterhub/kubespawner) to spawn notebook servers on kubernetes cluster.
## Spawner control methods
### Spawner.start
`Spawner.start` should start the single-user server for a single user.
`Spawner.start` should start a single-user server for a single user.
Information about the user can be retrieved from `self.user`,
an object encapsulating the user's name, authentication, and server info.
The return value of `Spawner.start` should be the (ip, port) of the running server.
**NOTE:** When writing coroutines, *never* `yield` in between a database change and a commit.
The return value of `Spawner.start` should be the `(ip, port)` of the running server,
or a full URL as a string.
Most `Spawner.start` functions will look similar to this example:
```python
def start(self):
async def start(self):
self.ip = '127.0.0.1'
self.port = random_port()
# get environment variables,
@@ -58,8 +56,10 @@ def start(self):
cmd.extend(self.cmd)
cmd.extend(self.get_args())
yield self._actually_start_server_somehow(cmd, env)
return (self.ip, self.port)
await self._actually_start_server_somehow(cmd, env)
# url may not match self.ip:self.port, but it could!
url = self._get_connectable_url()
return url
```
When `Spawner.start` returns, the single-user server process should actually be running,
@@ -67,20 +67,71 @@ not just requested. JupyterHub can handle `Spawner.start` being very slow
(such as PBS-style batch queues, or instantiating whole AWS instances)
via relaxing the `Spawner.start_timeout` config value.
#### Note on IPs and ports
`Spawner.ip` and `Spawner.port` attributes set the _bind_ URL,
which the single-user server should listen on
(passed to the single-user process via the `JUPYTERHUB_SERVICE_URL` environment variable).
The _return_ value is the IP and port (or full URL) the Hub should _connect to_.
These are not necessarily the same, and usually won't be in any Spawner that works with remote resources or containers.
The default for `Spawner.ip`, and `Spawner.port` is `127.0.0.1:{random}`,
which is appropriate for Spawners that launch local processes,
where everything is on localhost and each server needs its own port.
For remote or container Spawners, it will often make sense to use a different value,
such as `ip = '0.0.0.0'` and a fixed port, e.g. `8888`.
The defaults can be changed in the class,
preserving configuration with traitlets:
```python
from traitlets import default
from jupyterhub.spawner import Spawner
class MySpawner(Spawner):
@default("ip")
def _default_ip(self):
return '0.0.0.0'
@default("port")
def _default_port(self):
return 8888
async def start(self):
env = self.get_env()
cmd = []
# get jupyterhub command to run,
# typically ['jupyterhub-singleuser']
cmd.extend(self.cmd)
cmd.extend(self.get_args())
remote_server_info = await self._actually_start_server_somehow(cmd, env)
url = self.get_public_url_from(remote_server_info)
return url
```
#### Exception handling
When `Spawner.start` raises an Exception, a message can be passed on to the user via the exception using a `.jupyterhub_html_message` or `.jupyterhub_message` attribute.
When the Exception has a `.jupyterhub_html_message` attribute, it will be rendered as HTML to the user.
Alternatively `.jupyterhub_message` is rendered as unformatted text.
If both attributes are not present, the Exception will be shown to the user as unformatted text.
### Spawner.poll
`Spawner.poll` should check if the spawner is still running.
`Spawner.poll` checks if the spawner is still running.
It should return `None` if it is still running,
and an integer exit status, otherwise.
For the local process case, `Spawner.poll` uses `os.kill(PID, 0)`
to check if the local process is still running.
In the case of local processes, `Spawner.poll` uses `os.kill(PID, 0)`
to check if the local process is still running. On Windows, it uses `psutil.pid_exists`.
### Spawner.stop
`Spawner.stop` should stop the process. It must be a tornado coroutine, which should return when the process has finished exiting.
## Spawner state
JupyterHub should be able to stop and restart without tearing down
@@ -90,7 +141,7 @@ A JSON-able dictionary of state can be used to store persisted information.
Unlike start, stop, and poll methods, the state methods must not be coroutines.
For the single-process case, the Spawner state is only the process ID of the server:
In the case of single processes, the Spawner state is only the process ID of the server:
```python
def get_state(self):
@@ -112,7 +163,6 @@ def clear_state(self):
self.pid = 0
```
## Spawner options form
(new in 0.4)
@@ -129,7 +179,7 @@ If the `Spawner.options_form` is defined, when a user tries to start their serve
If `Spawner.options_form` is undefined, the user's server is spawned directly, and no spawn page is rendered.
See [this example](https://github.com/jupyterhub/jupyterhub/blob/master/examples/spawn-form/jupyterhub_config.py) for a form that allows custom CLI args for the local spawner.
See [this example](https://github.com/jupyterhub/jupyterhub/blob/HEAD/examples/spawn-form/jupyterhub_config.py) for a form that allows custom CLI args for the local spawner.
### `Spawner.options_from_form`
@@ -170,8 +220,7 @@ which would return:
When `Spawner.start` is called, this dictionary is accessible as `self.user_options`.
[Spawner]: https://github.com/jupyterhub/jupyterhub/blob/master/jupyterhub/spawner.py
[spawner]: https://github.com/jupyterhub/jupyterhub/blob/HEAD/jupyterhub/spawner.py
## Writing a custom spawner
@@ -212,6 +261,75 @@ Additionally, configurable attributes for your spawner will
appear in jupyterhub help output and auto-generated configuration files
via `jupyterhub --generate-config`.
## Environment variables and command-line arguments
Spawners mainly do one thing: launch a command in an environment.
The command-line is constructed from user configuration:
- Spawner.cmd (default: `['jupyterhub-singleuser']`)
- Spawner.args (CLI args to pass to the cmd, default: empty)
where the configuration:
```python
c.Spawner.cmd = ["my-singleuser-wrapper"]
c.Spawner.args = ["--debug", "--flag"]
```
would result in spawning the command:
```bash
my-singleuser-wrapper --debug --flag
```
The `Spawner.get_args()` method is how `Spawner.args` is accessed,
and can be used by Spawners to customize/extend user-provided arguments.
Prior to 2.0, JupyterHub unconditionally added certain options _if specified_ to the command-line,
such as `--ip={Spawner.ip}` and `--port={Spawner.port}`.
These have now all been moved to environment variables,
and from JupyterHub 2.0,
the command-line launched by JupyterHub is fully specified by overridable configuration `Spawner.cmd + Spawner.args`.
Most process configuration is passed via environment variables.
Additional variables can be specified via the `Spawner.environment` configuration.
The process environment is returned by `Spawner.get_env`, which specifies the following environment variables:
- JUPYTERHUB*SERVICE_URL - the \_bind* URL where the server should launch its HTTP server (`http://127.0.0.1:12345`).
This includes `Spawner.ip` and `Spawner.port`; _new in 2.0, prior to 2.0 IP, port were on the command-line and only if specified_
- JUPYTERHUB_SERVICE_PREFIX - the URL prefix the service will run on (e.g. `/user/name/`)
- JUPYTERHUB_USER - the JupyterHub user's username
- JUPYTERHUB_SERVER_NAME - the server's name, if using named servers (default server has an empty name)
- JUPYTERHUB_API_URL - the full URL for the JupyterHub API (http://17.0.0.1:8001/hub/api)
- JUPYTERHUB_BASE_URL - the base URL of the whole jupyterhub deployment, i.e. the bit before `hub/` or `user/`,
as set by `c.JupyterHub.base_url` (default: `/`)
- JUPYTERHUB_API_TOKEN - the API token the server can use to make requests to the Hub.
This is also the OAuth client secret.
- JUPYTERHUB_CLIENT_ID - the OAuth client ID for authenticating visitors.
- JUPYTERHUB_OAUTH_CALLBACK_URL - the callback URL to use in OAuth, typically `/user/:name/oauth_callback`
- JUPYTERHUB_OAUTH_ACCESS_SCOPES - the scopes required to access the server (called JUPYTERHUB_OAUTH_SCOPES prior to 3.0)
- JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES - the scopes the service is allowed to request.
If no scopes are requested explicitly, these scopes will be requested.
Optional environment variables, depending on configuration:
- JUPYTERHUB*SSL*[KEYFILE|CERTFILE|CLIENT_CI] - SSL configuration, when `internal_ssl` is enabled
- JUPYTERHUB_ROOT_DIR - the root directory of the server (notebook directory), when `Spawner.notebook_dir` is defined (new in 2.0)
- JUPYTERHUB_DEFAULT_URL - the default URL for the server (for redirects from `/user/:name/`),
if `Spawner.default_url` is defined
(new in 2.0, previously passed via CLI)
- JUPYTERHUB_DEBUG=1 - generic debug flag, sets maximum log level when `Spawner.debug` is True
(new in 2.0, previously passed via CLI)
- JUPYTERHUB_DISABLE_USER_CONFIG=1 - disable loading user config,
sets maximum log level when `Spawner.debug` is True (new in 2.0,
previously passed via CLI)
- JUPYTERHUB*[MEM|CPU]*[LIMIT_GUARANTEE] - the values of CPU and memory limits and guarantees.
These are not expected to be enforced by the process,
but are made available as a hint,
e.g. for resource monitoring extensions.
## Spawners, resource limits, and guarantees (Optional)
@@ -220,14 +338,14 @@ guarantees on resources, such as CPU and memory. To provide a consistent
experience for sysadmins and users, we provide a standard way to set and
discover these resource limits and guarantees, such as for memory and CPU.
For the limits and guarantees to be useful, **the spawner must implement
support for them**. For example, LocalProcessSpawner, the default
support for them**. For example, `LocalProcessSpawner`, the default
spawner, does not support limits and guarantees. One of the spawners
that supports limits and guarantees is the
[`systemdspawner`](https://github.com/jupyterhub/systemdspawner).
### Memory Limits & Guarantees
`c.Spawner.mem_limit`: A **limit** specifies the *maximum amount of memory*
`c.Spawner.mem_limit`: A **limit** specifies the _maximum amount of memory_
that may be allocated, though there is no promise that the maximum amount will
be available. In supported spawners, you can set `c.Spawner.mem_limit` to
limit the total amount of memory that a single-user notebook server can
@@ -235,8 +353,8 @@ allocate. Attempting to use more memory than this limit will cause errors. The
single-user notebook server can discover its own memory limit by looking at
the environment variable `MEM_LIMIT`, which is specified in absolute bytes.
`c.Spawner.mem_guarantee`: Sometimes, a **guarantee** of a *minimum amount of
memory* is desirable. In this case, you can set `c.Spawner.mem_guarantee` to
`c.Spawner.mem_guarantee`: Sometimes, a **guarantee** of a _minimum amount of
memory_ is desirable. In this case, you can set `c.Spawner.mem_guarantee` to
to provide a guarantee that at minimum this much memory will always be
available for the single-user notebook server to use. The environment variable
`MEM_GUARANTEE` will also be set in the single-user notebook server.
@@ -250,7 +368,7 @@ limits or guarantees are provided, and no environment values are set.
`c.Spawner.cpu_limit`: In supported spawners, you can set
`c.Spawner.cpu_limit` to limit the total number of cpu-cores that a
single-user notebook server can use. These can be fractional - `0.5` means 50%
of one CPU core, `4.0` is 4 cpu-cores, etc. This value is also set in the
of one CPU core, `4.0` is 4 CPU-cores, etc. This value is also set in the
single-user notebook server's environment variable `CPU_LIMIT`. The limit does
not claim that you will be able to use all the CPU up to your limit as other
higher priority applications might be taking up CPU.
@@ -271,7 +389,7 @@ utilize these certs, there are two methods of interest on the base `Spawner`
class: `.create_certs` and `.move_certs`.
The first method, `.create_certs` will sign a key-cert pair using an internally
trusted authority for notebooks. During this process, `.create_certs` can
trusted authority for notebooks. During this process, `.create_certs` can
apply `ip` and `dns` name information to the cert via an `alt_names` `kwarg`.
This is used for certificate authentication (verification). Without proper
verification, the `Notebook` will be unable to communicate with the `Hub` and

View File

@@ -2,7 +2,7 @@
The **Technical Overview** section gives you a high-level view of:
- JupyterHub's Subsystems: Hub, Proxy, Single-User Notebook Server
- JupyterHub's major Subsystems: Hub, Proxy, Single-User Notebook Server
- how the subsystems interact
- the process from JupyterHub access to user login
- JupyterHub's default behavior
@@ -11,16 +11,16 @@ The **Technical Overview** section gives you a high-level view of:
The goal of this section is to share a deeper technical understanding of
JupyterHub and how it works.
## The Subsystems: Hub, Proxy, Single-User Notebook Server
## The Major Subsystems: Hub, Proxy, Single-User Notebook Server
JupyterHub is a set of processes that together provide a single user Jupyter
Notebook server for each person in a group. Three major subsystems are started
JupyterHub is a set of processes that together, provide a single-user Jupyter
Notebook server for each person in a group. Three subsystems are started
by the `jupyterhub` command line program:
- **Hub** (Python/Tornado): manages user accounts, authentication, and
coordinates Single User Notebook Servers using a Spawner.
coordinates Single User Notebook Servers using a [Spawner](./spawners.md).
- **Proxy**: the public facing part of JupyterHub that uses a dynamic proxy
- **Proxy**: the public-facing part of JupyterHub that uses a dynamic proxy
to route HTTP requests to the Hub and Single User Notebook Servers.
[configurable http proxy](https://github.com/jupyterhub/configurable-http-proxy)
(node-http-proxy) is the default proxy.
@@ -28,7 +28,7 @@ by the `jupyterhub` command line program:
- **Single-User Notebook Server** (Python/Tornado): a dedicated,
single-user, Jupyter Notebook server is started for each user on the system
when the user logs in. The object that starts the single-user notebook
servers is called a **Spawner**.
servers is called a **[Spawner](./spawners.md)**.
![JupyterHub subsystems](../images/jhub-parts.png)
@@ -41,8 +41,8 @@ The basic principles of operation are:
- The Hub spawns the proxy (in the default JupyterHub configuration)
- The proxy forwards all requests to the Hub by default
- The Hub handles login, and spawns single-user notebook servers on demand
- The Hub configures the proxy to forward url prefixes to single-user notebook
- The Hub handles login and spawns single-user notebook servers on demand
- The Hub configures the proxy to forward URL prefixes to single-user notebook
servers
The proxy is the only process that listens on a public interface. The Hub sits
@@ -50,17 +50,16 @@ behind the proxy at `/hub`. Single-user servers sit behind the proxy at
`/user/[username]`.
Different **[authenticators](./authenticators.md)** control access
to JupyterHub. The default one (PAM) uses the user accounts on the server where
to JupyterHub. The default one [(PAM)](https://en.wikipedia.org/wiki/Pluggable_authentication_module) uses the user accounts on the server where
JupyterHub is running. If you use this, you will need to create a user account
on the system for each user on your team. Using other authenticators, you can
on the system for each user on your team. However, using other authenticators you can
allow users to sign in with e.g. a GitHub account, or with any single-sign-on
system your organization has.
Next, **[spawners](./spawners.md)** control how JupyterHub starts
the individual notebook server for each user. The default spawner will
start a notebook server on the same machine running under their system username.
The other main option is to start each server in a separate container, often
using Docker.
The other main option is to start each server in a separate container, often using [Docker](https://jupyterhub-dockerspawner.readthedocs.io/en/latest/).
## The Process from JupyterHub Access to User Login
@@ -72,20 +71,20 @@ When a user accesses JupyterHub, the following events take place:
- A single-user notebook server instance is [spawned](./spawners.md) for the
logged-in user
- When the single-user notebook server starts, the proxy is notified to forward
requests to `/user/[username]/*` to the single-user notebook server.
- A cookie is set on `/hub/`, containing an encrypted token. (Prior to version
requests made to `/user/[username]/*`, to the single-user notebook server.
- A [cookie](https://en.wikipedia.org/wiki/HTTP_cookie) is set on `/hub/`, containing an encrypted token. (Prior to version
0.8, a cookie for `/user/[username]` was used too.)
- The browser is redirected to `/user/[username]`, and the request is handled by
the single-user notebook server.
The single-user server identifies the user with the Hub via OAuth:
How does the single-user server identify the user with the Hub via OAuth?
- on request, the single-user server checks a cookie
- if no cookie is set, redirect to the Hub for verification via OAuth
- after verification at the Hub, the browser is redirected back to the
- On request, the single-user server checks a cookie
- If no cookie is set, the single-user server redirects to the Hub for verification via OAuth
- After verification at the Hub, the browser is redirected back to the
single-user server
- the token is verified and stored in a cookie
- if no user is identified, the browser is redirected back to `/hub/login`
- The token is verified and stored in a cookie
- If no user is identified, the browser is redirected back to `/hub/login`
## Default Behavior
@@ -111,7 +110,7 @@ working directory:
This file needs to persist so that a **Hub** server restart will avoid
invalidating cookies. Conversely, deleting this file and restarting the server
effectively invalidates all login cookies. The cookie secret file is discussed
in the [Cookie Secret section of the Security Settings document](../getting-started/security-basics.md).
in the [Cookie Secret section of the Security Settings document](../getting-started/security-basics.rst).
The location of these files can be specified via configuration settings. It is
recommended that these files be stored in standard UNIX filesystem locations,

View File

@@ -1,8 +1,8 @@
# Working with templates and UI
The pages of the JupyterHub application are generated from
[Jinja](http://jinja.pocoo.org/) templates. These allow the header, for
example, to be defined once and incorporated into all pages. By providing
[Jinja](http://jinja.pocoo.org/) templates. These allow the header, for
example, to be defined once and incorporated into all pages. By providing
your own templates, you can have complete control over JupyterHub's
appearance.
@@ -10,7 +10,7 @@ appearance.
JupyterHub will look for custom templates in all of the paths in the
`JupyterHub.template_paths` configuration option, falling back on the
[default templates](https://github.com/jupyterhub/jupyterhub/tree/master/share/jupyterhub/templates)
[default templates](https://github.com/jupyterhub/jupyterhub/tree/HEAD/share/jupyterhub/templates)
if no custom template with that name is found. This fallback
behavior is new in version 0.9; previous versions searched only those paths
explicitly included in `template_paths`. You may override as many
@@ -20,8 +20,8 @@ or as few templates as you desire.
Jinja provides a mechanism to [extend templates](http://jinja.pocoo.org/docs/2.10/templates/#template-inheritance).
A base template can define a `block`, and child templates can replace or
supplement the material in the block. The
[JupyterHub templates](https://github.com/jupyterhub/jupyterhub/tree/master/share/jupyterhub/templates)
supplement the material in the block. The
[JupyterHub templates](https://github.com/jupyterhub/jupyterhub/tree/HEAD/share/jupyterhub/templates)
make extensive use of blocks, which allows you to customize parts of the
interface easily.
@@ -32,8 +32,8 @@ In general, a child template can extend a base template, `page.html`, by beginni
```
This works, unless you are trying to extend the default template for the same
file name. Starting in version 0.9, you may refer to the base file with a
`templates/` prefix. Thus, if you are writing a custom `page.html`, start the
file name. Starting in version 0.9, you may refer to the base file with a
`templates/` prefix. Thus, if you are writing a custom `page.html`, start the
file with this block:
```html
@@ -41,7 +41,7 @@ file with this block:
```
By defining `block`s with same name as in the base template, child templates
can replace those sections with custom content. The content from the base
can replace those sections with custom content. The content from the base
template can be included with the `{{ super() }}` directive.
### Example
@@ -52,10 +52,7 @@ text about the server starting up, place this content in a file named
`JupyterHub.template_paths` configuration option.
```html
{% extends "templates/spawn_pending.html" %}
{% block message %}
{{ super() }}
{% extends "templates/spawn_pending.html" %} {% block message %} {{ super() }}
<p>Patience is a virtue.</p>
{% endblock %}
```
@@ -69,9 +66,8 @@ To add announcements to be displayed on a page, you have two options:
### Announcement Configuration Variables
If you set the configuration variable `JupyterHub.template_vars =
{'announcement': 'some_text'}`, the given `some_text` will be placed on
the top of all pages. The more specific variables
If you set the configuration variable `JupyterHub.template_vars = {'announcement': 'some_text'}`, the given `some_text` will be placed on
the top of all pages. The more specific variables
`announcement_login`, `announcement_spawn`, `announcement_home`, and
`announcement_logout` are more specific and only show on their
respective pages (overriding the global `announcement` variable).
@@ -79,15 +75,14 @@ Note that changing these variables require a restart, unlike direct
template extension.
You can get the same effect by extending templates, which allows you
to update the messages without restarting. Set
to update the messages without restarting. Set
`c.JupyterHub.template_paths` as mentioned above, and then create a
template (for example, `login.html`) with:
```html
{% extends "templates/login.html" %}
{% set announcement = 'some message' %}
{% extends "templates/login.html" %} {% set announcement = 'some message' %}
```
Extending `page.html` puts the message on all pages, but note that
extending `page.html` take precedence over an extension of a specific
extending `page.html` takes precedence over an extension of a specific
page (unlike the variable-based approach above).

View File

@@ -2,17 +2,15 @@
This document describes how JupyterHub routes requests.
This does not include the [REST API](./rest.md) urls.
This does not include the [REST API](./rest.md) URLs.
In general, all URLs can be prefixed with `c.JupyterHub.base_url` to
run the whole JupyterHub application on a prefix.
All authenticated handlers redirect to `/hub/login` to login users
prior to being redirected back to the originating page.
All authenticated handlers redirect to `/hub/login` to log-in users
before being redirected back to the originating page.
The returned request should preserve all query parameters.
## `/`
The top-level request is always a simple redirect to `/hub/`,
@@ -27,12 +25,12 @@ This is an authenticated URL.
This handler redirects users to the default URL of the application,
which defaults to the user's default server.
That is, it redirects to `/hub/spawn` if the user's server is not running,
or the server itself (`/user/:name`) if the server is running.
That is, the handler redirects to `/hub/spawn` if the user's server is not running,
or to the server itself (`/user/:name`) if the server is running.
This default url behavior can be customized in two ways:
This default URL behavior can be customized in two ways:
To redirect users to the JupyterHub home page (`/hub/home`)
First, to redirect users to the JupyterHub home page (`/hub/home`)
instead of spawning their server,
set `redirect_to_server` to False:
@@ -42,7 +40,7 @@ c.JupyterHub.redirect_to_server = False
This might be useful if you have a Hub where you expect
users to be managing multiple server configurations
and automatic spawning is not desirable.
but automatic spawning is not desirable.
Second, you can customise the landing page to any page you like,
such as a custom service you have deployed e.g. with course information:
@@ -59,42 +57,42 @@ By default, the Hub home page has just one or two buttons
for starting and stopping the user's server.
If named servers are enabled, there will be some additional
tools for management of named servers.
tools for management of the named servers.
*Version added: 1.0* named server UI is new in 1.0.
_Version added: 1.0_ named server UI is new in 1.0.
## `/hub/login`
This is the JupyterHub login page.
If you have a form-based username+password login,
such as the default PAMAuthenticator,
such as the default [PAMAuthenticator](https://en.wikipedia.org/wiki/Pluggable_authentication_module),
this page will render the login form.
![A login form](../images/login-form.png)
If login is handled by an external service,
e.g. with OAuth, this page will have a button,
declaring "Login with ..." which users can click
to login with the chosen service.
declaring "Log in with ..." which users can click
to log in with the chosen service.
![A login redirect button](../images/login-button.png)
If you want to skip the user-interaction to initiate logging in
via the button, you can set
If you want to skip the user interaction and initiate login
via the button, you can set:
```python
c.Authenticator.auto_login = True
```
This can be useful when the user is "already logged in" via some mechanism,
but a handshake via redirects is necessary to complete the authentication with JupyterHub.
This can be useful when the user is "already logged in" via some mechanism.
However, a handshake via `redirects` is necessary to complete the authentication with JupyterHub.
## `/hub/logout`
Visiting `/hub/logout` clears cookies from the current browser.
Visiting `/hub/logout` clears [cookies](https://en.wikipedia.org/wiki/HTTP_cookie) from the current browser.
Note that **logging out does not stop a user's server(s)** by default.
If you would like to shutdown user servers on logout,
If you would like to shut down user servers on logout,
you can enable this behavior with:
```python
@@ -107,11 +105,11 @@ does not mean the user is no longer actively using their server from another mac
## `/user/:username[/:servername]`
If a user's server is running, this URL is handled by the user's given server,
not the Hub.
The username is the first part and, if using named servers,
not by the Hub.
The username is the first part, and if using named servers,
the server name is the second part.
If the user's server is *not* running, this will be redirected to `/hub/user/:username/...`
If the user's server is _not_ running, this will be redirected to `/hub/user/:username/...`
## `/hub/user/:username[/:servername]`
@@ -120,8 +118,14 @@ This URL indicates a request for a user server that is not running
if the specified server were running).
Handling this URL depends on two conditions: whether a requested user is found
as a match and the state of the requested user's notebook server.
as a match and the state of the requested user's notebook server,
for example:
1. the server is not active
a. user matches
b. user doesn't match
2. the server is ready
3. the server is pending, but not ready
If the server is pending spawn,
the browser will be redirected to `/hub/spawn-pending/:username/:servername`
@@ -137,39 +141,37 @@ Some checks are performed and a delay is added before redirecting back to `/user
If something is really wrong, this can result in a redirect loop.
Visiting this page will never result in triggering the spawn of servers
without additional user action (i.e. clicking the link on the page)
without additional user action (i.e. clicking the link on the page).
![Visiting a URL for a server that's not running](../images/not-running.png)
*Version changed: 1.0*
_Version changed: 1.0_
Prior to 1.0, this URL itself was responsible for spawning servers,
and served the progress page if it was pending,
redirected to running servers, and
This was useful because it made sure that requested servers were restarted after they stopped,
but could also be harmful because unused servers would continuously be restarted if e.g.
an idle JupyterLab frontend were open pointed at it,
which constantly makes polling requests.
Prior to 1.0, this URL itself was responsible for spawning servers.
If the progress page was pending, the URL redirected it to running servers.
This was useful because it made sure that the requested servers were restarted after they stopped.
However, it could also be harmful because unused servers would continuously be restarted if e.g.
an idle JupyterLab frontend that constantly makes polling requests was openly pointed at it.
### Special handling of API requests
Requests to `/user/:username[/:servername]/api/...` are assumed to be
from applications connected to stopped servers.
These are failed with 503 and an informative JSON error message
indicating how to spawn the server.
This is meant to help applications such as JupyterLab
These requests fail with a `503` status code and an informative JSON error message
that indicates how to spawn the server.
This is meant to help applications such as JupyterLab,
that are connected to a server that has stopped.
*Version changed: 1.0*
_Version changed: 1.0_
JupyterHub 0.9 failed these API requests with status 404,
but 1.0 uses 503.
JupyterHub version 0.9 failed these API requests with status `404`,
but version 1.0 uses 503.
## `/user-redirect/...`
This URL is for sharing a URL that will redirect a user
The `/user-redirect/...` URL is for sharing a URL that will redirect a user
to a path on their own default server.
This is useful when users have the same file at the same URL on their servers,
This is useful when different users have the same file at the same URL on their servers,
and you want a single link to give to any user that will open that file on their server.
e.g. a link to `/user-redirect/notebooks/Index.ipynb`
@@ -191,7 +193,7 @@ that is intended to make it possible.
### `/hub/spawn[/:username[/:servername]]`
Requesting `/hub/spawn` will spawn the default server for the current user.
If `username` and optionally `servername` are specified,
If the `username` and optionally `servername` are specified,
then the specified server for the specified user will be spawned.
Once spawn has been requested,
the browser is redirected to `/hub/spawn-pending/...`.
@@ -202,12 +204,12 @@ and a POST request will trigger the actual spawn and redirect.
![The spawn form](../images/spawn-form.png)
*Version added: 1.0*
_Version added: 1.0_
1.0 adds the ability to specify username and servername.
1.0 adds the ability to specify `username` and `servername`.
Prior to 1.0, only `/hub/spawn` was recognized for the default server.
*Version changed: 1.0*
_Version changed: 1.0_
Prior to 1.0, this page redirected back to `/hub/user/:username`,
which was responsible for triggering spawn and rendering progress, etc.
@@ -216,7 +218,7 @@ which was responsible for triggering spawn and rendering progress, etc.
![The spawn pending page](../images/spawn-pending.png)
*Version added: 1.0* this URL is new in JupyterHub 1.0.
_Version added: 1.0_ this URL is new in JupyterHub 1.0.
This page renders the progress view for the given spawn request.
Once the server is ready,
@@ -244,7 +246,7 @@ against the [JupyterHub REST API](./rest.md).
Administrators can take various administrative actions from this page:
1. add/remove users
2. grant admin privileges
3. start/stop user servers
4. shutdown JupyterHub itself
- add/remove users
- grant admin privileges
- start/stop user servers
- shutdown JupyterHub itself

View File

@@ -5,24 +5,24 @@ The **Security Overview** section helps you learn about:
- the design of JupyterHub with respect to web security
- the semi-trusted user
- the available mitigations to protect untrusted users from each other
- the value of periodic security audits.
- the value of periodic security audits
This overview also helps you obtain a deeper understanding of how JupyterHub
works.
## Semi-trusted and untrusted users
JupyterHub is designed to be a *simple multi-user server for modestly sized
groups* of **semi-trusted** users. While the design reflects serving semi-trusted
JupyterHub is designed to be a _simple multi-user server for modestly sized
groups_ of **semi-trusted** users. While the design reflects serving semi-trusted
users, JupyterHub is not necessarily unsuitable for serving **untrusted** users.
Using JupyterHub with **untrusted** users does mean more work by the
Using JupyterHub with **untrusted** users does mean more work for the
administrator. Much care is required to secure a Hub, with extra caution on
protecting users from each other as the Hub is serving untrusted users.
protecting users from each other, since the Hub serves untrusted users.
One aspect of JupyterHub's *design simplicity* for **semi-trusted** users is that
the Hub and single-user servers are placed in a *single domain*, behind a
[*proxy*][configurable-http-proxy]. If the Hub is serving untrusted
One aspect of JupyterHub's _design simplicity_ for **semi-trusted** users is that
the Hub and single-user servers are placed in a _single domain_, behind a
[_proxy_][configurable-http-proxy]. If the Hub is serving untrusted
users, many of the web's cross-site protections are not applied between
single-user servers and the Hub, or between single-user servers and each
other, since browsers see the whole thing (proxy, Hub, and single user
@@ -32,7 +32,7 @@ servers) as a single website (i.e. single domain).
To protect users from each other, a user must **never** be able to write arbitrary
HTML and serve it to another user on the Hub's domain. JupyterHub's
authentication setup prevents a user writing arbitrary HTML and serving it to
authentication setup prevents a user from writing arbitrary HTML and serving it to
another user because only the owner of a given single-user notebook server is
allowed to view user-authored pages served by the given single-user notebook
server.
@@ -40,25 +40,25 @@ server.
To protect all users from each other, JupyterHub administrators must
ensure that:
* A user **does not have permission** to modify their single-user notebook server,
- A user **does not have permission** to modify their single-user notebook server,
including:
- A user **may not** install new packages in the Python environment that runs
their single-user server.
- If the `PATH` is used to resolve the single-user executable (instead of
using an absolute path), a user **may not** create new files in any `PATH`
directory that precedes the directory containing `jupyterhub-singleuser`.
- A user may not modify environment variables (e.g. PATH, PYTHONPATH) for
- A user may not modify environment variables (e.g. `PATH`, `PYTHONPATH`) for
their single-user server.
* A user **may not** modify the configuration of the notebook server
- A user **may not** modify the configuration of the notebook server
(the `~/.jupyter` or `JUPYTER_CONFIG_DIR` directory).
If any additional services are run on the same domain as the Hub, the services
**must never** display user-authored HTML that is neither *sanitized* nor *sandboxed*
**must never** display user-authored HTML that is neither _sanitized_ nor _sandboxed_
(e.g. IFramed) to any user that lacks authentication as the author of a file.
## Mitigate security issues
Several approaches to mitigating these issues with configuration
The several approaches to mitigating security issues with configuration
options provided by JupyterHub include:
### Enable subdomains
@@ -76,16 +76,16 @@ resolves the cross-site issues.
### Disable user config
If subdomains are not available or not desirable, JupyterHub provides a
If subdomains are unavailable or undesirable, JupyterHub provides a
configuration option `Spawner.disable_user_config`, which can be set to prevent
the user-owned configuration files from being loaded. After implementing this
option, PATHs and package installation and PATHs are the other things that the
option, `PATH`s and package installation are the other things that the
admin must enforce.
### Prevent spawners from evaluating shell configuration files
For most Spawners, `PATH` is not something users can influence, but care should
be taken to ensure that the Spawner does *not* evaluate shell configuration
be taken to ensure that the Spawner does _not_ evaluate shell configuration
files prior to launching the server.
### Isolate packages using virtualenv
@@ -101,8 +101,8 @@ pose additional risk to the web application's security.
### Encrypt internal connections with SSL/TLS
By default, all communication on the server, between the proxy, hub, and single
-user notebooks is performed unencrypted. Setting the `internal_ssl` flag in
By default, all communications on the server, between the proxy, hub, and single
-user notebooks are performed unencrypted. Setting the `internal_ssl` flag in
`jupyterhub_config.py` secures the aforementioned routes. Turning this
feature on does require that the enabled `Spawner` can use the certificates
generated by the `Hub` (the default `LocalProcessSpawner` can, for instance).
@@ -119,19 +119,18 @@ extend to securing the `tcp` sockets as well.
## Security audits
We recommend that you do periodic reviews of your deployment's security. It's
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
versions up to date.
good practice to keep [JupyterHub](https://readthedocs.org/projects/jupyterhub/), [configurable-http-proxy][], and [nodejs
versions](https://github.com/nodejs/Release) up to date.
A handy website for testing your deployment is
[Qualsys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).
[configurable-http-proxy]: https://github.com/jupyterhub/configurable-http-proxy
## Vulnerability reporting
If you believe youve found a security vulnerability in JupyterHub, or any
If you believe you have found a security vulnerability in JupyterHub, or any
Jupyter project, please report it to
[security@ipython.org](mailto:security@iypthon.org). If you prefer to encrypt
[security@ipython.org](mailto:security@ipython.org). If you prefer to encrypt
your security reports, you can use [this PGP public
key](https://jupyter-notebook.readthedocs.io/en/stable/_downloads/ipython_security.asc).