mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-17 23:13:00 +00:00
Restructured references section of the docs
This commit is contained in:
132
docs/source/reference/technical-reference/technical-overview.md
Normal file
132
docs/source/reference/technical-reference/technical-overview.md
Normal file
@@ -0,0 +1,132 @@
|
||||
# Technical Overview
|
||||
|
||||
The **Technical Overview** section gives you a high-level view of:
|
||||
|
||||
- JupyterHub's major Subsystems: Hub, Proxy, Single-User Notebook Server
|
||||
- how the subsystems interact
|
||||
- the process from JupyterHub access to user login
|
||||
- JupyterHub's default behavior
|
||||
- customizing JupyterHub
|
||||
|
||||
The goal of this section is to share a deeper technical understanding of
|
||||
JupyterHub and how it works.
|
||||
|
||||
## The Major Subsystems: Hub, Proxy, Single-User Notebook Server
|
||||
|
||||
JupyterHub is a set of processes that together, provide a single-user Jupyter
|
||||
Notebook server for each person in a group. Three subsystems are started
|
||||
by the `jupyterhub` command line program:
|
||||
|
||||
- **Hub** (Python/Tornado): manages user accounts, authentication, and
|
||||
coordinates Single User Notebook Servers using a [Spawner](spawners-reference).
|
||||
|
||||
- **Proxy**: the public-facing part of JupyterHub that uses a dynamic proxy
|
||||
to route HTTP requests to the Hub and Single User Notebook Servers.
|
||||
[configurable http proxy](https://github.com/jupyterhub/configurable-http-proxy)
|
||||
(node-http-proxy) is the default proxy.
|
||||
|
||||
- **Single-User Notebook Server** (Python/Tornado): a dedicated,
|
||||
single-user, Jupyter Notebook server is started for each user on the system
|
||||
when the user logs in. The object that starts the single-user notebook
|
||||
servers is called a **[Spawner](spawners-reference)**.
|
||||
|
||||

|
||||
|
||||
## How the Subsystems Interact
|
||||
|
||||
Users access JupyterHub through a web browser, by going to the IP address or
|
||||
the domain name of the server.
|
||||
|
||||
The basic principles of operation are:
|
||||
|
||||
- The Hub spawns the proxy (in the default JupyterHub configuration)
|
||||
- The proxy forwards all requests to the Hub by default
|
||||
- The Hub handles login and spawns single-user notebook servers on demand
|
||||
- The Hub configures the proxy to forward URL prefixes to single-user notebook
|
||||
servers
|
||||
|
||||
The proxy is the only process that listens on a public interface. The Hub sits
|
||||
behind the proxy at `/hub`. Single-user servers sit behind the proxy at
|
||||
`/user/[username]`.
|
||||
|
||||
Different **[authenticators](authenticators-reference)** control access
|
||||
to JupyterHub. The default one [(PAM)](https://en.wikipedia.org/wiki/Pluggable_authentication_module) uses the user accounts on the server where
|
||||
JupyterHub is running. If you use this, you will need to create a user account
|
||||
on the system for each user on your team. However, using other authenticators you can
|
||||
allow users to sign in with e.g. a GitHub account, or with any single-sign-on
|
||||
system your organization has.
|
||||
|
||||
Next, **[spawners](spawners-reference)** control how JupyterHub starts
|
||||
the individual notebook server for each user. The default spawner will
|
||||
start a notebook server on the same machine running under their system username.
|
||||
The other main option is to start each server in a separate container, often using [Docker](https://jupyterhub-dockerspawner.readthedocs.io/en/latest/).
|
||||
|
||||
## The Process from JupyterHub Access to User Login
|
||||
|
||||
When a user accesses JupyterHub, the following events take place:
|
||||
|
||||
- Login data is handed to the [Authenticator](authenticators-reference) instance for
|
||||
validation
|
||||
- The Authenticator returns the username if the login information is valid
|
||||
- A single-user notebook server instance is [spawned](spawners-reference) for the
|
||||
logged-in user
|
||||
- When the single-user notebook server starts, the proxy is notified to forward
|
||||
requests made to `/user/[username]/*`, to the single-user notebook server.
|
||||
- A [cookie](https://en.wikipedia.org/wiki/HTTP_cookie) is set on `/hub/`, containing an encrypted token. (Prior to version
|
||||
0.8, a cookie for `/user/[username]` was used too.)
|
||||
- The browser is redirected to `/user/[username]`, and the request is handled by
|
||||
the single-user notebook server.
|
||||
|
||||
How does the single-user server identify the user with the Hub via OAuth?
|
||||
|
||||
- On request, the single-user server checks a cookie
|
||||
- If no cookie is set, the single-user server redirects to the Hub for verification via OAuth
|
||||
- After verification at the Hub, the browser is redirected back to the
|
||||
single-user server
|
||||
- The token is verified and stored in a cookie
|
||||
- If no user is identified, the browser is redirected back to `/hub/login`
|
||||
|
||||
## Default Behavior
|
||||
|
||||
By default, the **Proxy** listens on all public interfaces on port 8000.
|
||||
Thus you can reach JupyterHub through either:
|
||||
|
||||
- `http://localhost:8000`
|
||||
- or any other public IP or domain pointing to your system.
|
||||
|
||||
In their default configuration, the other services, the **Hub** and
|
||||
**Single-User Notebook Servers**, all communicate with each other on localhost
|
||||
only.
|
||||
|
||||
By default, starting JupyterHub will write two files to disk in the current
|
||||
working directory:
|
||||
|
||||
- `jupyterhub.sqlite` is the SQLite database containing all of the state of the
|
||||
**Hub**. This file allows the **Hub** to remember which users are running and
|
||||
where, as well as storing other information enabling you to restart parts of
|
||||
JupyterHub separately. It is important to note that this database contains
|
||||
**no** sensitive information other than **Hub** usernames.
|
||||
- `jupyterhub_cookie_secret` is the encryption key used for securing cookies.
|
||||
This file needs to persist so that a **Hub** server restart will avoid
|
||||
invalidating cookies. Conversely, deleting this file and restarting the server
|
||||
effectively invalidates all login cookies. The cookie secret file is discussed
|
||||
in the [Cookie Secret section of the Security Settings document](security-basics).
|
||||
|
||||
The location of these files can be specified via configuration settings. It is
|
||||
recommended that these files be stored in standard UNIX filesystem locations,
|
||||
such as `/etc/jupyterhub` for all configuration files and `/srv/jupyterhub` for
|
||||
all security and runtime files.
|
||||
|
||||
## Customizing JupyterHub
|
||||
|
||||
There are two basic extension points for JupyterHub:
|
||||
|
||||
- How users are authenticated by [Authenticators](authenticators-reference)
|
||||
- How user's single-user notebook server processes are started by
|
||||
[Spawners](spawners-reference)
|
||||
|
||||
Each is governed by a customizable class, and JupyterHub ships with basic
|
||||
defaults for each.
|
||||
|
||||
To enable custom authentication and/or spawning, subclass `Authenticator` or
|
||||
`Spawner`, and override the relevant methods.
|
Reference in New Issue
Block a user