mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-18 15:33:02 +00:00
Break up large getting started doc
75
docs/source/authenticators-users-basics.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# Authentication and Users
|
||||
|
||||
The default Authenticator uses [PAM][] to authenticate system users with
|
||||
their username and password. The default behavior of this Authenticator
|
||||
is to allow any user with an account and password on the system to log in.
|
||||
|
||||
## Creating a whitelist of users
|
||||
|
||||
You can restrict which users are allowed to log in with `Authenticator.whitelist`:
|
||||
|
||||
|
||||
```python
|
||||
c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'}
|
||||
```
|
||||
|
||||
Users listed in the whitelist are added to the Hub database when the Hub is
|
||||
started.
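
The effect of the whitelist can be pictured with a small model. This is a simplified sketch of the behavior described above, not JupyterHub's own code; the function name `is_allowed` is hypothetical, and it assumes that an empty whitelist means "no restriction".

```python
def is_allowed(username, whitelist):
    """Return True if `username` may log in under the given whitelist.

    Assumption: an empty whitelist means no restriction, so any user who
    authenticates successfully is allowed.
    """
    if not whitelist:
        return True
    return username in whitelist


whitelist = {'mal', 'zoe', 'inara', 'kaylee'}
print(is_allowed('mal', whitelist))    # True
print(is_allowed('jayne', whitelist))  # False
print(is_allowed('jayne', set()))      # True: no whitelist configured
```

The check only decides who may log in; the user must still authenticate with the configured Authenticator.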
|
||||
|
||||
## Managing Hub administrators
|
||||
|
||||
### Configuring admins (`admin_users`)
|
||||
|
||||
Admin users of JupyterHub, `admin_users`, have the ability to add and remove
|
||||
users from the user `whitelist` or to take actions on the users' behalf,
|
||||
such as stopping and restarting their servers.
|
||||
|
||||
A set of initial admin users, `admin_users`, can be configured as follows:
|
||||
|
||||
```python
|
||||
c.Authenticator.admin_users = {'mal', 'zoe'}
|
||||
```
|
||||
Users in the admin list are automatically added to the user `whitelist`,
|
||||
if they are not already present.
|
||||
|
||||
### Admin access to other users' notebook servers (`admin_access`)
|
||||
|
||||
By default the admin users do not have permission to log in *as other users*
|
||||
since the default `JupyterHub.admin_access` setting is False.
|
||||
If `JupyterHub.admin_access` is set to True, then admin users have permission
|
||||
to log in *as other users* on their respective machines, for debugging.
|
||||
**You should make sure your users know if admin_access is enabled.**
|
||||
|
||||
Note: additional configuration examples are provided in this guide's
|
||||
[Configuration Examples section](./config-examples.html).
|
||||
|
||||
### Add or remove users from the Hub
|
||||
|
||||
Users can be added to and removed from the Hub via either the admin panel or
|
||||
REST API.
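
As a rough illustration of the REST API route, the sketch below builds the requests that add or remove a user. It assumes the Hub API is reached through the proxy at `http://127.0.0.1:8000/hub/api` and that the token belongs to an admin user; the URL, the token value, and the helper name `build_user_request` are placeholders for illustration.

```python
import urllib.request

# Assumptions: the Hub API is reachable at this URL and API_TOKEN belongs
# to an admin user. Both values are placeholders.
HUB_API_URL = 'http://127.0.0.1:8000/hub/api'
API_TOKEN = 'replace-with-an-admin-api-token'


def build_user_request(username, method):
    """Build (but do not send) a request that adds or removes a Hub user.

    `method` is 'POST' to add the user or 'DELETE' to remove it.
    """
    if method not in ('POST', 'DELETE'):
        raise ValueError('method must be POST or DELETE')
    return urllib.request.Request(
        url='{}/users/{}'.format(HUB_API_URL, username),
        method=method,
        headers={'Authorization': 'token {}'.format(API_TOKEN)},
    )


add_request = build_user_request('kaylee', 'POST')
remove_request = build_user_request('kaylee', 'DELETE')
print(add_request.get_method(), add_request.full_url)
```

Passing one of these requests to `urllib.request.urlopen` would send it to a running Hub; the sketch stops short of that so it can be read and run without a deployment.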
|
||||
|
||||
If a user is **added**, the user will be automatically added to the whitelist
|
||||
and database. Restarting the Hub will not require manually updating the
|
||||
whitelist in your config file, as the users will be loaded from the database.
|
||||
|
||||
After starting the Hub once, it is not sufficient to **remove** a user from
|
||||
the whitelist in your config file. You must also remove the user from the Hub's
|
||||
database, either by deleting the user from the admin page, or you can clear
|
||||
the `jupyterhub.sqlite` database and start fresh.
|
||||
|
||||
The default `PAMAuthenticator` is one case of a special kind of authenticator, called a
|
||||
`LocalAuthenticator`, indicating that it manages users on the local system. When you add a user to
|
||||
the Hub, a `LocalAuthenticator` checks if that user already exists. Normally, there will be an
|
||||
error telling you that the user doesn't exist. If you set the configuration value
|
||||
|
||||
```python
|
||||
c.LocalAuthenticator.create_system_users = True
|
||||
```
|
||||
|
||||
however, adding a user to the Hub that doesn't already exist on the system will result in the Hub
|
||||
creating that user via the system `adduser` command line tool. This option is typically used on
|
||||
hosted deployments of JupyterHub, to avoid the need to manually create all your users before
|
||||
launching the service. It is not recommended when running JupyterHub in situations where
|
||||
JupyterHub users map directly onto UNIX users.
|
||||
|
||||
[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
|
@@ -1,470 +0,0 @@
|
||||
# Getting started with JupyterHub
|
||||
|
||||
This section contains getting started information on the following topics:
|
||||
|
||||
- [Technical Overview](#technical-overview)
|
||||
- [Installation](#installation)
|
||||
- [Configuration](#configuration)
|
||||
- [Networking](#networking)
|
||||
- [Security](#security)
|
||||
- [Authentication and users](#authentication-and-users)
|
||||
- [Spawners and single-user notebook servers](#spawners-and-single-user-notebook-servers)
|
||||
- [External Services](#external-services)
|
||||
|
||||
|
||||
## Technical Overview
|
||||
|
||||
JupyterHub is a set of processes that together provide a single user Jupyter
|
||||
Notebook server for each person in a group.
|
||||
|
||||
### Three subsystems
|
||||
Three major subsystems are run by the `jupyterhub` command line program:
|
||||
|
||||
- **Single-User Notebook Server**: a dedicated, single-user, Jupyter Notebook server is
|
||||
started for each user on the system when the user logs in. The object that
|
||||
starts these servers is called a **Spawner**.
|
||||
- **Proxy**: the public facing part of JupyterHub that uses a dynamic proxy
|
||||
to route HTTP requests to the Hub and Single User Notebook Servers.
|
||||
- **Hub**: manages user accounts, authentication, and coordinates Single User
|
||||
Notebook Servers using a Spawner.
|
||||
|
||||

|
||||
|
||||
### Deployment server
|
||||
|
||||
To use JupyterHub, you need a Unix server (typically Linux) running somewhere
|
||||
that is accessible to your team on the network. The JupyterHub server can be
|
||||
on an internal network at your organization, or it can run on the public
|
||||
internet (in which case, take care with the Hub's
|
||||
[security](#security)).
|
||||
|
||||
### Basic operation
|
||||
Users access JupyterHub through a web browser, by going to the IP address or
|
||||
the domain name of the server.
|
||||
|
||||
Basic principles of operation:
|
||||
|
||||
* Hub spawns proxy
|
||||
* Proxy forwards all requests to hub by default
|
||||
* Hub handles login, and spawns single-user servers on demand
|
||||
* Hub configures proxy to forward url prefixes to single-user servers
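
The routing principle in the list above can be sketched as a longest-prefix lookup. This is only a model of the idea, not the proxy's implementation; the function name `pick_target` and the target addresses are hypothetical.

```python
def pick_target(path, routes):
    """Return the target of the longest route prefix that matches `path`.

    Assumes `routes` always contains the default '/' entry.
    """
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    return routes[max(matches, key=len)]


# Default route: every request goes to the Hub.
routes = {'/': 'http://127.0.0.1:8081'}

# After spawning a server for user 'mal', the Hub registers a prefix for it.
routes['/user/mal/'] = 'http://127.0.0.1:49152'

print(pick_target('/hub/login', routes))      # Hub
print(pick_target('/user/mal/tree', routes))  # mal's single-user server
print(pick_target('/user/zoe/tree', routes))  # Hub, since zoe has no route yet
```

Requests for a user without a running server fall through to the default route, which is what lets the Hub handle login and spawn the server on demand.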
|
||||
|
||||
Different **[authenticators](authenticators.html)** control access
|
||||
to JupyterHub. The default one (PAM) uses the user accounts on the server where
|
||||
JupyterHub is running. If you use this, you will need to create a user account
|
||||
on the system for each user on your team. Using other authenticators, you can
|
||||
allow users to sign in with e.g. a GitHub account, or with any single-sign-on
|
||||
system your organization has.
|
||||
|
||||
Next, **[spawners](spawners.html)** control how JupyterHub starts
|
||||
the individual notebook server for each user. The default spawner will
|
||||
start a notebook server on the same machine running under their system username.
|
||||
The other main option is to start each server in a separate container, often
|
||||
using Docker.
|
||||
|
||||
### Default behavior
|
||||
|
||||
**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.**
|
||||
|
||||
See [Security documentation](#security) for how to configure JupyterHub to use SSL,
|
||||
or put it behind SSL termination in another proxy server, such as nginx.
|
||||
|
||||
---
|
||||
|
||||
**Deprecation note:** Removed `--no-ssl` in version 0.7.
|
||||
|
||||
JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to
|
||||
allow running without SSL using the command `jupyterhub --no-ssl`. The
|
||||
`--no-ssl` command line option is not needed anymore in version 0.7.
|
||||
|
||||
---
|
||||
|
||||
To start JupyterHub in its default configuration, type the following at the command line:
|
||||
|
||||
```bash
|
||||
sudo jupyterhub
|
||||
```
|
||||
|
||||
The default Authenticator that ships with JupyterHub authenticates users
|
||||
with their system name and password (via [PAM][]).
|
||||
Any user on the system with a password will be allowed to start a single-user notebook server.
|
||||
|
||||
The default Spawner starts servers locally as each user, one dedicated server per user.
|
||||
These servers listen on localhost, and start in the given user's home directory.
|
||||
|
||||
By default, the **Proxy** listens on all public interfaces on port 8000.
|
||||
Thus you can reach JupyterHub through either:
|
||||
|
||||
- `http://localhost:8000`
|
||||
- or any other public IP or domain pointing to your system.
|
||||
|
||||
In their default configuration, the other services, the **Hub** and **Single-User Servers**,
|
||||
all communicate with each other on localhost only.
|
||||
|
||||
By default, starting JupyterHub will write two files to disk in the current working directory:
|
||||
|
||||
- `jupyterhub.sqlite` is the sqlite database containing all of the state of the **Hub**.
|
||||
This file allows the **Hub** to remember what users are running and where,
|
||||
as well as other information enabling you to restart parts of JupyterHub separately. It is
|
||||
important to note that this database contains *no* sensitive information other than **Hub**
|
||||
usernames.
|
||||
- `jupyterhub_cookie_secret` is the encryption key used for securing cookies.
|
||||
This file needs to persist in order for restarting the Hub server to avoid invalidating cookies.
|
||||
Conversely, deleting this file and restarting the server effectively invalidates all login cookies.
|
||||
The cookie secret file is discussed in the [Cookie Secret documentation](#cookie-secret).
|
||||
|
||||
The location of these files can be specified via configuration, discussed below.
|
||||
|
||||
|
||||
|
||||
## Networking
|
||||
|
||||
### Configuring the Proxy's IP address and port
|
||||
The Proxy's main IP address setting determines where JupyterHub is available to users.
|
||||
By default, JupyterHub is configured to be available on all network interfaces
|
||||
(`''`) on port 8000. **Note**: Use of `'*'` is discouraged for IP configuration;
|
||||
instead, use of `'0.0.0.0'` is preferred.
|
||||
|
||||
Changing the IP address and port can be done with the following command line
|
||||
arguments:
|
||||
|
||||
```bash
|
||||
jupyterhub --ip=192.168.1.2 --port=443
|
||||
```
|
||||
|
||||
Or by placing the following lines in a configuration file:
|
||||
|
||||
```python
|
||||
c.JupyterHub.ip = '192.168.1.2'
|
||||
c.JupyterHub.port = 443
|
||||
```
|
||||
|
||||
Port 443 is used as an example since 443 is the default port for SSL/HTTPS.
|
||||
|
||||
Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub.
|
||||
However, more customized scenarios may need additional networking details to
|
||||
be configured.
|
||||
|
||||
|
||||
### Configuring the Proxy's REST API communication IP address and port (optional)
|
||||
The Hub service talks to the proxy via a REST API on a secondary port,
|
||||
whose network interface and port can be configured separately.
|
||||
By default, this REST API listens on port 8001 of localhost only.
|
||||
|
||||
If running the Proxy separate from the Hub,
|
||||
configure the REST API communication IP address and port with:
|
||||
|
||||
```python
|
||||
# ideally a private network address
|
||||
c.JupyterHub.proxy_api_ip = '10.0.1.4'
|
||||
c.JupyterHub.proxy_api_port = 5432
|
||||
```
|
||||
|
||||
### Configuring the Hub if Spawners or Proxy are remote or isolated in containers
|
||||
The Hub service also listens only on localhost (port 8081) by default.
|
||||
The Hub needs to be accessible from both the proxy and all Spawners.
|
||||
When spawning local servers, an IP address setting of localhost is fine.
|
||||
If *either* the Proxy *or* (more likely) the Spawners will be remote or
|
||||
isolated in containers, the Hub must listen on an IP that is accessible.
|
||||
|
||||
```python
|
||||
c.JupyterHub.hub_ip = '10.0.1.4'
|
||||
c.JupyterHub.hub_port = 54321
|
||||
```
|
||||
|
||||
## Security
|
||||
|
||||
**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.**
|
||||
|
||||
---
|
||||
|
||||
**Deprecation note:** Removed `--no-ssl` in version 0.7.
|
||||
|
||||
JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to
|
||||
allow running without SSL using the command `jupyterhub --no-ssl`. The
|
||||
`--no-ssl` command line option is not needed anymore in version 0.7.
|
||||
|
||||
---
|
||||
|
||||
Security is the most important aspect of configuring Jupyter. There are four main aspects of the
|
||||
security configuration:
|
||||
|
||||
1. SSL encryption (to enable HTTPS)
|
||||
2. Cookie secret (a key for encrypting browser cookies)
|
||||
3. Proxy authentication token (used for the Hub and other services to authenticate to the Proxy)
|
||||
4. Periodic security audits
|
||||
|
||||
*Note* that the **Hub** hashes all secrets (e.g., auth tokens) before storing them in its
|
||||
database. A loss of control over read-access to the database should have no security impact
|
||||
on your deployment.
|
||||
|
||||
### SSL encryption
|
||||
|
||||
Since JupyterHub includes authentication and allows arbitrary code execution, you should not run
|
||||
it without SSL (HTTPS). This will require you to obtain an official, trusted SSL certificate or
|
||||
create a self-signed certificate. Once you have obtained and installed a key and certificate you
|
||||
need to specify their locations in the configuration file as follows:
|
||||
|
||||
```python
|
||||
c.JupyterHub.ssl_key = '/path/to/my.key'
|
||||
c.JupyterHub.ssl_cert = '/path/to/my.cert'
|
||||
```
|
||||
|
||||
It is also possible to use letsencrypt (https://letsencrypt.org/) to obtain
|
||||
a free, trusted SSL certificate. If you run letsencrypt using the default
|
||||
options, the needed configuration is (replace `mydomain.tld` by your fully
|
||||
qualified domain name):
|
||||
|
||||
```python
|
||||
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
|
||||
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
|
||||
```
|
||||
|
||||
If the fully qualified domain name (FQDN) is `example.com`, the following
|
||||
would be the needed configuration:
|
||||
|
||||
```python
|
||||
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
|
||||
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
|
||||
```
|
||||
|
||||
Some cert files also contain the key, in which case only the cert is needed. It is important that
|
||||
these files be put in a secure location on your server, where they are not readable by regular
|
||||
users.
|
||||
|
||||
Note on **chain certificates**: If you are using a chain certificate, see also
|
||||
[chained certificates for SSL](troubleshooting.md#chained-certificates-for-ssl) in the JupyterHub troubleshooting FAQ.
|
||||
|
||||
Note: In certain cases, e.g. **behind SSL termination in nginx**, allowing no SSL
|
||||
running on the hub may be desired.
|
||||
|
||||
### Cookie secret
|
||||
|
||||
The cookie secret is an encryption key, used to encrypt the browser cookies used for
|
||||
authentication. If this value changes for the Hub, all single-user servers must also be restarted.
|
||||
Normally, this value is stored in a file, the location of which can be specified in a config file
|
||||
as follows:
|
||||
|
||||
```python
|
||||
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret'
|
||||
```
|
||||
|
||||
The content of this file should be 32 random bytes, encoded as hex.
|
||||
An example would be to generate this file with:
|
||||
|
||||
```bash
|
||||
openssl rand -hex 32 > /srv/jupyterhub/cookie_secret
|
||||
```
|
||||
|
||||
In most deployments of JupyterHub, you should point this to a secure location on the file
|
||||
system, such as `/srv/jupyterhub/cookie_secret`. If the cookie secret file doesn't exist when
|
||||
the Hub starts, a new cookie secret is generated and stored in the file. The
|
||||
file must not be readable by group or other or the server won't start.
|
||||
The recommended permissions for the cookie secret file are 600 (owner-only rw).
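
For those who prefer to create the file from Python rather than with `openssl`, a minimal sketch follows. It uses only the standard library; the helper names `write_cookie_secret` and `is_private` are hypothetical, and the demonstration writes to a temporary directory rather than `/srv/jupyterhub`.

```python
import os
import secrets
import tempfile


def write_cookie_secret(path):
    """Create `path` holding 32 random bytes as hex, readable by the owner only."""
    # O_EXCL refuses to overwrite an existing secret; 0o600 is owner-only rw.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, 'w') as f:
        f.write(secrets.token_hex(32))


def is_private(path):
    """Return True if neither group nor other has any permission on `path`."""
    return (os.stat(path).st_mode & 0o077) == 0


# Demonstration in a temporary directory rather than /srv/jupyterhub.
demo_path = os.path.join(tempfile.mkdtemp(), 'cookie_secret')
write_cookie_secret(demo_path)
print(is_private(demo_path))
```

Creating the file with restrictive permissions from the start avoids a window in which the secret is readable by other users.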
|
||||
|
||||
|
||||
If you would like to avoid the need for files, the value can be loaded in the Hub process from
|
||||
the `JPY_COOKIE_SECRET` environment variable, which is a hex-encoded string. You
|
||||
can set it this way:
|
||||
|
||||
```bash
|
||||
export JPY_COOKIE_SECRET=`openssl rand -hex 32`
|
||||
```
|
||||
|
||||
For security reasons, this environment variable should only be visible to the Hub.
|
||||
If you set it dynamically as above, all users will be logged out each time the
|
||||
Hub starts.
|
||||
|
||||
You can also set the cookie secret in the configuration file itself, `jupyterhub_config.py`,
|
||||
as a binary string:
|
||||
|
||||
```python
|
||||
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
|
||||
```
|
||||
|
||||
### Proxy authentication token
|
||||
|
||||
The Hub authenticates its requests to the Proxy using a secret token that
|
||||
the Hub and Proxy agree upon. The value of this string should be a random
|
||||
string (for example, generated by `openssl rand -hex 32`). You can pass
|
||||
this value to the Hub and Proxy using either the `CONFIGPROXY_AUTH_TOKEN`
|
||||
environment variable:
|
||||
|
||||
```bash
|
||||
export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
|
||||
```
|
||||
|
||||
This environment variable needs to be visible to the Hub and Proxy.
|
||||
|
||||
Or you can set the value in the configuration file, `jupyterhub_config.py`:
|
||||
|
||||
```python
|
||||
c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
|
||||
```
|
||||
|
||||
If you don't set the Proxy authentication token, the Hub will generate a random key itself, which
|
||||
means that any time you restart the Hub you **must also restart the Proxy**. If the proxy is a
|
||||
subprocess of the Hub, this should happen automatically (this is the default configuration).
|
||||
|
||||
Another time you must set the Proxy authentication token yourself is if
|
||||
you want other services, such as [nbgrader](https://github.com/jupyter/nbgrader)
|
||||
to also be able to connect to the Proxy.
|
||||
|
||||
### Security audits
|
||||
|
||||
We recommend that you do periodic reviews of your deployment's security. It's
|
||||
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
|
||||
versions up to date.
|
||||
|
||||
A handy website for testing your deployment is
|
||||
[Qualys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).
|
||||
|
||||
## Authentication and users
|
||||
|
||||
The default Authenticator uses [PAM][] to authenticate system users with
|
||||
their username and password. The default behavior of this Authenticator
|
||||
is to allow any user with an account and password on the system to log in.
|
||||
|
||||
### Creating a whitelist of users
|
||||
|
||||
You can restrict which users are allowed to log in with `Authenticator.whitelist`:
|
||||
|
||||
|
||||
```python
|
||||
c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'}
|
||||
```
|
||||
|
||||
Users listed in the whitelist are added to the Hub database when the Hub is
|
||||
started.
|
||||
|
||||
### Managing Hub administrators
|
||||
|
||||
#### Configuring admins (`admin_users`)
|
||||
|
||||
Admin users of JupyterHub, `admin_users`, have the ability to add and remove
|
||||
users from the user `whitelist` or to take actions on the users' behalf,
|
||||
such as stopping and restarting their servers.
|
||||
|
||||
A set of initial admin users, `admin_users`, can be configured as follows:
|
||||
|
||||
```python
|
||||
c.Authenticator.admin_users = {'mal', 'zoe'}
|
||||
```
|
||||
Users in the admin list are automatically added to the user `whitelist`,
|
||||
if they are not already present.
|
||||
|
||||
#### Admin access to other users' notebook servers (`admin_access`)
|
||||
|
||||
By default the admin users do not have permission to log in *as other users*
|
||||
since the default `JupyterHub.admin_access` setting is False.
|
||||
If `JupyterHub.admin_access` is set to True, then admin users have permission
|
||||
to log in *as other users* on their respective machines, for debugging.
|
||||
**You should make sure your users know if admin_access is enabled.**
|
||||
|
||||
Note: additional configuration examples are provided in this guide's
|
||||
[Configuration Examples section](./config-examples.html).
|
||||
|
||||
### Add or remove users from the Hub
|
||||
|
||||
Users can be added to and removed from the Hub via either the admin panel or
|
||||
REST API.
|
||||
|
||||
If a user is **added**, the user will be automatically added to the whitelist
|
||||
and database. Restarting the Hub will not require manually updating the
|
||||
whitelist in your config file, as the users will be loaded from the database.
|
||||
|
||||
After starting the Hub once, it is not sufficient to **remove** a user from
|
||||
the whitelist in your config file. You must also remove the user from the Hub's
|
||||
database, either by deleting the user from the admin page, or you can clear
|
||||
the `jupyterhub.sqlite` database and start fresh.
|
||||
|
||||
The default `PAMAuthenticator` is one case of a special kind of authenticator, called a
|
||||
`LocalAuthenticator`, indicating that it manages users on the local system. When you add a user to
|
||||
the Hub, a `LocalAuthenticator` checks if that user already exists. Normally, there will be an
|
||||
error telling you that the user doesn't exist. If you set the configuration value
|
||||
|
||||
```python
|
||||
c.LocalAuthenticator.create_system_users = True
|
||||
```
|
||||
|
||||
however, adding a user to the Hub that doesn't already exist on the system will result in the Hub
|
||||
creating that user via the system `adduser` command line tool. This option is typically used on
|
||||
hosted deployments of JupyterHub, to avoid the need to manually create all your users before
|
||||
launching the service. It is not recommended when running JupyterHub in situations where
|
||||
JupyterHub users map directly onto UNIX users.
|
||||
|
||||
## Spawners and single-user notebook servers
|
||||
|
||||
Since the single-user server is an instance of `jupyter notebook`, an entire separate
|
||||
multi-process application, there are many aspects of that server you can configure, and a lot of ways
|
||||
to express that configuration.
|
||||
|
||||
At the JupyterHub level, you can set some values on the Spawner. The simplest of these is
|
||||
`Spawner.notebook_dir`, which lets you set the root directory for a user's server. This root
|
||||
notebook directory is the highest level directory users will be able to access in the notebook
|
||||
dashboard. In this example, the root notebook directory is set to `~/notebooks`, where `~` is
|
||||
expanded to the user's home directory.
|
||||
|
||||
```python
|
||||
c.Spawner.notebook_dir = '~/notebooks'
|
||||
```
|
||||
|
||||
You can also specify extra command-line arguments to the notebook server with:
|
||||
|
||||
```python
|
||||
c.Spawner.args = ['--debug', '--profile=PHYS131']
|
||||
```
|
||||
|
||||
This could be used to set the user's default page for the single-user server:
|
||||
|
||||
```python
|
||||
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
|
||||
```
|
||||
|
||||
Since the single-user server extends the notebook server application,
|
||||
it still loads configuration from the `jupyter_notebook_config.py` config file.
|
||||
Each user may have one of these files in `$HOME/.jupyter/`.
|
||||
Jupyter also supports loading system-wide config files from `/etc/jupyter/`,
|
||||
which is the place to put configuration that you want to affect all of your users.
|
||||
|
||||
## External services
|
||||
|
||||
JupyterHub has a REST API that can be used by external services like the
|
||||
[cull_idle_servers](https://github.com/jupyterhub/jupyterhub/blob/master/examples/cull-idle/cull_idle_servers.py)
|
||||
script which monitors and kills idle single-user servers periodically. In order to run such an
|
||||
external service, you need to provide it an API token. In the case of `cull_idle_servers`, it is passed
|
||||
as the environment variable called `JPY_API_TOKEN`.
|
||||
|
||||
Currently there are two ways of registering that token with JupyterHub. The first one is to use
|
||||
the `jupyterhub` command to generate a token for a specific hub user:
|
||||
|
||||
```bash
|
||||
jupyterhub token <username>
|
||||
```
|
||||
|
||||
As of [version 0.6.0](./changelog.html), the preferred way of doing this is to first generate an API token:
|
||||
|
||||
```bash
|
||||
openssl rand -hex 32
|
||||
```
|
||||
|
||||
|
||||
and then write it to your JupyterHub configuration file (note that the **key** is the token while the **value** is the username):
|
||||
|
||||
```python
|
||||
c.JupyterHub.api_tokens = {'token' : 'username'}
|
||||
```
|
||||
|
||||
Upon restarting JupyterHub, you should see a message like below in the logs:
|
||||
|
||||
```
|
||||
Adding API token for <username>
|
||||
```
|
||||
|
||||
Now you can run your script, i.e. `cull_idle_servers`, by providing it the API token and it will authenticate through
|
||||
the REST API to interact with it.
|
||||
|
||||
|
||||
[oauth-setup]: https://github.com/jupyterhub/oauthenticator#setup
|
||||
[oauthenticator]: https://github.com/jupyterhub/oauthenticator
|
||||
[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
|
13
docs/source/getting-started.rst
Normal file
@@ -0,0 +1,13 @@
|
||||
Getting Started
|
||||
===============
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
technical-overview
|
||||
config-basics
|
||||
networking-basics
|
||||
security-basics
|
||||
authenticators-users-basics
|
||||
spawners-basics
|
||||
services-basics
|
@@ -36,9 +36,17 @@ Contents
|
||||
|
||||
* :doc:`quickstart`
|
||||
* :doc:`quickstart-docker`
|
||||
* :doc: `installation-basics`
|
||||
* :doc: `config-basics`
|
||||
* :doc:`getting-started`
|
||||
* :doc:`installation-basics`
|
||||
|
||||
**Getting Started**
|
||||
|
||||
* :doc:`technical-overview`
|
||||
* :doc:`config-basics`
|
||||
* :doc:`networking-basics`
|
||||
* :doc:`security-basics`
|
||||
* :doc:`authenticators-users-basics`
|
||||
* :doc:`spawners-basics`
|
||||
* :doc:`services-basics`
|
||||
|
||||
**Configuration Reference**
|
||||
|
||||
@@ -93,7 +101,8 @@ Full Table of Contents
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
user-guide
|
||||
installation-guide
|
||||
getting-started
|
||||
configuration-guide
|
||||
api/index
|
||||
troubleshooting
|
||||
|
@@ -7,5 +7,3 @@ Installation Guide
|
||||
quickstart
|
||||
quickstart-docker
|
||||
installation-basics
|
||||
config-basics
|
||||
getting-started
|
@@ -1,45 +0,0 @@
|
||||
# JupyterHub Deployment on AWS
|
||||
|
||||
Documentation on deploying JupyterHub on an AWS EC2 Instance using NGINX Plus.
|
||||
|
||||
>CAUTION: Document is a work-in-progress. Information found on this page is partially incomplete and may require additional research.
|
||||
|
||||
## Setting Up Amazon EC2 Instance
|
||||
|
||||
### AMI
|
||||
Choose one of the following Amazon Machine Images that are compatible with NGINX Plus:
|
||||
|
||||
* NGINX Plus – Amazon Linux AMI (HVM)
|
||||
* NGINX Plus – Ubuntu AMI (HVM)
|
||||
* NGINX Plus – Amazon Linux AMI (PV)
|
||||
* NGINX Plus – Ubuntu AMI (PV)
|
||||
|
||||
Refer to the [NGINX AMI Installation Guide](https://www.nginx.com/resources/admin-guide/setting-nginx-plus-environment-amazon-ec2/) for more information.
|
||||
|
||||
### Instance Type & Storage
|
||||
Instance type selection depends heavily on memory usage. Amazon Compute Optimized instances are recommended.
|
||||
|
||||
As a rule of thumb consider **100-200 MB/user** plus **5x-10x the amount of data you are loading from disk**, depending on the kind of analysis. After selecting your instance, you can add more storage and select the volume type (GP2/IO1) in the 'Add Storage' page.
|
||||
|
||||
(Pictured below: c4.2xlarge)
|
||||
|
||||

|
||||
|
||||
### Configure Security Group
|
||||
The standard HTTPS and HTTP ports (80, 443) need to be opened to allow JupyterHub to be proxied by NGINX.
|
||||
|
||||
Additionally, in order to enable Docker containers to connect to JupyterHub port 8081 will need to be opened. Open a new 'Custom TCP Rule' and set the Source in CIDR Block Notation to:
|
||||
> <Network IP Address>/24
|
||||
|
||||
Below is a reference image for the security group set-up. Depending on specific use-cases, port rules may differ and likely should not be open to 'anywhere'. Your network IP will also differ.
|
||||
|
||||

|
||||
|
||||
Refer to the [Amazon EC2 Security Groups for Linux Instances Page](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html) for more information.
|
||||
|
||||
----
|
||||
|
||||
## To-Do Sections
|
||||
- [x] Setting Up Amazon EC2 Instance
|
||||
- [ ] Setting Up JupyterHub & Web Server on EC2 VM
|
||||
- [ ] Setting Up Docker Spawner
|
54
docs/source/networking-basics.md
Normal file
@@ -0,0 +1,54 @@
|
||||
# Networking basics
|
||||
|
||||
## Configuring the Proxy's IP address and port
|
||||
The Proxy's main IP address setting determines where JupyterHub is available to users.
|
||||
By default, JupyterHub is configured to be available on all network interfaces
|
||||
(`''`) on port 8000. **Note**: Use of `'*'` is discouraged for IP configuration;
|
||||
instead, use of `'0.0.0.0'` is preferred.
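
To see what an empty IP string means in practice, the short sketch below binds an ordinary IPv4 socket to `''` and prints the address the operating system reports. It uses only the standard `socket` module and is independent of JupyterHub; port 0 is used so the OS picks any free port.

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    # '' means all IPv4 interfaces; port 0 lets the OS pick a free port.
    sock.bind(('', 0))
    bound_ip, bound_port = sock.getsockname()

print(bound_ip)  # 0.0.0.0
```

The reported address `0.0.0.0` is the explicit spelling of "all interfaces", which is why it is a reasonable value to write in a configuration file.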
|
||||
|
||||
Changing the IP address and port can be done with the following command line
|
||||
arguments:
|
||||
|
||||
```bash
|
||||
jupyterhub --ip=192.168.1.2 --port=443
|
||||
```
|
||||
|
||||
Or by placing the following lines in a configuration file:
|
||||
|
||||
```python
|
||||
c.JupyterHub.ip = '192.168.1.2'
|
||||
c.JupyterHub.port = 443
|
||||
```
|
||||
|
||||
Port 443 is used as an example since 443 is the default port for SSL/HTTPS.
|
||||
|
||||
Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub.
|
||||
However, more customized scenarios may need additional networking details to
|
||||
be configured.
|
||||
|
||||
|
||||
## Configuring the Proxy's REST API communication IP address and port (optional)
|
||||
The Hub service talks to the proxy via a REST API on a secondary port,
|
||||
whose network interface and port can be configured separately.
|
||||
By default, this REST API listens on port 8001 of localhost only.
|
||||
|
||||
If running the Proxy separate from the Hub,
|
||||
configure the REST API communication IP address and port with:
|
||||
|
||||
```python
|
||||
# ideally a private network address
|
||||
c.JupyterHub.proxy_api_ip = '10.0.1.4'
|
||||
c.JupyterHub.proxy_api_port = 5432
|
||||
```
|
||||
|
||||
## Configuring the Hub if Spawners or Proxy are remote or isolated in containers
|
||||
The Hub service also listens only on localhost (port 8081) by default.
|
||||
The Hub needs to be accessible from both the proxy and all Spawners.
|
||||
When spawning local servers, an IP address setting of localhost is fine.
|
||||
If *either* the Proxy *or* (more likely) the Spawners will be remote or
|
||||
isolated in containers, the Hub must listen on an IP that is accessible.
|
||||
|
||||
```python
|
||||
c.JupyterHub.hub_ip = '10.0.1.4'
|
||||
c.JupyterHub.hub_port = 54321
|
||||
```
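
One way to confirm that the Hub address is actually reachable from a remote Spawner or Proxy host is a plain TCP connection test. The sketch below is a generic check using the standard `socket` module, not a JupyterHub feature; the helper name `can_reach` is hypothetical, and the demonstration uses a throwaway local listener in place of a real Hub.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demonstration against a throwaway local listener instead of a real Hub.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(can_reach('127.0.0.1', demo_port))  # True
listener.close()
```

On a host that runs single-user servers, calling `can_reach('10.0.1.4', 54321)` with the values from the configuration above would show whether the Hub can be contacted from there.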
|
146
docs/source/security-basics.md
Normal file
@@ -0,0 +1,146 @@
# Security

**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.**

---

**Deprecation note:** Removed `--no-ssl` in version 0.7.

JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to
allow running without SSL using the command `jupyterhub --no-ssl`. The
`--no-ssl` command line option is not needed anymore in version 0.7.

---

Security is the most important aspect of configuring JupyterHub. There are four main aspects of the
security configuration:

1. SSL encryption (to enable HTTPS)
2. Cookie secret (a key for encrypting browser cookies)
3. Proxy authentication token (used for the Hub and other services to authenticate to the Proxy)
4. Periodic security audits

*Note* that the **Hub** hashes all secrets (e.g., auth tokens) before storing them in its
database. A loss of control over read-access to the database should have no security impact
on your deployment.

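To illustrate why storing only hashes limits the damage of a leaked database, here is a minimal sketch of hashing a token before storage and checking a presented token later. This is not JupyterHub's actual hashing scheme: the function names, the salt size, and the choice of SHA-512 are assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def hash_token(token, salt=None):
    """Return (salt, digest) for a token; only these two values would be stored."""
    if salt is None:
        salt = secrets.token_hex(8)  # hypothetical salt size
    digest = hashlib.sha512((salt + token).encode("utf8")).hexdigest()
    return salt, digest


def verify_token(token, salt, digest):
    """Check a presented token against the stored salt and digest."""
    _, candidate = hash_token(token, salt)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(candidate, digest)
```

Someone who reads the stored salt and digest cannot present them as a token, because the check hashes whatever is presented before comparing.
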
## SSL encryption

Since JupyterHub includes authentication and allows arbitrary code execution, you should not run
it without SSL (HTTPS). This will require you to obtain an official, trusted SSL certificate or
create a self-signed certificate. Once you have obtained and installed a key and certificate you
need to specify their locations in the configuration file as follows:

```python
c.JupyterHub.ssl_key = '/path/to/my.key'
c.JupyterHub.ssl_cert = '/path/to/my.cert'
```

It is also possible to use [Let's Encrypt](https://letsencrypt.org/) to obtain
a free, trusted SSL certificate. If you run Let's Encrypt using the default
options, the needed configuration is (replace `{mydomain.tld}` with your fully
qualified domain name):

```python
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
```

If the fully qualified domain name (FQDN) is `example.com`, the following
would be the needed configuration:

```python
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
```

Some cert files also contain the key, in which case only the cert is needed. It is important that
these files be put in a secure location on your server, where they are not readable by regular
users.

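One way to verify that a key file is not exposed to regular users is to inspect its permission bits. The sketch below assumes POSIX-style permissions and uses a hypothetical helper name, `accessible_by_others`. It does not look at ACLs or at the permissions of parent directories.

```python
import os
import stat


def accessible_by_others(path):
    """Return True if group or other users have any permission bits set on `path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # S_IRWXG / S_IRWXO are the read, write, and execute bits for group / other
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
```

For example, `accessible_by_others('/path/to/my.key')` returning `True` suggests tightening the file with `chmod 600`.
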
Note on **chain certificates**: If you are using a chain certificate, see also
[chained certificate for SSL](troubleshooting.md#chained-certificates-for-ssl) in the JupyterHub troubleshooting FAQ.

Note: In certain cases, e.g. **behind SSL termination in nginx**, running the Hub
without SSL may be desired.

## Cookie secret

The cookie secret is an encryption key, used to encrypt the browser cookies used for
authentication. If this value changes for the Hub, all single-user servers must also be restarted.
Normally, this value is stored in a file, the location of which can be specified in a config file
as follows:

```python
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret'
```

The content of this file should be 32 random bytes, encoded as hex.
For example, you can generate this file with:

```bash
openssl rand -hex 32 > /srv/jupyterhub/cookie_secret
```

In most deployments of JupyterHub, you should point this to a secure location on the file
system, such as `/srv/jupyterhub/cookie_secret`. If the cookie secret file doesn't exist when
the Hub starts, a new cookie secret is generated and stored in the file. The
file must not be readable by group or other, or the server won't start.
The recommended permissions for the cookie secret file are 600 (owner-only rw).

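If `openssl` is not available, the same kind of file can be produced from Python. This is a minimal sketch, assuming a POSIX system. The helper name `write_cookie_secret` is hypothetical, and the code creates the file with mode 600 from the start so it is never briefly readable by others.

```python
import os
import secrets


def write_cookie_secret(path):
    """Create `path` containing 32 random bytes as hex, accessible by the owner only."""
    # O_EXCL makes the call fail rather than overwrite an existing secret
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secrets.token_hex(32))
```

Refusing to overwrite an existing file is a deliberate choice here: replacing the secret invalidates existing login cookies.
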
If you would like to avoid the need for files, the value can be loaded in the Hub process from
the `JPY_COOKIE_SECRET` environment variable, which is a hex-encoded string. You
can set it this way:

```bash
export JPY_COOKIE_SECRET=`openssl rand -hex 32`
```

For security reasons, this environment variable should only be visible to the Hub.
If you set it dynamically as above, all users will be logged out each time the
Hub starts.

You can also set the cookie secret in the configuration file itself, `jupyterhub_config.py`,
as a binary string:

```python
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
```

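The conversion from a 64-character hex string to the 32-byte binary value can be sketched as follows. The helper name `cookie_secret_from_env` is hypothetical, and the strict 32-byte length check is an assumption based on the recommendation above, not a statement about what JupyterHub itself enforces.

```python
import os


def cookie_secret_from_env(environ=os.environ, name="JPY_COOKIE_SECRET"):
    """Decode the hex-encoded cookie secret from the environment into bytes."""
    secret = bytes.fromhex(environ[name])  # raises ValueError on non-hex input
    if len(secret) != 32:
        raise ValueError("expected 32 bytes, got %d" % len(secret))
    return secret
```

Inside `jupyterhub_config.py`, the result could be assigned with `c.JupyterHub.cookie_secret = cookie_secret_from_env()`.
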
## Proxy authentication token

The Hub authenticates its requests to the Proxy using a secret token that
the Hub and Proxy agree upon. The value of this string should be a random
string (for example, generated by `openssl rand -hex 32`). You can pass
this value to the Hub and Proxy using the `CONFIGPROXY_AUTH_TOKEN`
environment variable:

```bash
export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
```

This environment variable needs to be visible to the Hub and Proxy.

Or you can set the value in the configuration file, `jupyterhub_config.py`:

```python
c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
```

If you don't set the Proxy authentication token, the Hub will generate a random key itself, which
means that any time you restart the Hub you **must also restart the Proxy**. If the proxy is a
subprocess of the Hub, this should happen automatically (this is the default configuration).

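The fallback described above can be pictured with a small sketch: use the configured token if there is one, otherwise generate a fresh random value. The function `resolve_proxy_token` is a hypothetical illustration, not JupyterHub's own code.

```python
import secrets


def resolve_proxy_token(environ):
    """Return (token, generated): the configured token, or a fresh random one."""
    token = environ.get("CONFIGPROXY_AUTH_TOKEN", "")
    if token:
        return token, False
    # No token configured: fall back to a random 32-byte hex string,
    # which will differ on every start.
    return secrets.token_hex(32), True
```

Because a generated token differs on every start, a Proxy that was started separately would still hold the old value after a Hub restart, which is why both must then be restarted together.
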
You must also set the Proxy authentication token yourself if
you want other services, such as [nbgrader](https://github.com/jupyter/nbgrader),
to be able to connect to the Proxy.

## Security audits

We recommend that you do periodic reviews of your deployment's security. It's
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
versions up to date.

A handy website for testing your deployment is
[Qualys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).

36
docs/source/services-basics.md
Normal file
@@ -0,0 +1,36 @@
## External services

JupyterHub has a REST API that can be used by external services like the
[cull_idle_servers](https://github.com/jupyterhub/jupyterhub/blob/master/examples/cull-idle/cull_idle_servers.py)
script which monitors and kills idle single-user servers periodically. In order to run such an
external service, you need to provide it an API token. In the case of `cull_idle_servers`, it is passed
as the environment variable called `JPY_API_TOKEN`.

Currently there are two ways of registering that token with JupyterHub. The first one is to use
the `jupyterhub` command to generate a token for a specific hub user:

```bash
jupyterhub token <username>
```

As of [version 0.6.0](./changelog.html), the preferred way of doing this is to first generate an API token:

```bash
openssl rand -hex 32
```

and then write it to your JupyterHub configuration file (note that the **key** is the token while the **value** is the username):

```python
c.JupyterHub.api_tokens = {'token' : 'username'}
```

Upon restarting JupyterHub, you should see a message like below in the logs:

```
Adding API token for <username>
```

Now you can run your script, e.g. `cull_idle_servers`, by providing it the API token, and it will authenticate through
the REST API to interact with the Hub.

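As a rough sketch of how an external script might authenticate, the helper below builds a request carrying the token in the `Authorization` header. The helper name `hub_api_request` is hypothetical, and the default `api_url` is an assumption about where the Hub's API is reachable; adjust it to match your `hub_ip` and `hub_port`.

```python
import urllib.request


def hub_api_request(path, token, api_url="http://127.0.0.1:8081/hub/api"):
    """Build (but do not send) an authenticated request to the Hub's REST API."""
    url = api_url.rstrip("/") + "/" + path.lstrip("/")
    # The API token is sent in the Authorization header
    return urllib.request.Request(url, headers={"Authorization": "token " + token})
```

A script could then read the token from `os.environ['JPY_API_TOKEN']` and pass the resulting request to `urllib.request.urlopen`. Which endpoints the token may call depends on the permissions of the user it is registered for.
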
33
docs/source/spawners-basics.md
Normal file
@@ -0,0 +1,33 @@
# Spawners and single-user notebook servers

Since the single-user server is an instance of `jupyter notebook`, an entirely separate
multi-process application, there are many aspects of that server you can configure, and a lot of ways
to express that configuration.

At the JupyterHub level, you can set some values on the Spawner. The simplest of these is
`Spawner.notebook_dir`, which lets you set the root directory for a user's server. This root
notebook directory is the highest level directory users will be able to access in the notebook
dashboard. In this example, the root notebook directory is set to `~/notebooks`, where `~` is
expanded to the user's home directory.

```python
c.Spawner.notebook_dir = '~/notebooks'
```

You can also specify extra command-line arguments to the notebook server with:

```python
c.Spawner.args = ['--debug', '--profile=PHYS131']
```

This could be used to set the user's default page for the single-user server:

```python
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
```

Since the single-user server extends the notebook server application,
it still loads configuration from the `jupyter_notebook_config.py` config file.
Each user may have one of these files in `$HOME/.jupyter/`.
Jupyter also supports loading system-wide config files from `/etc/jupyter/`,
which is the place to put configuration that you want to affect all of your users.

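With several places to put settings, it helps to have a mental model of which one wins. The toy function below merges three layers so that later layers override earlier ones. It is only an illustration and not how the notebook server actually loads its configuration. The precedence shown (system-wide, then per-user, then command-line arguments) is an assumption about the usual Jupyter behavior and is worth verifying for your version.

```python
def effective_config(system, user, command_line):
    """Merge three layers of settings; later layers override earlier ones."""
    merged = {}
    for layer in (system, user, command_line):
        merged.update(layer)
    return merged
```

Under this model, a `default_url` passed through `Spawner.args` would take effect even if a user's own config file sets a different value, while settings that only appear in `/etc/jupyter/` still apply.
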
104
docs/source/technical-overview.md
Normal file
@@ -0,0 +1,104 @@
## Technical Overview

JupyterHub is a set of processes that together provide a single-user Jupyter
Notebook server for each person in a group.

### Three subsystems
Three major subsystems are run by the `jupyterhub` command line program:

- **Single-User Notebook Server**: a dedicated, single-user, Jupyter Notebook server is
  started for each user on the system when the user logs in. The object that
  starts these servers is called a **Spawner**.
- **Proxy**: the public-facing part of JupyterHub that uses a dynamic proxy
  to route HTTP requests to the Hub and Single-User Notebook Servers.
- **Hub**: manages user accounts, authentication, and coordinates Single-User
  Notebook Servers using a Spawner.



### Deployment server

To use JupyterHub, you need a Unix server (typically Linux) running somewhere
that is accessible to your team on the network. The JupyterHub server can be
on an internal network at your organization, or it can run on the public
internet (in which case, take care with the Hub's
[security](#security)).

### Basic operation
Users access JupyterHub through a web browser, by going to the IP address or
the domain name of the server.

Basic principles of operation:

* Hub spawns proxy
* Proxy forwards all requests to hub by default
* Hub handles login, and spawns single-user servers on demand
* Hub configures proxy to forward url prefixes to single-user servers

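The routing behavior in the list above can be sketched as a small prefix table: every request goes to the Hub unless a more specific prefix has been registered for a user's server. The class and method names and the addresses used below are hypothetical; the real proxy is a separate program with its own API.

```python
class RoutingTable:
    """Toy model of the proxy's routes: URL prefix -> target."""

    def __init__(self, default_target):
        # Everything goes to the default target (the Hub) unless a longer prefix matches
        self.routes = {"/": default_target}

    def add_route(self, prefix, target):
        """Register a target for all paths under `prefix`."""
        self.routes[prefix] = target

    def resolve(self, path):
        """Return the target of the longest registered prefix matching `path`."""
        def matches(prefix):
            base = prefix.rstrip("/")
            # Match whole path segments only, so /user/mal does not capture /user/malcolm
            return path == base or path.startswith(base + "/")

        candidates = [p for p in self.routes if matches(p)]
        return self.routes[max(candidates, key=len)]
```

After the Hub spawns a server for user `mal`, it would call something like `add_route('/user/mal', 'http://127.0.0.1:50001')`; requests for other paths keep reaching the Hub.
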
Different **[authenticators](authenticators.html)** control access
to JupyterHub. The default one (PAM) uses the user accounts on the server where
JupyterHub is running. If you use this, you will need to create a user account
on the system for each user on your team. Using other authenticators, you can
allow users to sign in with e.g. a GitHub account, or with any single-sign-on
system your organization has.

Next, **[spawners](spawners.html)** control how JupyterHub starts
the individual notebook server for each user. The default spawner will
start a notebook server on the same machine running under their system username.
The other main option is to start each server in a separate container, often
using Docker.

### Default behavior

**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.**

See [Security documentation](#security) for how to configure JupyterHub to use SSL,
or put it behind SSL termination in another proxy server, such as nginx.

---

**Deprecation note:** Removed `--no-ssl` in version 0.7.

JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to
allow running without SSL using the command `jupyterhub --no-ssl`. The
`--no-ssl` command line option is not needed anymore in version 0.7.

---

To start JupyterHub in its default configuration, type the following at the command line:

```bash
sudo jupyterhub
```

The default Authenticator that ships with JupyterHub authenticates users
with their system name and password (via [PAM][]).
Any user on the system with a password will be allowed to start a single-user notebook server.

The default Spawner starts servers locally as each user, one dedicated server per user.
These servers listen on localhost, and start in the given user's home directory.

By default, the **Proxy** listens on all public interfaces on port 8000.
Thus you can reach JupyterHub through either:

- `http://localhost:8000`
- or any other public IP or domain pointing to your system.

In their default configuration, the other services, the **Hub** and **Single-User Servers**,
all communicate with each other on localhost only.

By default, starting JupyterHub will write two files to disk in the current working directory:

- `jupyterhub.sqlite` is the sqlite database containing all of the state of the **Hub**.
  This file allows the **Hub** to remember what users are running and where,
  as well as other information enabling you to restart parts of JupyterHub separately. It is
  important to note that this database contains *no* sensitive information other than **Hub**
  usernames.
- `jupyterhub_cookie_secret` is the encryption key used for securing cookies.
  This file needs to persist so that restarting the Hub server does not invalidate cookies.
  Conversely, deleting this file and restarting the server effectively invalidates all login cookies.
  The cookie secret file is discussed in the [Cookie Secret documentation](#cookie-secret).

The location of these files can be specified via configuration.

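As a sketch of what such configuration might look like, the helper below computes values that place both files under one directory. The helper name `hub_file_settings` and the directory in the usage note are hypothetical. `cookie_secret_file` is the option described in the security documentation, while `db_url` is, to my knowledge, the JupyterHub option for the database location and should be checked against your version.

```python
import os


def hub_file_settings(data_dir):
    """Return config values that place the Hub's two files under `data_dir`."""
    data_dir = os.path.abspath(data_dir)
    return {
        # SQLAlchemy-style URL; an absolute path follows the 'sqlite:///' prefix
        "db_url": "sqlite:///" + os.path.join(data_dir, "jupyterhub.sqlite"),
        "cookie_secret_file": os.path.join(data_dir, "jupyterhub_cookie_secret"),
    }
```

In `jupyterhub_config.py`, the values could then be applied with `settings = hub_file_settings('/srv/jupyterhub')`, followed by `c.JupyterHub.db_url = settings['db_url']` and `c.JupyterHub.cookie_secret_file = settings['cookie_secret_file']`.
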
[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module