mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-17 15:03:02 +00:00
Edit and deduplicate security docs
This commit is contained in:
@@ -1,146 +0,0 @@
|
||||
# Security
|
||||
|
||||
**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.**
|
||||
|
||||
---
|
||||
|
||||
**Deprecation note:** Removed `--no-ssl` in version 0.7.
|
||||
|
||||
JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to
|
||||
allow running without SSL using the command `jupyterhub --no-ssl`. The
|
||||
`--no-ssl` command line option is not needed anymore in version 0.7.
|
||||
|
||||
---
|
||||
|
||||
Security is the most important aspect of configuring Jupyter. There are four main aspects of the
|
||||
security configuration:
|
||||
|
||||
1. SSL encryption (to enable HTTPS)
|
||||
2. Cookie secret (a key for encrypting browser cookies)
|
||||
3. Proxy authentication token (used for the Hub and other services to authenticate to the Proxy)
|
||||
4. Periodic security audits
|
||||
|
||||
*Note* that the **Hub** hashes all secrets (e.g., auth tokens) before storing them in its
|
||||
database. A loss of control over read-access to the database should have no security impact
|
||||
on your deployment.
|
||||
|
||||
## SSL encryption
|
||||
|
||||
Since JupyterHub includes authentication and allows arbitrary code execution, you should not run
|
||||
it without SSL (HTTPS). This will require you to obtain an official, trusted SSL certificate or
|
||||
create a self-signed certificate. Once you have obtained and installed a key and certificate you
|
||||
need to specify their locations in the configuration file as follows:
|
||||
|
||||
```python
|
||||
c.JupyterHub.ssl_key = '/path/to/my.key'
|
||||
c.JupyterHub.ssl_cert = '/path/to/my.cert'
|
||||
```
|
||||
|
||||
It is also possible to use letsencrypt (https://letsencrypt.org/) to obtain
|
||||
a free, trusted SSL certificate. If you run letsencrypt using the default
|
||||
options, the needed configuration is (replace `mydomain.tld` by your fully
|
||||
qualified domain name):
|
||||
|
||||
```python
|
||||
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
|
||||
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
|
||||
```
|
||||
|
||||
If the fully qualified domain name (FQDN) is `example.com`, the following
|
||||
would be the needed configuration:
|
||||
|
||||
```python
|
||||
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
|
||||
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
|
||||
```
|
||||
|
||||
Some cert files also contain the key, in which case only the cert is needed. It is important that
|
||||
these files be put in a secure location on your server, where they are not readable by regular
|
||||
users.
|
||||
|
||||
Note on **chain certificates**: If you are using a chain certificate, see also
|
||||
[chained certificate for SSL](troubleshooting.md#chained-certificates-for-ssl) in the JupyterHub troubleshooting FAQ).
|
||||
|
||||
Note: In certain cases, e.g. **behind SSL termination in nginx**, allowing no SSL
|
||||
running on the hub may be desired.
|
||||
|
||||
## Cookie secret
|
||||
|
||||
The cookie secret is an encryption key, used to encrypt the browser cookies used for
|
||||
authentication. If this value changes for the Hub, all single-user servers must also be restarted.
|
||||
Normally, this value is stored in a file, the location of which can be specified in a config file
|
||||
as follows:
|
||||
|
||||
```python
|
||||
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret'
|
||||
```
|
||||
|
||||
The content of this file should be 32 random bytes, encoded as hex.
|
||||
An example would be to generate this file with:
|
||||
|
||||
```bash
|
||||
openssl rand -hex 32 > /srv/jupyterhub/cookie_secret
|
||||
```
|
||||
|
||||
In most deployments of JupyterHub, you should point this to a secure location on the file
|
||||
system, such as `/srv/jupyterhub/cookie_secret`. If the cookie secret file doesn't exist when
|
||||
the Hub starts, a new cookie secret is generated and stored in the file. The
|
||||
file must not be readable by group or other or the server won't start.
|
||||
The recommended permissions for the cookie secret file are 600 (owner-only rw).
|
||||
|
||||
|
||||
If you would like to avoid the need for files, the value can be loaded in the Hub process from
|
||||
the `JPY_COOKIE_SECRET` environment variable, which is a hex-encoded string. You
|
||||
can set it this way:
|
||||
|
||||
```bash
|
||||
export JPY_COOKIE_SECRET=`openssl rand -hex 32`
|
||||
```
|
||||
|
||||
For security reasons, this environment variable should only be visible to the Hub.
|
||||
If you set it dynamically as above, all users will be logged out each time the
|
||||
Hub starts.
|
||||
|
||||
You can also set the cookie secret in the configuration file itself,`jupyterhub_config.py`,
|
||||
as a binary string:
|
||||
|
||||
```python
|
||||
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
|
||||
```
|
||||
|
||||
## Proxy authentication token
|
||||
|
||||
The Hub authenticates its requests to the Proxy using a secret token that
|
||||
the Hub and Proxy agree upon. The value of this string should be a random
|
||||
string (for example, generated by `openssl rand -hex 32`). You can pass
|
||||
this value to the Hub and Proxy using either the `CONFIGPROXY_AUTH_TOKEN`
|
||||
environment variable:
|
||||
|
||||
```bash
|
||||
export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
|
||||
```
|
||||
|
||||
This environment variable needs to be visible to the Hub and Proxy.
|
||||
|
||||
Or you can set the value in the configuration file, `jupyterhub_config.py`:
|
||||
|
||||
```python
|
||||
c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
|
||||
```
|
||||
|
||||
If you don't set the Proxy authentication token, the Hub will generate a random key itself, which
|
||||
means that any time you restart the Hub you **must also restart the Proxy**. If the proxy is a
|
||||
subprocess of the Hub, this should happen automatically (this is the default configuration).
|
||||
|
||||
Another time you must set the Proxy authentication token yourself is if
|
||||
you want other services, such as [nbgrader](https://github.com/jupyter/nbgrader)
|
||||
to also be able to connect to the Proxy.
|
||||
|
||||
## Security audits
|
||||
|
||||
We recommend that you do periodic reviews of your deployment's security. It's
|
||||
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
|
||||
versions up to date.
|
||||
|
||||
A handy website for testing your deployment is
|
||||
[Qualsys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).
|
176
docs/source/security-basics.rst
Normal file
176
docs/source/security-basics.rst
Normal file
@@ -0,0 +1,176 @@
|
||||
Security
|
||||
========
|
||||
|
||||
.. important::
|
||||
|
||||
You should not run JupyterHub without SSL encryption on a public network
|
||||
|
||||
Security is the most important aspect of configuring Jupyter. There are four
|
||||
main aspects of the security configuration:
|
||||
|
||||
1. `SSL encryption <ssl-encryption>`_ (to enable HTTPS)
|
||||
2. `Cookie secret <cookie-secret>`_ (a key for encrypting browser cookies)
|
||||
3. Proxy `authentication token <authentication-token>`_ (used for the Hub and
|
||||
other services to authenticate to the Proxy)
|
||||
4. Periodic `security audits <security-audits>`_
|
||||
|
||||
The Hub hashes all secrets (e.g., auth tokens) before storing them in its
|
||||
database. A loss of control over read-access to the database should have no
|
||||
security impact on your deployment.
|
||||
|
||||
.. _ssl-encryption:
|
||||
|
||||
Enabling SSL encryption
|
||||
-----------------------
|
||||
|
||||
Since JupyterHub includes authentication and allows arbitrary code execution,
|
||||
you should not run it without SSL (HTTPS).
|
||||
|
||||
Using an SSL certificate
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This will require you to obtain an official, trusted SSL certificate or create a
|
||||
self-signed certificate. Once you have obtained and installed a key and
|
||||
certificate you need to specify their locations in the configuration file as
|
||||
follows:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
c.JupyterHub.ssl_key = '/path/to/my.key'
|
||||
c.JupyterHub.ssl_cert = '/path/to/my.cert'
|
||||
|
||||
|
||||
Some cert files also contain the key, in which case only the cert is needed. It
|
||||
is important that these files be put in a secure location on your server, where
|
||||
they are not readable by regular users.
|
||||
|
||||
If you are using a **chain certificate**, see also chained certificate for SSL
|
||||
in the JupyterHub `troubleshooting FAQ <troubleshooting>`_.
|
||||
|
||||
Using letsencrypt
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
It is also possible to use `letsencrypt <https://letsencrypt.org/>`_ to obtain
|
||||
a free, trusted SSL certificate. If you run letsencrypt using the default
|
||||
options, the needed configuration is (replace ``mydomain.tld`` by your fully
|
||||
qualified domain name):
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
|
||||
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
|
||||
|
||||
|
||||
If the fully qualified domain name (FQDN) is ``example.com``, the following
|
||||
would be the needed configuration:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
|
||||
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
|
||||
|
||||
|
||||
If SSL termination happens outside of the Hub
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In certain cases, e.g. behind `SSL termination in NGINX <https://www.nginx.com/resources/admin-guide/nginx-ssl-termination/>`_,
|
||||
allowing no SSL running on the hub may be the desired configuration option.
|
||||
|
||||
.. _cookie-secret:
|
||||
|
||||
Cookie secret
|
||||
-------------
|
||||
|
||||
The cookie secret is an encryption key, used to encrypt the browser cookies used
|
||||
for authentication. If this value changes for the Hub, all single-user servers
|
||||
must also be restarted.
|
||||
|
||||
Normally, this value is stored in a file, the location of which can be specified
|
||||
in a config file as follows:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret'
|
||||
|
||||
|
||||
The content of this file should be 32 random bytes, encoded as hex.
|
||||
An example would be to generate this file with:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
openssl rand -hex 32 > /srv/jupyterhub/cookie_secret
|
||||
|
||||
|
||||
In most deployments of JupyterHub, you should point this to a secure location on
|
||||
the file system, such as ``/srv/jupyterhub/cookie_secret``. If the cookie secret
|
||||
file doesn't exist when the Hub starts, a new cookie secret is generated and
|
||||
stored in the file. The file must not be readable by group or other or the
|
||||
server won't start. The recommended permissions for the cookie secret file are
|
||||
``600`` (owner-only rw).
|
||||
|
||||
|
||||
If you would like to avoid the need for files, the value can be loaded in the
|
||||
Hub process from the ``JPY_COOKIE_SECRET`` environment variable, which is a
|
||||
hex-encoded string. You can set it this way:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
export JPY_COOKIE_SECRET=`openssl rand -hex 32`
|
||||
|
||||
|
||||
For security reasons, this environment variable should only be visible to the
|
||||
Hub. If you set it dynamically as above, all users will be logged out each time
|
||||
the Hub starts.
|
||||
|
||||
You can also set the cookie secret in the configuration file
|
||||
itself,``jupyterhub_config.py``, as a binary string:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
|
||||
|
||||
|
||||
.. _authentication-token:
|
||||
|
||||
Proxy authentication token
|
||||
--------------------------
|
||||
|
||||
The Hub authenticates its requests to the Proxy using a secret token that
|
||||
the Hub and Proxy agree upon. The value of this string should be a random
|
||||
string (for example, generated by ``openssl rand -hex 32``). You can pass
|
||||
this value to the Hub and Proxy using either the ``CONFIGPROXY_AUTH_TOKEN``
|
||||
environment variable:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
|
||||
|
||||
|
||||
This environment variable needs to be visible to the Hub and Proxy.
|
||||
|
||||
Or you can set the value in the configuration file, ``jupyterhub_config.py``:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
|
||||
|
||||
If you don't set the Proxy authentication token, the Hub will generate a random
|
||||
key itself, which means that any time you restart the Hub you **must also
|
||||
restart the Proxy**. If the proxy is a subprocess of the Hub, this should happen
|
||||
automatically (this is the default configuration).
|
||||
|
||||
Another time you must set the Proxy authentication token yourself is if
|
||||
you want other services, such as `nbgrader <https://github.com/jupyter/nbgrader>`_,
|
||||
to also be able to connect to the Proxy.
|
||||
|
||||
.. _security-audits:
|
||||
|
||||
Security audits
|
||||
---------------
|
||||
|
||||
We recommend that you do periodic reviews of your deployment's security. It's
|
||||
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
|
||||
versions up to date.
|
||||
|
||||
A handy website for testing your deployment is
|
||||
[Qualsys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).
|
@@ -1,77 +1,90 @@
|
||||
# Web Security in JupyterHub
|
||||
# Web Security and JupyterHub's design
|
||||
|
||||
JupyterHub is designed to be a simple multi-user server for modestly sized
|
||||
groups of semi-trusted users. While the design reflects serving semi-trusted
|
||||
users, JupyterHub is not necessarily unsuitable for serving untrusted users.
|
||||
Using JupyterHub with untrusted users does mean more work and much care is
|
||||
required to secure a Hub against untrusted users, with extra caution on
|
||||
## JupyterHub's design approach
|
||||
|
||||
JupyterHub is designed to be a *simple multi-user server for modestly sized
|
||||
groups* of **semi-trusted** users. While the design reflects serving semi-trusted
|
||||
users, JupyterHub is not necessarily unsuitable for serving **untrusted** users.
|
||||
|
||||
Using JupyterHub with **untrusted** users does mean more work by the
|
||||
administrator. Much care is required to secure a Hub, with extra caution on
|
||||
protecting users from each other as the Hub is serving untrusted users.
|
||||
|
||||
One aspect of JupyterHub's design simplicity for semi-trusted users is that
|
||||
the Hub and single-user servers are placed in a single domain, behind a
|
||||
[proxy][configurable-http-proxy]. As a result, if the Hub is serving untrusted
|
||||
One aspect of JupyterHub's *design simplicity* for **semi-trusted** users is that
|
||||
the Hub and single-user servers are placed in a *single domain*, behind a
|
||||
[*proxy*][configurable-http-proxy]. If the Hub is serving untrusted
|
||||
users, many of the web's cross-site protections are not applied between
|
||||
single-user servers and the Hub, or between single-user servers and each
|
||||
other, since browsers see the whole thing (proxy, Hub, and single user
|
||||
servers) as a single website.
|
||||
servers) as a single website (i.e. single domain).
|
||||
|
||||
To protect users from each other, a user must never be able to write arbitrary
|
||||
## How to protect users from each other
|
||||
|
||||
To protect users from each other, a user must **never** be able to write arbitrary
|
||||
HTML and serve it to another user on the Hub's domain. JupyterHub's
|
||||
authentication setup prevents this because only the owner of a given
|
||||
single-user server is allowed to view user-authored pages served by their
|
||||
server. To protect all users from each other, JupyterHub administrators must
|
||||
authentication setup prevents a user writing arbitrary HTML and serving it to
|
||||
another user because only the owner of a given single-user notebook server is
|
||||
allowed to view user-authored pages served by the given single-user notebook
|
||||
server.
|
||||
|
||||
To protect all users from each other, JupyterHub administrators must
|
||||
ensure that:
|
||||
|
||||
* A user does not have permission to modify their single-user server:
|
||||
- A user may not install new packages in the Python environment that runs
|
||||
their server.
|
||||
- If the PATH is used to resolve the single-user executable (instead of an
|
||||
absolute path), a user may not create new files in any PATH directory
|
||||
that precedes the directory containing jupyterhub-singleuser.
|
||||
* A user **does not have permission** to modify their single-user notebook server,
|
||||
including:
|
||||
- A user **may not** install new packages in the Python environment that runs
|
||||
their single-user server.
|
||||
- If the `PATH` is used to resolve the single-user executable (instead of
|
||||
using an absolute path), a user **may not** create new files in any `PATH`
|
||||
directory that precedes the directory containing `jupyterhub-singleuser`.
|
||||
- A user may not modify environment variables (e.g. PATH, PYTHONPATH) for
|
||||
their single-user server.
|
||||
* A user may not modify the configuration of the notebook server
|
||||
(the ~/.jupyter or JUPYTER_CONFIG_DIR directory).
|
||||
* A user **may not** modify the configuration of the notebook server
|
||||
(the `~/.jupyter` or `JUPYTER_CONFIG_DIR` directory).
|
||||
|
||||
If any additional services are run on the same domain as the Hub, the services
|
||||
must never display user-authored HTML that is neither sanitized nor sandboxed
|
||||
**must never** display user-authored HTML that is neither *sanitized* nor *sandboxed*
|
||||
(e.g. IFramed) to any user that lacks authentication as the author of a file.
|
||||
|
||||
## Mitigate security issues through configuration options
|
||||
|
||||
## Mitigations
|
||||
There are two main approaches to mitigating these issues with configuration
|
||||
options provided by JupyterHub:
|
||||
|
||||
There are two main configuration options provided by JupyterHub to mitigate
|
||||
these issues:
|
||||
### Enable subdomains
|
||||
|
||||
### Subdomains
|
||||
|
||||
JupyterHub 0.5 adds the ability to run single-user servers on their own
|
||||
subdomains, which means the cross-origin protections between servers has the
|
||||
JupyterHub provides the ability to run single-user servers on their own
|
||||
subdomains. This means the cross-origin protections between servers has the
|
||||
desired effect, and user servers and the Hub are protected from each other. A
|
||||
user's server will be at `username.jupyter.mydomain.com`, etc. This requires
|
||||
all user subdomains to point to the same address, which is most easily
|
||||
user's single-user server will be at `username.jupyter.mydomain.com`. This also
|
||||
requires all user subdomains to point to the same address, which is most easily
|
||||
accomplished with wildcard DNS. Since this spreads the service across multiple
|
||||
domains, you will need wildcard SSL, as well. Unfortunately, for many
|
||||
institutional domains, wildcard DNS and SSL are not available, but if you do
|
||||
plan to serve untrusted users, enabling subdomains is highly encouraged, as it
|
||||
resolves all of the cross-site issues.
|
||||
institutional domains, wildcard DNS and SSL are not available. **If you do plan
|
||||
to serve untrusted users, enabling subdomains is highly encouraged**, as it
|
||||
resolves the cross-site issues.
|
||||
|
||||
### Disabling user config
|
||||
### Steps to take when subdomains can not be used
|
||||
|
||||
If subdomains are not available or not desirable, 0.5 also adds an option
|
||||
`Spawner.disable_user_config`, which you can set to prevent the user-owned
|
||||
configuration files from being loaded. This leaves only package installation
|
||||
and PATHs as things the admin must enforce.
|
||||
#### Disable user config
|
||||
|
||||
For most Spawners, PATH is not something users can influence, but care should
|
||||
be taken to ensure that the Spawn does *not* evaluate shell configuration
|
||||
If subdomains are not available or not desirable, JupyterHub provides a a
|
||||
configuration option `Spawner.disable_user_config`, which can be set to prevent
|
||||
the user-owned configuration files from being loaded. After implementing this
|
||||
option, PATHs and package installation and PATHs are the other things that the
|
||||
admin must enforce.
|
||||
|
||||
#### Prevent spawners from evaluating shell configuration files
|
||||
|
||||
For most Spawners, `PATH` is not something users can influence, but care should
|
||||
be taken to ensure that the Spawner does *not* evaluate shell configuration
|
||||
files prior to launching the server.
|
||||
|
||||
#### Isolate packages using virtualenv
|
||||
|
||||
Package isolation is most easily handled by running the single-user server in
|
||||
a virtualenv with disabled system-site-packages.
|
||||
|
||||
## Extra notes
|
||||
|
||||
It is important to note that the control over the environment only affects the
|
||||
single-user server, and not the environment(s) in which the user's kernel(s)
|
||||
may run. Installing additional packages in the kernel environment does not
|
||||
|
Reference in New Issue
Block a user