Merge branch 'master' into named_servers

Min RK
2017-07-24 15:21:42 +02:00
9 changed files with 284 additions and 214 deletions

@@ -1,146 +0,0 @@
# Security
**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.**
---
**Deprecation note:** Removed `--no-ssl` in version 0.7.
JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to
allow running without SSL using the command `jupyterhub --no-ssl`. The
`--no-ssl` command line option is not needed anymore in version 0.7.
---
Security is the most important aspect of configuring Jupyter. There are four main aspects of the
security configuration:
1. SSL encryption (to enable HTTPS)
2. Cookie secret (a key for encrypting browser cookies)
3. Proxy authentication token (used for the Hub and other services to authenticate to the Proxy)
4. Periodic security audits
*Note* that the **Hub** hashes all secrets (e.g., auth tokens) before storing them in its
database. A loss of control over read-access to the database should have no security impact
on your deployment.
## SSL encryption
Since JupyterHub includes authentication and allows arbitrary code execution, you should not run
it without SSL (HTTPS). This will require you to obtain an official, trusted SSL certificate or
create a self-signed certificate. Once you have obtained and installed a key and certificate you
need to specify their locations in the configuration file as follows:
```python
c.JupyterHub.ssl_key = '/path/to/my.key'
c.JupyterHub.ssl_cert = '/path/to/my.cert'
```
It is also possible to use letsencrypt (https://letsencrypt.org/) to obtain
a free, trusted SSL certificate. If you run letsencrypt using the default
options, the needed configuration is (replace `mydomain.tld` by your fully
qualified domain name):
```python
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
```
If the fully qualified domain name (FQDN) is `example.com`, the following
would be the needed configuration:
```python
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
```
Some cert files also contain the key, in which case only the cert is needed. It is important that
these files be put in a secure location on your server, where they are not readable by regular
users.
Note on **chain certificates**: If you are using a chain certificate, see also
[chained certificate for SSL](troubleshooting.md#chained-certificates-for-ssl) in the JupyterHub troubleshooting FAQ.
Note: In certain cases, e.g. **behind SSL termination in nginx**, allowing no SSL
running on the hub may be desired.
## Cookie secret
The cookie secret is an encryption key, used to encrypt the browser cookies used for
authentication. If this value changes for the Hub, all single-user servers must also be restarted.
Normally, this value is stored in a file, the location of which can be specified in a config file
as follows:
```python
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret'
```
The content of this file should be 32 random bytes, encoded as hex.
An example would be to generate this file with:
```bash
openssl rand -hex 32 > /srv/jupyterhub/cookie_secret
```
In most deployments of JupyterHub, you should point this to a secure location on the file
system, such as `/srv/jupyterhub/cookie_secret`. If the cookie secret file doesn't exist when
the Hub starts, a new cookie secret is generated and stored in the file. The
file must not be readable by group or other or the server won't start.
The recommended permissions for the cookie secret file are 600 (owner-only rw).
If you would like to avoid the need for files, the value can be loaded in the Hub process from
the `JPY_COOKIE_SECRET` environment variable, which is a hex-encoded string. You
can set it this way:
```bash
export JPY_COOKIE_SECRET=`openssl rand -hex 32`
```
For security reasons, this environment variable should only be visible to the Hub.
If you set it dynamically as above, all users will be logged out each time the
Hub starts.
You can also set the cookie secret in the configuration file itself, `jupyterhub_config.py`,
as a binary string:
```python
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
```
## Proxy authentication token
The Hub authenticates its requests to the Proxy using a secret token that
the Hub and Proxy agree upon. The value of this string should be a random
string (for example, generated by `openssl rand -hex 32`). You can pass
this value to the Hub and Proxy using either the `CONFIGPROXY_AUTH_TOKEN`
environment variable:
```bash
export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
```
This environment variable needs to be visible to the Hub and Proxy.
Or you can set the value in the configuration file, `jupyterhub_config.py`:
```python
c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
```
If you don't set the Proxy authentication token, the Hub will generate a random key itself, which
means that any time you restart the Hub you **must also restart the Proxy**. If the proxy is a
subprocess of the Hub, this should happen automatically (this is the default configuration).
Another time you must set the Proxy authentication token yourself is if
you want other services, such as [nbgrader](https://github.com/jupyter/nbgrader)
to also be able to connect to the Proxy.
## Security audits
We recommend that you do periodic reviews of your deployment's security. It's
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
versions up to date.
A handy website for testing your deployment is
[Qualys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).

@@ -0,0 +1,181 @@
Security settings
=================
.. important::
You should not run JupyterHub without SSL encryption on a public network.
Security is the most important aspect of configuring JupyterHub. Three
settings make up the core of the security configuration:
1. :ref:`SSL encryption <ssl-encryption>` (to enable HTTPS)
2. :ref:`Cookie secret <cookie-secret>` (a key for encrypting browser cookies)
3. Proxy :ref:`authentication token <authentication-token>` (used for the Hub and
other services to authenticate to the Proxy)
The Hub hashes all secrets (e.g., auth tokens) before storing them in its
database. A loss of control over read-access to the database should have
minimal impact on your deployment; if your database has been compromised, it
is still a good idea to revoke existing tokens.
.. _ssl-encryption:
Enabling SSL encryption
-----------------------
Since JupyterHub includes authentication and allows arbitrary code execution,
you should not run it without SSL (HTTPS).
Using an SSL certificate
~~~~~~~~~~~~~~~~~~~~~~~~
This will require you to obtain an official, trusted SSL certificate or create a
self-signed certificate. Once you have obtained and installed a key and
certificate you need to specify their locations in the ``jupyterhub_config.py``
configuration file as follows:
.. code-block:: python
c.JupyterHub.ssl_key = '/path/to/my.key'
c.JupyterHub.ssl_cert = '/path/to/my.cert'
Some cert files also contain the key, in which case only the cert is needed. It
is important that these files be put in a secure location on your server, where
they are not readable by regular users.
If you are using a **chain certificate**, see also chained certificate for SSL
in the JupyterHub `troubleshooting FAQ <troubleshooting>`_.
Using letsencrypt
~~~~~~~~~~~~~~~~~
It is also possible to use `letsencrypt <https://letsencrypt.org/>`_ to obtain
a free, trusted SSL certificate. If you run letsencrypt using the default
options, the needed configuration is (replace ``mydomain.tld`` by your fully
qualified domain name):
.. code-block:: python
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem'
If the fully qualified domain name (FQDN) is ``example.com``, the following
would be the needed configuration:
.. code-block:: python
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem'
If SSL termination happens outside of the Hub
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In certain cases, e.g. behind `SSL termination in NGINX <https://www.nginx.com/resources/admin-guide/nginx-ssl-termination/>`_,
running the Hub itself without SSL may be the desired configuration.
.. _cookie-secret:
Cookie secret
-------------
The cookie secret is an encryption key, used to encrypt the browser cookies
which are used for authentication. Three common methods are described for
generating and configuring the cookie secret.
Generating and storing as a cookie secret file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The cookie secret should be 32 random bytes, encoded as hex, and is typically
stored in a ``jupyterhub_cookie_secret`` file. An example command to generate the
``jupyterhub_cookie_secret`` file is:
.. code-block:: bash
openssl rand -hex 32 > /srv/jupyterhub/jupyterhub_cookie_secret
In most deployments of JupyterHub, you should store this file in a secure
location on the file system, such as ``/srv/jupyterhub/jupyterhub_cookie_secret``.
The location of the ``jupyterhub_cookie_secret`` file can be specified in the
``jupyterhub_config.py`` file as follows:
.. code-block:: python
c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/jupyterhub_cookie_secret'
If the cookie secret file doesn't exist when the Hub starts, a new cookie
secret is generated and stored in the file. The file must not be readable by
``group`` or ``other`` or the server won't start. The recommended permissions
for the cookie secret file are ``600`` (owner-only rw).
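As a rough illustration of the file-based approach, the following Python sketch writes a 32-byte, hex-encoded secret with owner-only permissions. The helper name ``write_cookie_secret`` is hypothetical and not part of JupyterHub; it only mirrors what the ``openssl`` command and a ``chmod 600`` would do.

```python
import os
import secrets
import stat


def write_cookie_secret(path):
    """Write 32 random bytes, hex-encoded, to `path` with mode 600.

    Hypothetical helper for illustration; not part of JupyterHub.
    """
    secret = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    # O_EXCL: fail instead of silently overwriting an existing secret.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secret)
    # Set the mode explicitly so it does not depend on the process umask.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return secret
```

Refusing to overwrite an existing file is a deliberate choice in this sketch: replacing the secret invalidates existing login cookies.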
Generating and storing as an environment variable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you would like to avoid the need for files, the value can be loaded in the
Hub process from the ``JPY_COOKIE_SECRET`` environment variable, which is a
hex-encoded string. You can set it this way:
.. code-block:: bash
export JPY_COOKIE_SECRET=`openssl rand -hex 32`
For security reasons, this environment variable should only be visible to the
Hub. If you set it dynamically as above, all users will be logged out each time
the Hub starts.
Generating and storing as a binary string
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can also set the cookie secret in the configuration file
itself, ``jupyterhub_config.py``, as a binary string:
.. code-block:: python
c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING')
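The placeholder above has to be replaced by 64 hexadecimal characters. A small check like the following could catch a malformed value before it reaches the configuration; ``cookie_secret_from_hex`` is a hypothetical helper, not a JupyterHub function.

```python
def cookie_secret_from_hex(hex_string):
    """Decode a hex-encoded cookie secret and check that it is 32 bytes long.

    Hypothetical helper for illustration; not part of JupyterHub.
    """
    secret = bytes.fromhex(hex_string)  # raises ValueError on non-hex input
    if len(secret) != 32:
        raise ValueError(
            "expected 32 bytes (64 hex chars), got %d bytes" % len(secret)
        )
    return secret
```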
.. important::
If the cookie secret value changes for the Hub, all single-user notebook
servers must also be restarted.
.. _authentication-token:
Proxy authentication token
--------------------------
The Hub authenticates its requests to the Proxy using a secret token that
the Hub and Proxy agree upon. The value of this string should be a random
string (for example, generated by ``openssl rand -hex 32``).
Generating and storing token in the configuration file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can set the value in the configuration file, ``jupyterhub_config.py``:
.. code-block:: python
c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
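The deprecation notices in this same commit name ``ConfigurableHTTPProxy.auth_token`` as the replacement for ``JupyterHub.proxy_auth_token`` from version 0.8 on. Under that assumption, the equivalent setting would look roughly like this (``c`` is the config object JupyterHub provides when it loads the file; the token is the same example value as above):

```python
# jupyterhub_config.py
# Newer trait name, per the "DEPRECATED since version 0.8" help strings.
c.ConfigurableHTTPProxy.auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5'
```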
Generating and storing as an environment variable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can also pass the proxy authentication token to the Hub and Proxy
using the ``CONFIGPROXY_AUTH_TOKEN`` environment variable:
.. code-block:: bash
export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
This environment variable needs to be visible to the Hub and Proxy.
Default if token is not set
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you don't set the Proxy authentication token, the Hub will generate a random
key itself, which means that any time you restart the Hub you **must also
restart the Proxy**. If the proxy is a subprocess of the Hub, this should happen
automatically (this is the default configuration).

@@ -1,80 +1,112 @@
# Web Security in JupyterHub
# Security Overview
JupyterHub is designed to be a simple multi-user server for modestly sized
groups of semi-trusted users. While the design reflects serving semi-trusted
users, JupyterHub is not necessarily unsuitable for serving untrusted users.
Using JupyterHub with untrusted users does mean more work and much care is
required to secure a Hub against untrusted users, with extra caution on
The **Security Overview** section helps you learn about:
- the design of JupyterHub with respect to web security
- the semi-trusted user
- the available mitigations to protect untrusted users from each other
- the value of periodic security audits.
This overview also helps you obtain a deeper understanding of how JupyterHub
works.
## Semi-trusted and untrusted users
JupyterHub is designed to be a *simple multi-user server for modestly sized
groups* of **semi-trusted** users. While the design reflects serving semi-trusted
users, JupyterHub is not necessarily unsuitable for serving **untrusted** users.
Using JupyterHub with **untrusted** users does mean more work by the
administrator. Much care is required to secure a Hub, with extra caution on
protecting users from each other as the Hub is serving untrusted users.
One aspect of JupyterHub's design simplicity for semi-trusted users is that
the Hub and single-user servers are placed in a single domain, behind a
[proxy][configurable-http-proxy]. As a result, if the Hub is serving untrusted
One aspect of JupyterHub's *design simplicity* for **semi-trusted** users is that
the Hub and single-user servers are placed in a *single domain*, behind a
[*proxy*][configurable-http-proxy]. If the Hub is serving untrusted
users, many of the web's cross-site protections are not applied between
single-user servers and the Hub, or between single-user servers and each
other, since browsers see the whole thing (proxy, Hub, and single user
servers) as a single website.
servers) as a single website (i.e. single domain).
To protect users from each other, a user must never be able to write arbitrary
## Protect users from each other
To protect users from each other, a user must **never** be able to write arbitrary
HTML and serve it to another user on the Hub's domain. JupyterHub's
authentication setup prevents this because only the owner of a given
single-user server is allowed to view user-authored pages served by their
server. To protect all users from each other, JupyterHub administrators must
authentication setup prevents a user writing arbitrary HTML and serving it to
another user because only the owner of a given single-user notebook server is
allowed to view user-authored pages served by the given single-user notebook
server.
To protect all users from each other, JupyterHub administrators must
ensure that:
* A user does not have permission to modify their single-user server:
- A user may not install new packages in the Python environment that runs
their server.
- If the PATH is used to resolve the single-user executable (instead of an
absolute path), a user may not create new files in any PATH directory
that precedes the directory containing jupyterhub-singleuser.
* A user **does not have permission** to modify their single-user notebook server,
including:
- A user **may not** install new packages in the Python environment that runs
their single-user server.
- If the `PATH` is used to resolve the single-user executable (instead of
using an absolute path), a user **may not** create new files in any `PATH`
directory that precedes the directory containing `jupyterhub-singleuser`.
- A user may not modify environment variables (e.g. PATH, PYTHONPATH) for
their single-user server.
* A user may not modify the configuration of the notebook server
(the ~/.jupyter or JUPYTER_CONFIG_DIR directory).
* A user **may not** modify the configuration of the notebook server
(the `~/.jupyter` or `JUPYTER_CONFIG_DIR` directory).
If any additional services are run on the same domain as the Hub, the services
must never display user-authored HTML that is neither sanitized nor sandboxed
**must never** display user-authored HTML that is neither *sanitized* nor *sandboxed*
(e.g. IFramed) to any user that lacks authentication as the author of a file.
## Mitigate security issues
## Mitigations
Several approaches to mitigating these issues with configuration
options provided by JupyterHub include:
There are two main configuration options provided by JupyterHub to mitigate
these issues:
### Enable subdomains
### Subdomains
JupyterHub 0.5 adds the ability to run single-user servers on their own
subdomains, which means the cross-origin protections between servers have the
JupyterHub provides the ability to run single-user servers on their own
subdomains. This means the cross-origin protections between servers have the
desired effect, and user servers and the Hub are protected from each other. A
user's server will be at `username.jupyter.mydomain.com`, etc. This requires
all user subdomains to point to the same address, which is most easily
user's single-user server will be at `username.jupyter.mydomain.com`. This also
requires all user subdomains to point to the same address, which is most easily
accomplished with wildcard DNS. Since this spreads the service across multiple
domains, you will need wildcard SSL, as well. Unfortunately, for many
institutional domains, wildcard DNS and SSL are not available, but if you do
plan to serve untrusted users, enabling subdomains is highly encouraged, as it
resolves all of the cross-site issues.
institutional domains, wildcard DNS and SSL are not available. **If you do plan
to serve untrusted users, enabling subdomains is highly encouraged**, as it
resolves the cross-site issues.
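A minimal sketch of enabling subdomains in `jupyterhub_config.py`, assuming wildcard DNS and a wildcard SSL certificate are already in place for the example host used in this section (`c` is the config object JupyterHub provides when it loads the file):

```python
# jupyterhub_config.py
# With this set, a user's server is expected at username.jupyter.mydomain.com.
# Assumes *.jupyter.mydomain.com resolves to the Hub's public address.
c.JupyterHub.subdomain_host = 'https://jupyter.mydomain.com'
```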
### Disabling user config
### Disable user config
If subdomains are not available or not desirable, 0.5 also adds an option
`Spawner.disable_user_config`, which you can set to prevent the user-owned
configuration files from being loaded. This leaves only package installation
and PATHs as things the admin must enforce.
If subdomains are not available or not desirable, JupyterHub provides a
configuration option `Spawner.disable_user_config`, which can be set to prevent
the user-owned configuration files from being loaded. After enabling this
option, package installation and PATHs are the remaining things that the
admin must enforce.
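In `jupyterhub_config.py` this is a one-line setting (`c` is the config object JupyterHub provides when it loads the file):

```python
# jupyterhub_config.py
# Do not load configuration from the user's ~/.jupyter or JUPYTER_CONFIG_DIR.
c.Spawner.disable_user_config = True
```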
For most Spawners, PATH is not something users can influence, but care should
be taken to ensure that the Spawn does *not* evaluate shell configuration
### Prevent spawners from evaluating shell configuration files
For most Spawners, `PATH` is not something users can influence, but care should
be taken to ensure that the Spawner does *not* evaluate shell configuration
files prior to launching the server.
Package isolation is most easily handled by running the single-user server in
a virtualenv with disabled system-site-packages.
### Isolate packages using virtualenv
## Extra notes
Package isolation is most easily handled by running the single-user server in
a virtualenv with disabled system-site-packages. The user should not have
permission to install packages into this environment.
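One way to create such an environment is Python's standard-library `venv` module, sketched below. The function name `create_isolated_env` is hypothetical, and `with_pip=False` only keeps the example minimal; an administrator would still need some way to install `jupyterhub-singleuser` and its dependencies into the environment, and to keep the directory unwritable by users.

```python
import venv


def create_isolated_env(env_dir):
    """Create a virtual environment that cannot see system site-packages.

    Hypothetical helper for illustration; not part of JupyterHub.
    """
    builder = venv.EnvBuilder(system_site_packages=False, with_pip=False)
    builder.create(env_dir)
```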
It is important to note that the control over the environment only affects the
single-user server, and not the environment(s) in which the user's kernel(s)
may run. Installing additional packages in the kernel environment does not
pose additional risk to the web application's security.
## Security audits
We recommend that you do periodic reviews of your deployment's security. It's
good practice to keep JupyterHub, configurable-http-proxy, and nodejs
versions up to date.
A handy website for testing your deployment is
[Qualys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html).
[configurable-http-proxy]: https://github.com/jupyterhub/configurable-http-proxy

@@ -14,8 +14,6 @@ import os
import re
import shutil
import signal
import socket
from subprocess import Popen
import sys
from textwrap import dedent
import threading
@@ -358,14 +356,14 @@ class JupyterHub(Application):
).tag(config=True)
proxy_cmd = Command([], config=True,
help="DEPRECATED. Use ConfigurableHTTPProxy.command",
help="DEPRECATED since version 0.8. Use ConfigurableHTTPProxy.command",
).tag(config=True)
debug_proxy = Bool(False,
help="DEPRECATED: Use ConfigurableHTTPProxy.debug",
help="DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.debug",
).tag(config=True)
proxy_auth_token = Unicode(
help="DEPRECATED: Use ConfigurableHTTPProxy.auth_token"
help="DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.auth_token"
).tag(config=True)
_proxy_config_map = {
@@ -380,10 +378,10 @@ class JupyterHub(Application):
self.config.ConfigurableHTTPProxy[dest] = change.new
proxy_api_ip = Unicode(
help="DEPRECATED: Use ConfigurableHTTPProxy.api_url"
help="DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.api_url"
).tag(config=True)
proxy_api_port = Integer(
help="DEPRECATED: Use ConfigurableHTTPProxy.api_url"
help="DEPRECATED since version 0.8: Use ConfigurableHTTPProxy.api_url"
).tag(config=True)
@observe('proxy_api_port', 'proxy_api_ip')
def _deprecated_proxy_api(self, change):
@@ -465,7 +463,8 @@ class JupyterHub(Application):
@observe('api_tokens')
def _deprecate_api_tokens(self, change):
self.log.warning("JupyterHub.api_tokens is pending deprecation."
self.log.warning("JupyterHub.api_tokens is pending deprecation"
" since JupyterHub version 0.8."
" Consider using JupyterHub.service_tokens."
" If you have a use case for services that identify as users,"
" let us know: https://github.com/jupyterhub/jupyterhub/issues"
@@ -573,7 +572,7 @@ class JupyterHub(Application):
"""
).tag(config=True)
admin_users = Set(
help="""DEPRECATED, use Authenticator.admin_users instead."""
help="""DEPRECATED since version 0.7.2, use Authenticator.admin_users instead."""
).tag(config=True)
tornado_settings = Dict(
@@ -866,7 +865,7 @@ class JupyterHub(Application):
if self.admin_users and not self.authenticator.admin_users:
self.log.warning(
"\nJupyterHub.admin_users is deprecated."
"\nJupyterHub.admin_users is deprecated since version 0.7.2."
"\nUse Authenticator.admin_users instead."
)
self.authenticator.admin_users = self.admin_users
@@ -1170,7 +1169,6 @@ class JupyterHub(Application):
self.session_factory,
url_prefix=url_path_join(base_url, 'api/oauth2'),
login_url=url_path_join(base_url, 'login')
,
)
def init_proxy(self):

@@ -98,6 +98,8 @@ class BaseHandler(RequestHandler):
def finish(self, *args, **kwargs):
"""Roll back any uncommitted transactions from the handler."""
if self.db.dirty:
self.log.warning("Rolling back dirty objects %s", self.db.dirty)
self.db.rollback()
super().finish(*args, **kwargs)

@@ -87,11 +87,11 @@ class LoginHandler(BaseHandler):
authenticated = yield self.authenticate(data)
auth_timer.stop(send=False)
if authenticated:
# unpack auth dict
username = authenticated['name']
auth_state = authenticated.get('auth_state')
if authenticated:
self.statsd.incr('login.success')
self.statsd.timing('login.authenticate.success', auth_timer.ms)
user = self.user_from_username(username)
@@ -101,7 +101,7 @@ class LoginHandler(BaseHandler):
already_running = False
if user.spawner:
status = yield user.spawner.poll()
already_running = (status == None)
already_running = (status is None)
if not already_running and not user.spawner.options_form:
yield self.spawn_single_user(user)
self.set_login_cookie(user)
@@ -117,7 +117,7 @@ class LoginHandler(BaseHandler):
self.log.debug("Failed login for %s", data.get('username', 'unknown user'))
html = self._render(
login_error='Invalid username or password',
username=username,
username=data['username'],
)
self.finish(html)
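
The `status == None` to `status is None` change in this file swaps an equality test for an identity test. A small sketch of why that is the more robust check, using a hypothetical `AlwaysEqual` class that is not part of JupyterHub:

```python
class AlwaysEqual:
    """Hypothetical status object with an overly permissive __eq__."""

    def __eq__(self, other):
        return True


status = AlwaysEqual()
equals_none = (status == None)  # noqa: E711 -- True, via the custom __eq__
is_none = (status is None)      # False: identity cannot be overridden
```

Equality can be redefined by the object being compared, while `is` only asks whether the value is the `None` singleton itself.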

@@ -82,7 +82,11 @@ class Server(HasTraits):
# setter to pass through to the database
@observe('ip', 'proto', 'port', 'base_url', 'cookie_name')
def _change(self, change):
if self.orm_server:
if self.orm_server and getattr(self.orm_server, change.name) != change.new:
# setattr on an sqlalchemy object sets the dirty flag,
# even if the value doesn't change.
# Avoid calling setattr when there's been no change,
# to avoid setting the dirty flag and triggering rollback.
setattr(self.orm_server, change.name, change.new)
@property
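
The guard added to `_change` can be illustrated without a database. The sketch below assumes, as the comment in the hunk states, that any attribute assignment on the mapped object marks it dirty; `FakeORMObject` and `sync_attr` are hypothetical stand-ins, not SQLAlchemy or JupyterHub APIs.

```python
class FakeORMObject:
    """Stand-in for an ORM-mapped object: any assignment marks it dirty."""

    def __init__(self, **values):
        self.__dict__.update(values)  # bypass __setattr__ during setup
        self.__dict__['dirty'] = False

    def __setattr__(self, name, value):
        self.__dict__[name] = value
        self.__dict__['dirty'] = True


def sync_attr(orm_obj, name, new_value):
    """Write through to orm_obj only when the value actually changes."""
    if orm_obj is not None and getattr(orm_obj, name) != new_value:
        setattr(orm_obj, name, new_value)


server = FakeORMObject(ip='127.0.0.1', port=8000)
sync_attr(server, 'port', 8000)   # unchanged value: setattr is skipped
unchanged_dirty = server.dirty
sync_attr(server, 'port', 8081)   # changed value: setattr runs
changed_dirty = server.dirty
```

Skipping the no-op assignment keeps the object clean, so a handler that finishes without committing has nothing to roll back.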

@@ -309,6 +309,8 @@ class Proxy(LoggingConfigurable):
self.log.warning(
"Adding missing route for %s (%s)", spec, spawner.server)
futures.append(self.add_user(user, name))
elif spawner._proxy_pending:
good_routes.add(user.proxy_spec(name))
# check service routes
service_routes = {r['data']['service']

@@ -72,10 +72,7 @@ def test_admin_not_admin(app):
assert r.status_code == 403
def test_admin(app):
cookies = app.login_user('river')
u = orm.User.find(app.db, 'river')
u.admin = True
app.db.commit()
cookies = app.login_user('admin')
r = get_page('admin', app, cookies=cookies)
r.raise_for_status()
assert r.url.endswith('/admin')