diff --git a/README.md b/README.md index a959008f..2e330d77 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,7 @@ [![PyPI](https://img.shields.io/pypi/v/jupyterhub.svg)](https://pypi.python.org/pypi/jupyterhub) [![Documentation Status](https://readthedocs.org/projects/jupyterhub/badge/?version=latest)](http://jupyterhub.readthedocs.org/en/latest/?badge=latest) +[![Documentation Status](http://readthedocs.org/projects/jupyterhub/badge/?version=0.7.2)](http://jupyterhub.readthedocs.io/en/0.7.2/?badge=0.7.2) [![Build Status](https://travis-ci.org/jupyterhub/jupyterhub.svg?branch=master)](https://travis-ci.org/jupyterhub/jupyterhub) [![Circle CI](https://circleci.com/gh/jupyterhub/jupyterhub.svg?style=shield&circle-token=b5b65862eb2617b9a8d39e79340b0a6b816da8cc)](https://circleci.com/gh/jupyterhub/jupyterhub) [![codecov.io](https://codecov.io/github/jupyterhub/jupyterhub/coverage.svg?branch=master)](https://codecov.io/github/jupyterhub/jupyterhub?branch=master) @@ -51,7 +52,9 @@ for administration of the Hub and its users. ### Check prerequisites -- [Python](https://www.python.org/downloads/) 3.3 or greater +A Linux/Unix based system with the following: + +- [Python](https://www.python.org/downloads/) 3.4 or greater - [nodejs/npm](https://www.npmjs.com/) Install a recent version of [nodejs/npm](https://docs.npmjs.com/getting-started/installing-node) For example, install it on Linux (Debian/Ubuntu) using: @@ -205,6 +208,20 @@ We use [pytest](http://doc.pytest.org/en/latest/) for **running tests**: pytest jupyterhub/tests ``` +### A note about platform support + +JupyterHub is supported on Linux/Unix based systems. + +JupyterHub officially **does not** support Windows. You may be able to use +JupyterHub on Windows if you use a Spawner and Authenticator that work on +Windows, but the JupyterHub defaults will not. Bugs reported on Windows will not +be accepted, and the test suite will not run on Windows. 
Small patches that fix +minor Windows compatibility issues (such as basic installation) **may** be accepted, +however. For Windows-based systems, we would recommend running JupyterHub in a +docker container or Linux VM. + +[Additional Reference:](http://www.tornadoweb.org/en/stable/#installation) Tornado's documentation on Windows platform support + ## License We use a shared copyright model that enables all contributors to maintain the diff --git a/bower.json b/bower.json index ce78039e..f810ad29 100644 --- a/bower.json +++ b/bower.json @@ -2,10 +2,10 @@ "name": "jupyterhub-deps", "version": "0.0.0", "dependencies": { - "bootstrap": "components/bootstrap#~3.1", - "font-awesome": "components/font-awesome#~4.1", - "jquery": "components/jquery#~2.0", - "moment": "~2.7", - "requirejs": "~2.1" + "bootstrap": "components/bootstrap#~3.3", + "font-awesome": "components/font-awesome#~4.7", + "jquery": "components/jquery#~3.2", + "moment": "~2.18", + "requirejs": "~2.3" } } diff --git a/docs/source/authenticators-users-basics.md b/docs/source/authenticators-users-basics.md new file mode 100644 index 00000000..d70edc0b --- /dev/null +++ b/docs/source/authenticators-users-basics.md @@ -0,0 +1,75 @@ +# Authentication and Users + +The default Authenticator uses [PAM][] to authenticate system users with +their username and password. The default behavior of this Authenticator +is to allow any user with an account and password on the system to login. + +## Creating a whitelist of users + +You can restrict which users are allowed to login with `Authenticator.whitelist`: + + +```python +c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'} +``` + +Users listed in the whitelist are added to the Hub database when the Hub is +started. 
+ +## Managing Hub administrators + +### Configuring admins (`admin_users`) + +Admin users of JupyterHub, `admin_users`, have the ability to add and remove +users from the user `whitelist` or to take actions on the users' behalf, +such as stopping and restarting their servers. + +A set of initial admin users, `admin_users`, can be configured as follows: + +```python +c.Authenticator.admin_users = {'mal', 'zoe'} +``` +Users in the admin list are automatically added to the user `whitelist`, +if they are not already present. + +### Admin access to other users' notebook servers (`admin_access`) + +By default the admin users do not have permission to log in *as other users* +since the default `JupyterHub.admin_access` setting is False. +If `JupyterHub.admin_access` is set to True, then admin users have permission +to log in *as other users* on their respective machines, for debugging. +**You should make sure your users know if admin_access is enabled.** + +Note: additional configuration examples are provided in this guide's +[Configuration Examples section](./config-examples.html). + +### Add or remove users from the Hub + +Users can be added to and removed from the Hub via either the admin panel or +REST API. + +If a user is **added**, the user will be automatically added to the whitelist +and database. Restarting the Hub will not require manually updating the +whitelist in your config file, as the users will be loaded from the database. + +After starting the Hub once, it is not sufficient to **remove** a user from +the whitelist in your config file. You must also remove the user from the Hub's +database, either by deleting the user from the admin page, or you can clear +the `jupyterhub.sqlite` database and start fresh. + +The default `PAMAuthenticator` is one case of a special kind of authenticator, called a +`LocalAuthenticator`, indicating that it manages users on the local system. 
When you add a user to +the Hub, a `LocalAuthenticator` checks if that user already exists. Normally, there will be an +error telling you that the user doesn't exist. If you set the configuration value + +```python +c.LocalAuthenticator.create_system_users = True +``` + +however, adding a user to the Hub that doesn't already exist on the system will result in the Hub +creating that user via the system `adduser` command line tool. This option is typically used on +hosted deployments of JupyterHub, to avoid the need to manually create all your users before +launching the service. It is not recommended when running JupyterHub in situations where +JupyterHub users map directly onto UNIX users. + +[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module diff --git a/docs/source/changelog.md b/docs/source/changelog.md index cdad1104..859289d2 100644 --- a/docs/source/changelog.md +++ b/docs/source/changelog.md @@ -1,4 +1,4 @@ -# Change log summary +# Changelog For detailed changes from the prior release, click on the version number, and its link will bring up a GitHub listing of changes. Use `git log` on the @@ -7,6 +7,16 @@ command line for details. ## [Unreleased] 0.8 +#### Added + +#### Changed + +#### Fixed + +#### Removed + +- End support for Python 3.3 + ## 0.7 ### [0.7.2] - 2017-01-09 diff --git a/docs/source/config-basics.md b/docs/source/config-basics.md new file mode 100644 index 00000000..e1b798d4 --- /dev/null +++ b/docs/source/config-basics.md @@ -0,0 +1,69 @@ +# Configuration Basics + +The [getting started document](docs/source/getting-started.md) contains +general information about configuring a JupyterHub deployment and the +[configuration reference](docs/source/configuration-guide.md) provides more +comprehensive detail. + +## JupyterHub configuration + +Configuration parameters may be set by: +- a configuration file `jupyterhub_config.py`, or +- options passed on the command line. 
+ +### Generate a default config file + +On startup, JupyterHub will look by default for a configuration file named +`jupyterhub_config.py` in the current working directory. + +To generate a default config file `jupyterhub_config.py`: + +```bash +jupyterhub --generate-config +``` + +This default `jupyterhub_config.py` file contains comments and guidance for all +configuration variables and their default values. + +### Configure using command line options + +To display all command line options that are available for configuration: + + jupyterhub --help-all + +Configuration using the command line options is done when launching JupyterHub. +For example, to start JupyterHub on ``10.0.1.2:443`` with **https**, you +would enter: + + jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert + +All configurable options are technically configurable on the command-line, +even if some are really inconvenient to type. Just replace the desired option, +`c.Class.trait`, with `--Class.trait`. For example, to configure the +`c.Spawner.notebook_dir` trait from the command-line, use the +`--Spawner.notebook_dir` option: + +```bash +jupyterhub --Spawner.notebook_dir='~/assignments' +``` + +### Load a specific config file + +You can load a specific config file with: + +```bash +jupyterhub -f /path/to/jupyterhub_config.py +``` + +See also: [general docs](http://ipython.org/ipython-doc/dev/development/config.html) +on the config system Jupyter uses. + +### Configuration for different deployment environments + +The default authentication and process spawning mechanisms can be replaced, +which allows plugging into a variety of authentication methods or process +control and deployment environments. 
Some examples, meant as illustration and +testing of this concept, are: + +- Using GitHub OAuth instead of PAM with [OAuthenticator](https://github.com/jupyterhub/oauthenticator) +- Spawning single-user servers with Docker, using the [DockerSpawner](https://github.com/jupyterhub/dockerspawner) diff --git a/docs/source/configuration-guide.rst b/docs/source/configuration-guide.rst index 3a09760d..8973dc42 100644 --- a/docs/source/configuration-guide.rst +++ b/docs/source/configuration-guide.rst @@ -1,12 +1,14 @@ -Configuration Guide -=================== +Configuration Reference +======================= .. toctree:: :maxdepth: 2 + howitworks + websecurity + rest authenticators spawners services - config-examples upgrading - troubleshooting + config-examples diff --git a/docs/source/getting-started.md b/docs/source/getting-started.md deleted file mode 100644 index f5c36179..00000000 --- a/docs/source/getting-started.md +++ /dev/null @@ -1,542 +0,0 @@ -# Getting started with JupyterHub - -This section contains getting started information on the following topics: - -- [Technical Overview](#technical-overview) -- [Installation](#installation) -- [Configuration](#configuration) -- [Networking](#networking) -- [Security](#security) -- [Authentication and users](#authentication-and-users) -- [Spawners and single-user notebook servers](#spawners-and-single-user-notebook-servers) -- [External Services](#external-services) - - -## Technical Overview - -JupyterHub is a set of processes that together provide a single user Jupyter -Notebook server for each person in a group. - -### Three subsystems -Three major subsystems run by the `jupyterhub` command line program: - -- **Single-User Notebook Server**: a dedicated, single-user, Jupyter Notebook server is - started for each user on the system when the user logs in. The object that - starts these servers is called a **Spawner**. 
-- **Proxy**: the public facing part of JupyterHub that uses a dynamic proxy - to route HTTP requests to the Hub and Single User Notebook Servers. -- **Hub**: manages user accounts, authentication, and coordinates Single User - Notebook Servers using a Spawner. - -![JupyterHub subsystems](images/jhub-parts.png) - - -### Deployment server -To use JupyterHub, you need a Unix server (typically Linux) running somewhere -that is accessible to your team on the network. The JupyterHub server can be -on an internal network at your organization, or it can run on the public -internet (in which case, take care with the Hub's -[security](#security)). - -### Basic operation -Users access JupyterHub through a web browser, by going to the IP address or -the domain name of the server. - -Basic principles of operation: - -* Hub spawns proxy -* Proxy forwards all requests to hub by default -* Hub handles login, and spawns single-user servers on demand -* Hub configures proxy to forward url prefixes to single-user servers - -Different **[authenticators](authenticators.html)** control access -to JupyterHub. The default one (PAM) uses the user accounts on the server where -JupyterHub is running. If you use this, you will need to create a user account -on the system for each user on your team. Using other authenticators, you can -allow users to sign in with e.g. a GitHub account, or with any single-sign-on -system your organization has. - -Next, **[spawners](spawners.html)** control how JupyterHub starts -the individual notebook server for each user. The default spawner will -start a notebook server on the same machine running under their system username. -The other main option is to start each server in a separate container, often -using Docker. 
- -### Default behavior - -**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.** - -See [Security documentation](#security) for how to configure JupyterHub to use SSL, -or put it behind SSL termination in another proxy server, such as nginx. - ---- - -**Deprecation note:** Removed `--no-ssl` in version 0.7. - -JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to -allow running without SSL using the command `jupyterhub --no-ssl`. The -`--no-ssl` command line option is not needed anymore in version 0.7. - ---- - -To start JupyterHub in its default configuration, type the following at the command line: - -```bash - sudo jupyterhub -``` - -The default Authenticator that ships with JupyterHub authenticates users -with their system name and password (via [PAM][]). -Any user on the system with a password will be allowed to start a single-user notebook server. - -The default Spawner starts servers locally as each user, one dedicated server per user. -These servers listen on localhost, and start in the given user's home directory. - -By default, the **Proxy** listens on all public interfaces on port 8000. -Thus you can reach JupyterHub through either: - -- `http://localhost:8000` -- or any other public IP or domain pointing to your system. - -In their default configuration, the other services, the **Hub** and **Single-User Servers**, -all communicate with each other on localhost only. - -By default, starting JupyterHub will write two files to disk in the current working directory: - -- `jupyterhub.sqlite` is the sqlite database containing all of the state of the **Hub**. - This file allows the **Hub** to remember what users are running and where, - as well as other information enabling you to restart parts of JupyterHub separately. It is - important to note that this database contains *no* sensitive information other than **Hub** - usernames. 
-- `jupyterhub_cookie_secret` is the encryption key used for securing cookies. - This file needs to persist in order for restarting the Hub server to avoid invalidating cookies. - Conversely, deleting this file and restarting the server effectively invalidates all login cookies. - The cookie secret file is discussed in the [Cookie Secret documentation](#cookie-secret). - -The location of these files can be specified via configuration, discussed below. - -## Installation - -See the project's [README](https://github.com/jupyterhub/jupyterhub/blob/master/README.md) -for help installing JupyterHub. - -### Planning your installation - -Prior to beginning installation, it's helpful to consider some of the following: -- deployment system (bare metal, Docker) -- Authentication (PAM, OAuth, etc.) -- Spawner of singleuser notebook servers (Docker, Batch, etc.) -- Services (nbgrader, etc.) -- JupyterHub database (default SQLite; traditional RDBMS such as PostgreSQL,) - MySQL, or other databases supported by [SQLAlchemy](http://www.sqlalchemy.org)) - -### Folders and File Locations - -It is recommended to put all of the files used by JupyterHub into standard -UNIX filesystem locations. - -* `/srv/jupyterhub` for all security and runtime files -* `/etc/jupyterhub` for all configuration files -* `/var/log` for log files - -## Configuration - -JupyterHub is configured in two ways: - -1. Configuration file -2. Command-line arguments - -### Configuration file -By default, JupyterHub will look for a configuration file (which may not be created yet) -named `jupyterhub_config.py` in the current working directory. -You can create an empty configuration file with: - -```bash -jupyterhub --generate-config -``` - -This empty configuration file has descriptions of all configuration variables and their default -values. 
You can load a specific config file with: - -```bash -jupyterhub -f /path/to/jupyterhub_config.py -``` - -See also: [general docs](http://ipython.org/ipython-doc/dev/development/config.html) -on the config system Jupyter uses. - -### Command-line arguments -Type the following for brief information about the command-line arguments: - -```bash -jupyterhub -h -``` - -or: - -```bash -jupyterhub --help-all -``` - -for the full command line help. - -All configurable options are technically configurable on the command-line, -even if some are really inconvenient to type. Just replace the desired option, -`c.Class.trait`, with `--Class.trait`. For example, to configure the -`c.Spawner.notebook_dir` trait from the command-line: - -```bash -jupyterhub --Spawner.notebook_dir='~/assignments' -``` - -## Networking - -### Configuring the Proxy's IP address and port -The Proxy's main IP address setting determines where JupyterHub is available to users. -By default, JupyterHub is configured to be available on all network interfaces -(`''`) on port 8000. **Note**: Use of `'*'` is discouraged for IP configuration; -instead, use of `'0.0.0.0'` is preferred. - -Changing the IP address and port can be done with the following command line -arguments: - -```bash -jupyterhub --ip=192.168.1.2 --port=443 -``` - -Or by placing the following lines in a configuration file: - -```python -c.JupyterHub.ip = '192.168.1.2' -c.JupyterHub.port = 443 -``` - -Port 443 is used as an example since 443 is the default port for SSL/HTTPS. - -Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub. -However, more customized scenarios may need additional networking details to -be configured. - - -### Configuring the Proxy's REST API communication IP address and port (optional) -The Hub service talks to the proxy via a REST API on a secondary port, -whose network interface and port can be configured separately. 
-By default, this REST API listens on port 8081 of localhost only. - -If running the Proxy separate from the Hub, -configure the REST API communication IP address and port with: - -```python -# ideally a private network address -c.JupyterHub.proxy_api_ip = '10.0.1.4' -c.JupyterHub.proxy_api_port = 5432 -``` - -### Configuring the Hub if Spawners or Proxy are remote or isolated in containers -The Hub service also listens only on localhost (port 8080) by default. -The Hub needs needs to be accessible from both the proxy and all Spawners. -When spawning local servers, an IP address setting of localhost is fine. -If *either* the Proxy *or* (more likely) the Spawners will be remote or -isolated in containers, the Hub must listen on an IP that is accessible. - -```python -c.JupyterHub.hub_ip = '10.0.1.4' -c.JupyterHub.hub_port = 54321 -``` - -## Security - -**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.** - ---- - -**Deprecation note:** Removed `--no-ssl` in version 0.7. - -JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to -allow running without SSL using the command `jupyterhub --no-ssl`. The -`--no-ssl` command line option is not needed anymore in version 0.7. - ---- - -Security is the most important aspect of configuring Jupyter. There are four main aspects of the -security configuration: - -1. SSL encryption (to enable HTTPS) -2. Cookie secret (a key for encrypting browser cookies) -3. Proxy authentication token (used for the Hub and other services to authenticate to the Proxy) -4. Periodic security audits - -*Note* that the **Hub** hashes all secrets (e.g., auth tokens) before storing them in its -database. A loss of control over read-access to the database should have no security impact -on your deployment. - -### SSL encryption - -Since JupyterHub includes authentication and allows arbitrary code execution, you should not run -it without SSL (HTTPS). 
This will require you to obtain an official, trusted SSL certificate or -create a self-signed certificate. Once you have obtained and installed a key and certificate you -need to specify their locations in the configuration file as follows: - -```python -c.JupyterHub.ssl_key = '/path/to/my.key' -c.JupyterHub.ssl_cert = '/path/to/my.cert' -``` - -It is also possible to use letsencrypt (https://letsencrypt.org/) to obtain -a free, trusted SSL certificate. If you run letsencrypt using the default -options, the needed configuration is (replace `mydomain.tld` by your fully -qualified domain name): - -```python -c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem' -c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem' -``` - -If the fully qualified domain name (FQDN) is `example.com`, the following -would be the needed configuration: - -```python -c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem' -c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem' -``` - -Some cert files also contain the key, in which case only the cert is needed. It is important that -these files be put in a secure location on your server, where they are not readable by regular -users. - -Note on **chain certificates**: If you are using a chain certificate, see also -[chained certificate for SSL](troubleshooting.md#chained-certificates-for-ssl) in the JupyterHub troubleshooting FAQ). - -Note: In certain cases, e.g. **behind SSL termination in nginx**, allowing no SSL -running on the hub may be desired. - -### Cookie secret - -The cookie secret is an encryption key, used to encrypt the browser cookies used for -authentication. If this value changes for the Hub, all single-user servers must also be restarted. 
-Normally, this value is stored in a file, the location of which can be specified in a config file -as follows: - -```python -c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret' -``` - -The content of this file should be 32 random bytes, encoded as hex. -An example would be to generate this file with: - -```bash -openssl rand -hex 32 > /srv/jupyterhub/cookie_secret -``` - -In most deployments of JupyterHub, you should point this to a secure location on the file -system, such as `/srv/jupyterhub/cookie_secret`. If the cookie secret file doesn't exist when -the Hub starts, a new cookie secret is generated and stored in the file. The -file must not be readable by group or other or the server won't start. -The recommended permissions for the cookie secret file are 600 (owner-only rw). - - -If you would like to avoid the need for files, the value can be loaded in the Hub process from -the `JPY_COOKIE_SECRET` environment variable, which is a hex-encoded string. You -can set it this way: - -```bash -export JPY_COOKIE_SECRET=`openssl rand -hex 32` -``` - -For security reasons, this environment variable should only be visible to the Hub. -If you set it dynamically as above, all users will be logged out each time the -Hub starts. - -You can also set the cookie secret in the configuration file itself,`jupyterhub_config.py`, -as a binary string: - -```python -c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING') -``` - -### Proxy authentication token - -The Hub authenticates its requests to the Proxy using a secret token that -the Hub and Proxy agree upon. The value of this string should be a random -string (for example, generated by `openssl rand -hex 32`). You can pass -this value to the Hub and Proxy using either the `CONFIGPROXY_AUTH_TOKEN` -environment variable: - -```bash -export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32` -``` - -This environment variable needs to be visible to the Hub and Proxy. 
- -Or you can set the value in the configuration file, `jupyterhub_config.py`: - -```python -c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5' -``` - -If you don't set the Proxy authentication token, the Hub will generate a random key itself, which -means that any time you restart the Hub you **must also restart the Proxy**. If the proxy is a -subprocess of the Hub, this should happen automatically (this is the default configuration). - -Another time you must set the Proxy authentication token yourself is if -you want other services, such as [nbgrader](https://github.com/jupyter/nbgrader) -to also be able to connect to the Proxy. - -### Security audits - -We recommend that you do periodic reviews of your deployment's security. It's -good practice to keep JupyterHub, configurable-http-proxy, and nodejs -versions up to date. - -A handy website for testing your deployment is -[Qualsys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html). - -## Authentication and users - -The default Authenticator uses [PAM][] to authenticate system users with -their username and password. The default behavior of this Authenticator -is to allow any user with an account and password on the system to login. - -### Creating a whitelist of users - -You can restrict which users are allowed to login with `Authenticator.whitelist`: - - -```python -c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'} -``` - -Users listed in the whitelist are added to the Hub database when the Hub is -started. - -### Managing Hub administrators - -#### Configuring admins (`admin_users`) - -Admin users of JupyterHub, `admin_users`, have the ability to add and remove -users from the user `whitelist` or to take actions on the users' behalf, -such as stopping and restarting their servers. 
- -A set of initial admin users, `admin_users` can configured be as follows: - -```python -c.Authenticator.admin_users = {'mal', 'zoe'} -``` -Users in the admin list are automatically added to the user `whitelist`, -if they are not already present. - -#### Admin access to other users' notebook servers (`admin_access`) - -By default the admin users do not have permission to log in *as other users* -since the default `JupyterHub.admin_access` setting is False. -If `JupyterHub.admin_access` is set to True, then admin users have permission -to log in *as other users* on their respective machines, for debugging. -**You should make sure your users know if admin_access is enabled.** - -Note: additional configuration examples are provided in this guide's -[Configuration Examples section](./config-examples.html). - -### Add or remove users from the Hub - -Users can be added to and removed from the Hub via either the admin panel or -REST API. - -If a user is **added**, the user will be automatically added to the whitelist -and database. Restarting the Hub will not require manually updating the -whitelist in your config file, as the users will be loaded from the database. - -After starting the Hub once, it is not sufficient to **remove** a user from -the whitelist in your config file. You must also remove the user from the Hub's -database, either by deleting the user from the admin page, or you can clear -the `jupyterhub.sqlite` database and start fresh. - -The default `PAMAuthenticator` is one case of a special kind of authenticator, called a -`LocalAuthenticator`, indicating that it manages users on the local system. When you add a user to -the Hub, a `LocalAuthenticator` checks if that user already exists. Normally, there will be an -error telling you that the user doesn't exist. 
If you set the configuration value - -```python -c.LocalAuthenticator.create_system_users = True -``` - -however, adding a user to the Hub that doesn't already exist on the system will result in the Hub -creating that user via the system `adduser` command line tool. This option is typically used on -hosted deployments of JupyterHub, to avoid the need to manually create all your users before -launching the service. It is not recommended when running JupyterHub in situations where -JupyterHub users maps directly onto UNIX users. - -## Spawners and single-user notebook servers - -Since the single-user server is an instance of `jupyter notebook`, an entire separate -multi-process application, there are many aspect of that server can configure, and a lot of ways -to express that configuration. - -At the JupyterHub level, you can set some values on the Spawner. The simplest of these is -`Spawner.notebook_dir`, which lets you set the root directory for a user's server. This root -notebook directory is the highest level directory users will be able to access in the notebook -dashboard. In this example, the root notebook directory is set to `~/notebooks`, where `~` is -expanded to the user's home directory. - -```python -c.Spawner.notebook_dir = '~/notebooks' -``` - -You can also specify extra command-line arguments to the notebook server with: - -```python -c.Spawner.args = ['--debug', '--profile=PHYS131'] -``` - -This could be used to set the users default page for the single user server: - -```python -c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb'] -``` - -Since the single-user server extends the notebook server application, -it still loads configuration from the `jupyter_notebook_config.py` config file. -Each user may have one of these files in `$HOME/.jupyter/`. -Jupyter also supports loading system-wide config files from `/etc/jupyter/`, -which is the place to put configuration that you want to affect all of your users. 
- -## External services - -JupyterHub has a REST API that can be used by external services like the -[cull_idle_servers](https://github.com/jupyterhub/jupyterhub/blob/master/examples/cull-idle/cull_idle_servers.py) -script which monitors and kills idle single-user servers periodically. In order to run such an -external service, you need to provide it an API token. In the case of `cull_idle_servers`, it is passed -as the environment variable called `JPY_API_TOKEN`. - -Currently there are two ways of registering that token with JupyterHub. The first one is to use -the `jupyterhub` command to generate a token for a specific hub user: - -```bash -jupyterhub token -``` - -As of [version 0.6.0](./changelog.html), the preferred way of doing this is to first generate an API token: - -```bash -openssl rand -hex 32 -``` - - -and then write it to your JupyterHub configuration file (note that the **key** is the token while the **value** is the username): - -```python -c.JupyterHub.api_tokens = {'token' : 'username'} -``` - -Upon restarting JupyterHub, you should see a message like below in the logs: - -``` -Adding API token for -``` - -Now you can run your script, i.e. `cull_idle_servers`, by providing it the API token and it will authenticate through -the REST API to interact with it. - - -[oauth-setup]: https://github.com/jupyterhub/oauthenticator#setup -[oauthenticator]: https://github.com/jupyterhub/oauthenticator -[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module diff --git a/docs/source/getting-started.rst b/docs/source/getting-started.rst new file mode 100644 index 00000000..fa57ecfc --- /dev/null +++ b/docs/source/getting-started.rst @@ -0,0 +1,13 @@ +Getting Started +=============== + +.. 
toctree:: + :maxdepth: 2 + + technical-overview + config-basics + networking-basics + security-basics + authenticators-users-basics + spawners-basics + services-basics diff --git a/docs/source/index.rst b/docs/source/index.rst index d19ce874..ca1e02fc 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -1,67 +1,81 @@ JupyterHub ========== -With JupyterHub you can create a **multi-user Hub** which spawns, manages, -and proxies multiple instances of the single-user -`Jupyter notebook `_ server. -Due to its flexibility and customization options, JupyterHub can be used to -serve notebooks to a class of students, a corporate data science group, or a -scientific research group. - +`JupyterHub`_, a multi-user **Hub**, spawns, manages, and proxies multiple +instances of the single-user `Jupyter notebook`_ server. +JupyterHub can be used to serve notebooks to a class of students, a corporate +data science group, or a scientific research group. .. image:: images/jhub-parts.png :alt: JupyterHub subsystems :width: 40% :align: right - Three subsystems make up JupyterHub: * a multi-user **Hub** (tornado process) * a **configurable http proxy** (node-http-proxy) * multiple **single-user Jupyter notebook servers** (Python/IPython/tornado) -JupyterHub's basic flow of operations includes: +JupyterHub performs the following functions: - The Hub spawns a proxy - The proxy forwards all requests to the Hub by default - The Hub handles user login and spawns single-user servers on demand -- The Hub configures the proxy to forward URL prefixes to the single-user notebook servers +- The Hub configures the proxy to forward URL prefixes to the single-user + notebook servers -For convenient administration of the Hub, its users, and :doc:`services` -(added in version 0.7), JupyterHub also provides a -`REST API `__. +For convenient administration of the Hub, its users, and :doc:`services`, +JupyterHub also provides a +`REST API`_. 
Contents -------- -**User Guide** +**Installation Guide** * :doc:`quickstart` -* :doc:`getting-started` +* :doc:`quickstart-docker` +* :doc:`installation-basics` + +**Getting Started** + +* :doc:`technical-overview` +* :doc:`config-basics` +* :doc:`networking-basics` +* :doc:`security-basics` +* :doc:`authenticators-users-basics` +* :doc:`spawners-basics` +* :doc:`services-basics` + +**Configuration Reference** + * :doc:`howitworks` * :doc:`websecurity` * :doc:`rest` - - -**Configuration Guide** - * :doc:`authenticators` * :doc:`spawners` * :doc:`services` -* :doc:`config-examples` * :doc:`upgrading` -* :doc:`troubleshooting` - +* :doc:`config-examples` **API Reference** * :doc:`api/index` -**About JupyterHub** +**Troubleshooting** + +* :doc:`troubleshooting` + + +**Changelog** * :doc:`changelog` + + +**About JupyterHub** + * :doc:`contributor-list` * :doc:`gallery-jhub-deployments` @@ -87,9 +101,16 @@ Full Table of Contents .. toctree:: :maxdepth: 2 - user-guide + installation-guide + getting-started configuration-guide api/index + troubleshooting changelog contributor-list gallery-jhub-deployments + + +.. _JupyterHub: https://github.com/jupyterhub/jupyterhub +.. _Jupyter notebook: https://jupyter-notebook.readthedocs.io/en/latest/ +.. _REST API: http://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/master/docs/rest-api.yml#!/default diff --git a/docs/source/installation-basics.md b/docs/source/installation-basics.md new file mode 100644 index 00000000..7e456bc8 --- /dev/null +++ b/docs/source/installation-basics.md @@ -0,0 +1,34 @@ +# Installation Basics + +## Platform support + +JupyterHub is supported on Linux/Unix based systems. + +JupyterHub officially **does not** support Windows. You may be able to use +JupyterHub on Windows if you use a Spawner and Authenticator that work on +Windows, but the JupyterHub defaults will not. Bugs reported on Windows will not +be accepted, and the test suite will not run on Windows. 
Small patches that fix +minor Windows compatibility issues (such as basic installation) **may** be accepted, +however. For Windows-based systems, we would recommend running JupyterHub in a +docker container or Linux VM. + +[Additional Reference:](http://www.tornadoweb.org/en/stable/#installation) Tornado's documentation on Windows platform support + +## Planning your installation + +Prior to beginning installation, it's helpful to consider some of the following: +- deployment system (bare metal, Docker) +- Authentication (PAM, OAuth, etc.) +- Spawner of singleuser notebook servers (Docker, Batch, etc.) +- Services (nbgrader, etc.) +- JupyterHub database (default SQLite; traditional RDBMS such as PostgreSQL,) + MySQL, or other databases supported by [SQLAlchemy](http://www.sqlalchemy.org)) + +## Folders and File Locations + +It is recommended to put all of the files used by JupyterHub into standard +UNIX filesystem locations. + +* `/srv/jupyterhub` for all security and runtime files +* `/etc/jupyterhub` for all configuration files +* `/var/log` for log files diff --git a/docs/source/installation-guide.rst b/docs/source/installation-guide.rst new file mode 100644 index 00000000..ccda8667 --- /dev/null +++ b/docs/source/installation-guide.rst @@ -0,0 +1,9 @@ +Installation Guide +================== + +.. toctree:: + :maxdepth: 3 + + quickstart + quickstart-docker + installation-basics diff --git a/docs/source/jupyterhub-deployment-aws.md b/docs/source/jupyterhub-deployment-aws.md deleted file mode 100644 index cd60a50a..00000000 --- a/docs/source/jupyterhub-deployment-aws.md +++ /dev/null @@ -1,45 +0,0 @@ -# JupyterHub Deployment on AWS - -Documentation on deploying JupyterHub on an AWS EC2 Instance using NGINX Plus. - ->CAUTION: Document is a work-in-progress. Information found on this page is partially incomplete and may require additional research. 
- -## Setting Up Amazon EC2 Instance - -### AMI -Choose one of the following Amazon Machine Images that are compatible with NGINX Plus: - -* NGINX Plus – Amazon Linux AMI (HVM) -* NGINX Plus – Ubuntu AMI (HVM) -* NGINX Plus – Amazon Linux AMI (PV) -* NGINX Plus – Ubuntu AMI (PV) - -Refer to the [NGINX AMI Installation Guide](https://www.nginx.com/resources/admin-guide/setting-nginx-plus-environment-amazon-ec2/) for more information. - -### Instance Type & Storage -Instance type selection depends heavily on memory usage. Amazon Compute Optimized instances are recommended. - -As a rule of thumb consider **100-200 MB/user** plus **5x-10x the amount of data you are loading from disk**, depending on the kind of analysis. After selecting your instance, you can add more memory and select memory type (GP2/IO1) in the 'Add Storage' page. - -(Pictured below: c4.2xlarge) - -![Instance Type](images/instance.png) - -### Configure Security Group -The standard HTTPS and HTTP ports (80, 443) need to be opened to allow JupyterHub to be proxied by NGINX. - -Additionally, in order to enable Docker containers to connect to JupyterHub port 8081 will need to be opened. Open a new 'Custom TCP Rule' and set the Source in CIDR Block Notation to: -> /24 - -Below is a reference image for the security group set-up. Depending on specific use-cases, port rules may differ and likely should not be open to 'anywhere'. Your network IP will also differ. - -![Security Group](images/security.png) - -Refer to the [Amazon EC2 Security Groups for Linux Instances Page](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html) for more information. 
- ----- - -## To-Do Sections -- [x] Setting Up Amazon EC2 Instance -- [ ] Setting Up JupyterHub & Web Server on EC2 VM -- [ ] Setting Up Docker Spawner diff --git a/docs/source/networking-basics.md b/docs/source/networking-basics.md new file mode 100644 index 00000000..728b892e --- /dev/null +++ b/docs/source/networking-basics.md @@ -0,0 +1,54 @@ +# Networking basics + +## Configuring the Proxy's IP address and port +The Proxy's main IP address setting determines where JupyterHub is available to users. +By default, JupyterHub is configured to be available on all network interfaces +(`''`) on port 8000. **Note**: Use of `'*'` is discouraged for IP configuration; +instead, use of `'0.0.0.0'` is preferred. + +Changing the IP address and port can be done with the following command line +arguments: + +```bash +jupyterhub --ip=192.168.1.2 --port=443 +``` + +Or by placing the following lines in a configuration file: + +```python +c.JupyterHub.ip = '192.168.1.2' +c.JupyterHub.port = 443 +``` + +Port 443 is used as an example since 443 is the default port for SSL/HTTPS. + +Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub. +However, more customized scenarios may need additional networking details to +be configured. + + +## Configuring the Proxy's REST API communication IP address and port (optional) +The Hub service talks to the proxy via a REST API on a secondary port, +whose network interface and port can be configured separately. +By default, this REST API listens on port 8081 of localhost only. + +If running the Proxy separate from the Hub, +configure the REST API communication IP address and port with: + +```python +# ideally a private network address +c.JupyterHub.proxy_api_ip = '10.0.1.4' +c.JupyterHub.proxy_api_port = 5432 +``` + +## Configuring the Hub if Spawners or Proxy are remote or isolated in containers +The Hub service also listens only on localhost (port 8080) by default. 
+The Hub needs to be accessible from both the proxy and all Spawners. +When spawning local servers, an IP address setting of localhost is fine. +If *either* the Proxy *or* (more likely) the Spawners will be remote or +isolated in containers, the Hub must listen on an IP that is accessible. + +```python +c.JupyterHub.hub_ip = '10.0.1.4' +c.JupyterHub.hub_port = 54321 +``` diff --git a/docs/source/quickstart-docker.rst b/docs/source/quickstart-docker.rst new file mode 100644 index 00000000..08581b68 --- /dev/null +++ b/docs/source/quickstart-docker.rst @@ -0,0 +1,49 @@ +Using Docker +============ + +.. important:: + + We highly recommend following the `Zero to JupyterHub`_ tutorial for + installing JupyterHub. + +Alternate installation using Docker +----------------------------------- + +A ready to go `docker image `_ +gives a straightforward deployment of JupyterHub. + +.. note:: + + This ``jupyterhub/jupyterhub`` docker image is only an image for running + the Hub service itself. It does not provide the other Jupyter components, + such as Notebook installation, which are needed by the single-user servers. + To run the single-user servers, which may be on the same system as the Hub or + not, Jupyter Notebook version 4 or greater must be installed. + +Starting JupyterHub with docker +------------------------------- + +The JupyterHub docker image can be started with the following command:: + + docker run -d --name jupyterhub jupyterhub/jupyterhub jupyterhub + +This command will create a container named ``jupyterhub`` that you can +**stop and resume** with ``docker stop/start``. + +The Hub service will be listening on all interfaces at port 8000, which makes +this a good choice for **testing JupyterHub on your desktop or laptop**. + +If you want to run docker on a computer that has a public IP then you should +(as in MUST) **secure it with ssl** by adding ssl options to your docker +configuration or using a ssl enabled proxy.
+ +`Mounting volumes `_ +will allow you to store data outside the docker image (host system) so it will +be persistent, even when you start a new image. + +The command ``docker exec -it jupyterhub bash`` will spawn a root shell in your +docker container. You can use the root shell to **create system users in the container**. +These accounts will be used for authentication in JupyterHub's default +configuration. + +.. _Zero to JupyterHub: https://zero-to-jupyterhub.readthedocs.io/en/latest/ diff --git a/docs/source/quickstart.md b/docs/source/quickstart.md index 823c7e39..0be930a0 100644 --- a/docs/source/quickstart.md +++ b/docs/source/quickstart.md @@ -1,73 +1,60 @@ -# Quickstart - Installation +# Quickstart ## Prerequisites -**Before installing JupyterHub**, you will need: +Before installing JupyterHub, you will need: -- [Python](https://www.python.org/downloads/) 3.3 or greater - - An understanding of using [`pip`](https://pip.pypa.io/en/stable/) or +- a Linux/Unix based system +- [Python](https://www.python.org/downloads/) 3.4 or greater. An understanding + of using [`pip`](https://pip.pypa.io/en/stable/) or [`conda`](http://conda.pydata.org/docs/get-started.html) for installing Python packages is helpful. - -- [nodejs/npm](https://www.npmjs.com/) - - [Install nodejs/npm](https://docs.npmjs.com/getting-started/installing-node), +- [nodejs/npm](https://www.npmjs.com/). [Install nodejs/npm](https://docs.npmjs.com/getting-started/installing-node), using your operating system's package manager. For example, install on Linux - (Debian/Ubuntu) using: + Debian/Ubuntu using: ```bash sudo apt-get install npm nodejs-legacy ``` - - (The `nodejs-legacy` package installs the `node` executable and is currently - required for npm to work on Debian/Ubuntu.) + The `nodejs-legacy` package installs the `node` executable and is currently + required for `npm` to work on Debian/Ubuntu. 
- TLS certificate and key for HTTPS communication - - Domain name -**Before running the single-user notebook servers** (which may be on the same -system as the Hub or not): +Before running the single-user notebook servers (which may be on the same +system as the Hub or not), you will need: - [Jupyter Notebook](https://jupyter.readthedocs.io/en/latest/install.html) version 4 or greater ## Installation -JupyterHub can be installed with `pip` or `conda` and the proxy with `npm`: +JupyterHub can be installed with `pip` (and the proxy with `npm`) or `conda`: **pip, npm:** + ```bash python3 -m pip install jupyterhub npm install -g configurable-http-proxy +python3 -m pip install notebook # needed if running the notebook servers locally ``` **conda** (one command installs jupyterhub and proxy): + ```bash -conda install -c conda-forge jupyterhub +conda install -c conda-forge jupyterhub # installs jupyterhub and proxy +conda install notebook # needed if running the notebook servers locally ``` -To test your installation: +Test your installation. If installed, these commands should return the packages' +help contents: ```bash jupyterhub -h configurable-http-proxy -h ``` -If you plan to run notebook servers locally, you will need also to install -Jupyter notebook: - -**pip:** -```bash -python3 -m pip install notebook -``` - -**conda:** -```bash -conda install notebook -``` - ## Start the Hub server To start the Hub server, run the command: @@ -79,82 +66,13 @@ jupyterhub Visit `https://localhost:8000` in your browser, and sign in with your unix credentials. 
-To allow multiple users to sign into the Hub server, you must start `jupyterhub` as a *privileged user*, such as root: +To **allow multiple users to sign in** to the Hub server, you must start +`jupyterhub` as a *privileged user*, such as root: ```bash sudo jupyterhub ``` The [wiki](https://github.com/jupyterhub/jupyterhub/wiki/Using-sudo-to-run-JupyterHub-without-root-privileges) -describes how to run the server as a *less privileged user*, which requires +describes how to run the server as a *less privileged user*. This requires additional configuration of the system. - ----- - -## Basic Configuration - -The [getting started document](docs/source/getting-started.md) contains -detailed information abouts configuring a JupyterHub deployment. - -The JupyterHub **tutorial** provides a video and documentation that explains -and illustrates the fundamental steps for installation and configuration. -[Repo](https://github.com/jupyterhub/jupyterhub-tutorial) -| [Tutorial documentation](http://jupyterhub-tutorial.readthedocs.io/en/latest/) - -#### Generate a default configuration file - -Generate a default config file: - - jupyterhub --generate-config - -#### Customize the configuration, authentication, and process spawning - -Spawn the server on ``10.0.1.2:443`` with **https**: - - jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert - -The authentication and process spawning mechanisms can be replaced, -which should allow plugging into a variety of authentication or process -control environments. 
Some examples, meant as illustration and testing of this -concept, are: - -- Using GitHub OAuth instead of PAM with [OAuthenticator](https://github.com/jupyterhub/oauthenticator) -- Spawning single-user servers with Docker, using the [DockerSpawner](https://github.com/jupyterhub/dockerspawner) - ----- - -## Alternate Installation using Docker - -A ready to go [docker image for JupyterHub](https://hub.docker.com/r/jupyterhub/jupyterhub/) -gives a straightforward deployment of JupyterHub. - -*Note: This `jupyterhub/jupyterhub` docker image is only an image for running -the Hub service itself. It does not provide the other Jupyter components, such -as Notebook installation, which are needed by the single-user servers. -To run the single-user servers, which may be on the same system as the Hub or -not, Jupyter Notebook version 4 or greater must be installed.* - -#### Starting JupyterHub with docker - -The JupyterHub docker image can be started with the following command: - - docker run -d --name jupyterhub jupyterhub/jupyterhub jupyterhub - -This command will create a container named `jupyterhub` that you can -**stop and resume** with `docker stop/start`. - -The Hub service will be listening on all interfaces at port 8000, which makes -this a good choice for **testing JupyterHub on your desktop or laptop**. - -If you want to run docker on a computer that has a public IP then you should -(as in MUST) **secure it with ssl** by adding ssl options to your docker -configuration or using a ssl enabled proxy. - -[Mounting volumes](https://docs.docker.com/engine/userguide/containers/dockervolumes/) -will allow you to **store data outside the docker image (host system) so it will be persistent**, -even when you start a new image. - -The command `docker exec -it jupyterhub bash` will spawn a root shell in your -docker container. You can **use the root shell to create system users in the container**. 
-These accounts will be used for authentication in JupyterHub's default -configuration. diff --git a/docs/source/security-basics.md b/docs/source/security-basics.md new file mode 100644 index 00000000..37876fde --- /dev/null +++ b/docs/source/security-basics.md @@ -0,0 +1,146 @@ +# Security + +**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.** + +--- + +**Deprecation note:** Removed `--no-ssl` in version 0.7. + +JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to +allow running without SSL using the command `jupyterhub --no-ssl`. The +`--no-ssl` command line option is not needed anymore in version 0.7. + +--- + +Security is the most important aspect of configuring Jupyter. There are four main aspects of the +security configuration: + +1. SSL encryption (to enable HTTPS) +2. Cookie secret (a key for encrypting browser cookies) +3. Proxy authentication token (used for the Hub and other services to authenticate to the Proxy) +4. Periodic security audits + +*Note* that the **Hub** hashes all secrets (e.g., auth tokens) before storing them in its +database. A loss of control over read-access to the database should have no security impact +on your deployment. + +## SSL encryption + +Since JupyterHub includes authentication and allows arbitrary code execution, you should not run +it without SSL (HTTPS). This will require you to obtain an official, trusted SSL certificate or +create a self-signed certificate. Once you have obtained and installed a key and certificate you +need to specify their locations in the configuration file as follows: + +```python +c.JupyterHub.ssl_key = '/path/to/my.key' +c.JupyterHub.ssl_cert = '/path/to/my.cert' +``` + +It is also possible to use letsencrypt (https://letsencrypt.org/) to obtain +a free, trusted SSL certificate. 
If you run letsencrypt using the default +options, the needed configuration is (replace `mydomain.tld` by your fully +qualified domain name): + +```python +c.JupyterHub.ssl_key = '/etc/letsencrypt/live/{mydomain.tld}/privkey.pem' +c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/{mydomain.tld}/fullchain.pem' +``` + +If the fully qualified domain name (FQDN) is `example.com`, the following +would be the needed configuration: + +```python +c.JupyterHub.ssl_key = '/etc/letsencrypt/live/example.com/privkey.pem' +c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/example.com/fullchain.pem' +``` + +Some cert files also contain the key, in which case only the cert is needed. It is important that +these files be put in a secure location on your server, where they are not readable by regular +users. + +Note on **chain certificates**: If you are using a chain certificate, see also +[chained certificate for SSL](troubleshooting.md#chained-certificates-for-ssl) in the JupyterHub troubleshooting FAQ. + +Note: In certain cases, e.g. **behind SSL termination in nginx**, running the Hub +without SSL may be desired. + +## Cookie secret + +The cookie secret is an encryption key, used to encrypt the browser cookies used for +authentication. If this value changes for the Hub, all single-user servers must also be restarted. +Normally, this value is stored in a file, the location of which can be specified in a config file +as follows: + +```python +c.JupyterHub.cookie_secret_file = '/srv/jupyterhub/cookie_secret' +``` + +The content of this file should be 32 random bytes, encoded as hex. +An example would be to generate this file with: + +```bash +openssl rand -hex 32 > /srv/jupyterhub/cookie_secret +``` + +In most deployments of JupyterHub, you should point this to a secure location on the file +system, such as `/srv/jupyterhub/cookie_secret`. If the cookie secret file doesn't exist when +the Hub starts, a new cookie secret is generated and stored in the file.
The +file must not be readable by group or other or the server won't start. +The recommended permissions for the cookie secret file are 600 (owner-only rw). + + +If you would like to avoid the need for files, the value can be loaded in the Hub process from +the `JPY_COOKIE_SECRET` environment variable, which is a hex-encoded string. You +can set it this way: + +```bash +export JPY_COOKIE_SECRET=`openssl rand -hex 32` +``` + +For security reasons, this environment variable should only be visible to the Hub. +If you set it dynamically as above, all users will be logged out each time the +Hub starts. + +You can also set the cookie secret in the configuration file itself,`jupyterhub_config.py`, +as a binary string: + +```python +c.JupyterHub.cookie_secret = bytes.fromhex('64 CHAR HEX STRING') +``` + +## Proxy authentication token + +The Hub authenticates its requests to the Proxy using a secret token that +the Hub and Proxy agree upon. The value of this string should be a random +string (for example, generated by `openssl rand -hex 32`). You can pass +this value to the Hub and Proxy using either the `CONFIGPROXY_AUTH_TOKEN` +environment variable: + +```bash +export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32` +``` + +This environment variable needs to be visible to the Hub and Proxy. + +Or you can set the value in the configuration file, `jupyterhub_config.py`: + +```python +c.JupyterHub.proxy_auth_token = '0bc02bede919e99a26de1e2a7a5aadfaf6228de836ec39a05a6c6942831d8fe5' +``` + +If you don't set the Proxy authentication token, the Hub will generate a random key itself, which +means that any time you restart the Hub you **must also restart the Proxy**. If the proxy is a +subprocess of the Hub, this should happen automatically (this is the default configuration). + +Another time you must set the Proxy authentication token yourself is if +you want other services, such as [nbgrader](https://github.com/jupyter/nbgrader) +to also be able to connect to the Proxy. 
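
Both the cookie secret and the Proxy authentication token above are 32 random bytes encoded as hex, and the cookie secret file must not be readable by group or other. The following is a minimal sketch of creating such a file from Python instead of `openssl`. It is not JupyterHub's own code: the helper names `write_secret_file` and `is_owner_only` are hypothetical, and the demo writes to a temporary directory rather than `/srv/jupyterhub`.

```python
import os
import secrets
import stat
import tempfile


def write_secret_file(path: str, nbytes: int = 32) -> None:
    """Write nbytes of randomness as hex to a new file with mode 600.

    Hypothetical helper, comparable in effect to
    `openssl rand -hex 32 > path` followed by `chmod 600 path`.
    """
    # O_EXCL refuses to overwrite an existing secret; 0o600 is owner-only rw.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secrets.token_hex(nbytes))


def is_owner_only(path: str) -> bool:
    """Return True if neither group nor other has any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Demo in a throwaway directory (an assumption made for illustration only).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "cookie_secret")
write_secret_file(demo_path)
```

A check such as `is_owner_only(path)` can be run before starting the Hub to catch a file whose permissions have drifted.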
+ +## Security audits + +We recommend that you do periodic reviews of your deployment's security. It's +good practice to keep JupyterHub, configurable-http-proxy, and nodejs +versions up to date. + +A handy website for testing your deployment is +[Qualys' SSL analyzer tool](https://www.ssllabs.com/ssltest/analyze.html). diff --git a/docs/source/services-basics.md b/docs/source/services-basics.md new file mode 100644 index 00000000..78420d3e --- /dev/null +++ b/docs/source/services-basics.md @@ -0,0 +1,36 @@ +## External services + +JupyterHub has a REST API that can be used by external services like the +[cull_idle_servers](https://github.com/jupyterhub/jupyterhub/blob/master/examples/cull-idle/cull_idle_servers.py) +script which monitors and kills idle single-user servers periodically. In order to run such an +external service, you need to provide it an API token. In the case of `cull_idle_servers`, it is passed +as the environment variable called `JPY_API_TOKEN`. + +Currently there are two ways of registering that token with JupyterHub. The first one is to use +the `jupyterhub` command to generate a token for a specific hub user: + +```bash +jupyterhub token +``` + +As of [version 0.6.0](./changelog.html), the preferred way of doing this is to first generate an API token: + +```bash +openssl rand -hex 32 +``` + + +and then write it to your JupyterHub configuration file (note that the **key** is the token while the **value** is the username): + +```python +c.JupyterHub.api_tokens = {'token' : 'username'} +``` + +Upon restarting JupyterHub, you should see a message like the one below in the logs: + +``` +Adding API token for +``` + +Now you can run your script, i.e. `cull_idle_servers`, by providing it the API token and it will authenticate through +the REST API to interact with it.
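
The token-then-config steps described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not part of JupyterHub: `make_api_token` and `api_tokens_entry` are hypothetical helpers, and `cull-service-user` is a made-up username. The only detail taken from the text is the shape of the mapping, where the key is the token and the value is the username.

```python
import secrets


def make_api_token(nbytes: int = 32) -> str:
    """Return a random hex token, comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(nbytes)


def api_tokens_entry(token: str, username: str) -> dict:
    """Build a {token: username} mapping in the shape used by api_tokens."""
    return {token: username}


# Hypothetical username, for illustration only.
token = make_api_token()
entry = api_tokens_entry(token, "cull-service-user")

# In jupyterhub_config.py the mapping would then be assigned as:
#   c.JupyterHub.api_tokens = entry
```

Generating the token once and storing it, rather than regenerating it on every start, keeps the value handed to the external service valid across Hub restarts.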
diff --git a/docs/source/spawners-basics.md b/docs/source/spawners-basics.md new file mode 100644 index 00000000..c30d89f6 --- /dev/null +++ b/docs/source/spawners-basics.md @@ -0,0 +1,33 @@ +# Spawners and single-user notebook servers + +Since the single-user server is an instance of `jupyter notebook`, an entire separate +multi-process application, there are many aspects of that server that you can configure, and many ways +to express that configuration. + +At the JupyterHub level, you can set some values on the Spawner. The simplest of these is +`Spawner.notebook_dir`, which lets you set the root directory for a user's server. This root +notebook directory is the highest level directory users will be able to access in the notebook +dashboard. In this example, the root notebook directory is set to `~/notebooks`, where `~` is +expanded to the user's home directory. + +```python +c.Spawner.notebook_dir = '~/notebooks' +``` + +You can also specify extra command-line arguments to the notebook server with: + +```python +c.Spawner.args = ['--debug', '--profile=PHYS131'] +``` + +This could be used to set the user's default page for the single-user server: + +```python +c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb'] +``` + +Since the single-user server extends the notebook server application, +it still loads configuration from the `jupyter_notebook_config.py` config file. +Each user may have one of these files in `$HOME/.jupyter/`. +Jupyter also supports loading system-wide config files from `/etc/jupyter/`, +which is the place to put configuration that you want to affect all of your users. diff --git a/docs/source/technical-overview.md b/docs/source/technical-overview.md new file mode 100644 index 00000000..c9813d6e --- /dev/null +++ b/docs/source/technical-overview.md @@ -0,0 +1,104 @@ +## Technical Overview + +JupyterHub is a set of processes that together provide a single user Jupyter +Notebook server for each person in a group.
+ +### Three subsystems +Three major subsystems are run by the `jupyterhub` command line program: + +- **Single-User Notebook Server**: a dedicated, single-user, Jupyter Notebook server is + started for each user on the system when the user logs in. The object that + starts these servers is called a **Spawner**. +- **Proxy**: the public facing part of JupyterHub that uses a dynamic proxy + to route HTTP requests to the Hub and Single User Notebook Servers. +- **Hub**: manages user accounts, authentication, and coordinates Single User + Notebook Servers using a Spawner. + +![JupyterHub subsystems](images/jhub-parts.png) + +### Deployment server + +To use JupyterHub, you need a Unix server (typically Linux) running somewhere +that is accessible to your team on the network. The JupyterHub server can be +on an internal network at your organization, or it can run on the public +internet (in which case, take care with the Hub's +[security](#security)). + +### Basic operation +Users access JupyterHub through a web browser, by going to the IP address or +the domain name of the server. + +Basic principles of operation: + +* Hub spawns proxy +* Proxy forwards all requests to hub by default +* Hub handles login, and spawns single-user servers on demand +* Hub configures proxy to forward url prefixes to single-user servers + +Different **[authenticators](authenticators.html)** control access +to JupyterHub. The default one (PAM) uses the user accounts on the server where +JupyterHub is running. If you use this, you will need to create a user account +on the system for each user on your team. Using other authenticators, you can +allow users to sign in with e.g. a GitHub account, or with any single-sign-on +system your organization has. + +Next, **[spawners](spawners.html)** control how JupyterHub starts +the individual notebook server for each user. The default spawner will +start a notebook server on the same machine running under their system username.
+The other main option is to start each server in a separate container, often +using Docker. + +### Default behavior + +**IMPORTANT: You should not run JupyterHub without SSL encryption on a public network.** + +See [Security documentation](#security) for how to configure JupyterHub to use SSL, +or put it behind SSL termination in another proxy server, such as nginx. + +--- + +**Deprecation note:** Removed `--no-ssl` in version 0.7. + +JupyterHub versions 0.5 and 0.6 require extra confirmation via `--no-ssl` to +allow running without SSL using the command `jupyterhub --no-ssl`. The +`--no-ssl` command line option is not needed anymore in version 0.7. + +--- + +To start JupyterHub in its default configuration, type the following at the command line: + +```bash + sudo jupyterhub +``` + +The default Authenticator that ships with JupyterHub authenticates users +with their system name and password (via [PAM][]). +Any user on the system with a password will be allowed to start a single-user notebook server. + +The default Spawner starts servers locally as each user, one dedicated server per user. +These servers listen on localhost, and start in the given user's home directory. + +By default, the **Proxy** listens on all public interfaces on port 8000. +Thus you can reach JupyterHub through either: + +- `http://localhost:8000` +- or any other public IP or domain pointing to your system. + +In their default configuration, the other services, the **Hub** and **Single-User Servers**, +all communicate with each other on localhost only. + +By default, starting JupyterHub will write two files to disk in the current working directory: + +- `jupyterhub.sqlite` is the sqlite database containing all of the state of the **Hub**. + This file allows the **Hub** to remember what users are running and where, + as well as other information enabling you to restart parts of JupyterHub separately. 
It is + important to note that this database contains *no* sensitive information other than **Hub** + usernames. +- `jupyterhub_cookie_secret` is the encryption key used for securing cookies. + This file needs to persist so that restarting the Hub server does not invalidate cookies. + Conversely, deleting this file and restarting the server effectively invalidates all login cookies. + The cookie secret file is discussed in the [Cookie Secret documentation](#cookie-secret). + +The location of these files can be specified via configuration. + +[PAM]: https://en.wikipedia.org/wiki/Pluggable_authentication_module diff --git a/docs/source/user-guide.rst b/docs/source/user-guide.rst deleted file mode 100644 index cbe1f90d..00000000 --- a/docs/source/user-guide.rst +++ /dev/null @@ -1,11 +0,0 @@ -JupyterHub User Guide -===================== - -.. toctree:: - :maxdepth: 3 - - quickstart - getting-started - howitworks - websecurity - rest diff --git a/examples/bootstrap-script/README.md b/examples/bootstrap-script/README.md new file mode 100644 index 00000000..c35020ab --- /dev/null +++ b/examples/bootstrap-script/README.md @@ -0,0 +1,130 @@ +# Bootstrapping your users + +Before spawning a notebook for the user, it could be useful to +do some preparation work in a bootstrapping process. + +Common use cases are: + +*Providing writeable storage for LDAP users* + +Your JupyterHub is configured to use the LDAPAuthenticator and DockerSpawner. + +* The user has no file directory on the host since you are using LDAP. +* When a user has no directory and DockerSpawner wants to mount a volume, +the spawner will use docker to create a directory. +Since the docker daemon is running as root, the generated directory for the volume +mount will not be writeable by the `jovyan` user inside of the container. +For the directory to be useful to the user, the permissions on the directory +need to be modified for the user to have write access.
+ +*Prepopulating Content* + +Another use would be to copy initial content, such as tutorial files or reference + material, into the user's space when a notebook server is newly spawned. + +You can define your own bootstrap process by implementing a `pre_spawn_hook` on any spawner. +The Spawner itself is passed as parameter to your hook and you can easily get the contextual information out of the spawning process. + +If you implement a hook, make sure that it is *idempotent*. It will be executed every time +a notebook server is spawned to the user. That means you should somehow +ensure that things which should run only once are not running again and again. +For example, before you create a directory, check if it exists. + +Bootstrapping examples: + +### Example #1 - Create a user directory + +Create a directory for the user, if none exists + +```python + +# in jupyterhub_config.py +import os +def create_dir_hook(spawner): + username = spawner.user.name # get the username + volume_path = os.path.join('/volumes/jupyterhub', username) + if not os.path.exists(volume_path): + # create a directory with umask 0755 + # hub and container user must have the same UID to be writeable + # still readable by other users on the system + os.mkdir(volume_path, 0o755) + # now do whatever you think your user needs + # ... + pass + +# attach the hook function to the spawner +c.Spawner.pre_spawn_hook = create_dir_hook +``` + +### Example #2 - Run a shell script + +You can specify a plain ole' shell script (or any other executable) to be run +by the bootstrap process. 
+ +For example, you can execute a shell script and pass the name of the user +as its first parameter: + +```python + +# in jupyterhub_config.py +from subprocess import check_call +import os +def my_script_hook(spawner): + username = spawner.user.name # get the username + script = os.path.join(os.path.dirname(__file__), 'bootstrap.sh') + check_call([script, username]) + +# attach the hook function to the spawner +c.Spawner.pre_spawn_hook = my_script_hook + +``` + +Here's an example of what you could do in your shell script. See also +`/examples/bootstrap-script/` + +```bash +#!/bin/bash + +# Bootstrap example script +# Copyright (c) Jupyter Development Team. +# Distributed under the terms of the Modified BSD License. + +# - The first parameter for the Bootstrap Script is the USER. +USER=$1 +if [ "$USER" == "" ]; then + exit 1 +fi +# ---------------------------------------------------------------------------- + + +# This example script will do the following: +# - create one directory for the user $USER in a BASE_DIRECTORY (see below) +# - create a "tutorials" directory within and download and unzip +# the PythonDataScienceHandbook from GitHub + +# Start the Bootstrap Process +echo "bootstrap process running for user $USER ..." + +# Base Directory: All Directories for the user will be below this point +BASE_DIRECTORY=/volumes/jupyterhub/ + +# User Directory: That's the private directory for the user to be created, if none exists +USER_DIRECTORY=$BASE_DIRECTORY/$USER + +if [ -d "$USER_DIRECTORY" ]; then + echo "...directory for user already exists. skipped" + exit 0 # all good. nothing to do. +else + echo "...creating a directory for the user: $USER_DIRECTORY" + mkdir $USER_DIRECTORY + + echo "...initial content loading for user ..." 
+ mkdir $USER_DIRECTORY/tutorials + cd $USER_DIRECTORY/tutorials + wget https://github.com/jakevdp/PythonDataScienceHandbook/archive/master.zip + unzip -o master.zip + rm master.zip +fi + +exit 0 +``` \ No newline at end of file diff --git a/examples/bootstrap-script/bootstrap.sh b/examples/bootstrap-script/bootstrap.sh new file mode 100755 index 00000000..405c8694 --- /dev/null +++ b/examples/bootstrap-script/bootstrap.sh @@ -0,0 +1,48 @@ +#!/bin/bash + +# Bootstrap example script +# Copyright (c) Jupyter Development Team. +# Distributed under the terms of the Modified BSD License. + +# - The first parameter for the Bootstrap Script is the USER. +USER=$1 +if [ "$USER" == "" ]; then + exit 1 +fi +# ---------------------------------------------------------------------------- + + +# This example script will do the following: +# - create one directory for the user $USER in a BASE_DIRECTORY (see below) +# - create a "tutorials" directory within and download and unzip the PythonDataScienceHandbook from GitHub + +# Start the Bootstrap Process +echo "bootstrap process running for user $USER ..." + +# Base Directory: All Directories for the user will be below this point +BASE_DIRECTORY=/volumes/jupyterhub + +# User Directory: That's the private directory for the user to be created, if none exists +USER_DIRECTORY=$BASE_DIRECTORY/$USER + +if [ -d "$USER_DIRECTORY" ]; then + echo "...directory for user already exists. skipped" + exit 0 # all good. nothing to do. +else + echo "...creating a directory for the user: $USER_DIRECTORY" + mkdir $USER_DIRECTORY + + # mkdir did not succeed? + if [ $? -ne 0 ] ; then + exit 1 + fi + + echo "...initial content loading for user ..." 
+ mkdir $USER_DIRECTORY/tutorials + cd $USER_DIRECTORY/tutorials + wget https://github.com/jakevdp/PythonDataScienceHandbook/archive/master.zip + unzip -o master.zip + rm master.zip +fi + +exit 0 diff --git a/examples/bootstrap-script/jupyterhub_config.py b/examples/bootstrap-script/jupyterhub_config.py new file mode 100644 index 00000000..2bbbdc6d --- /dev/null +++ b/examples/bootstrap-script/jupyterhub_config.py @@ -0,0 +1,26 @@ +# Example for a Spawner.pre_spawn_hook +# create a directory for the user before the spawner starts + +import os +def create_dir_hook(spawner): + username = spawner.user.name # get the username + volume_path = os.path.join('/volumes/jupyterhub', username) + if not os.path.exists(volume_path): + os.mkdir(volume_path, 0o755) + # now do whatever you think your user needs + # ... + +# attach the hook function to the spawner +c.Spawner.pre_spawn_hook = create_dir_hook + +# Use the DockerSpawner to serve your users' notebooks +c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner' +from jupyter_client.localinterfaces import public_ips +c.JupyterHub.hub_ip = public_ips()[0] +c.DockerSpawner.hub_ip_connect = public_ips()[0] +c.DockerSpawner.container_ip = "0.0.0.0" + +# You can now mount the volume to the docker container as we've +# made sure the directory exists +c.DockerSpawner.volumes = { '/volumes/jupyterhub/{username}/': '/home/jovyan/work' } + diff --git a/jupyterhub/_version.py b/jupyterhub/_version.py index f322236f..6c40d402 100644 --- a/jupyterhub/_version.py +++ b/jupyterhub/_version.py @@ -27,7 +27,7 @@ def _check_version(hub_version, singleuser_version, log): if hub_version != singleuser_version: from distutils.version import LooseVersion as V hub_major_minor = V(hub_version).version[:2] - singleuser_major_minor = V(__version__).version[:2] + singleuser_major_minor = V(singleuser_version).version[:2] if singleuser_major_minor == hub_major_minor: # patch-level mismatch or lower, log difference at debug-level # because this 
should be fine @@ -36,5 +36,7 @@ def _check_version(hub_version, singleuser_version, log): # log warning-level for more significant mismatch, such as 0.8 vs 0.9, etc. log_method = log.warning log_method("jupyterhub version %s != jupyterhub-singleuser version %s", - hub_version, __version__, + hub_version, singleuser_version, ) + else: + log.debug("jupyterhub and jupyterhub-singleuser both on version %s" % hub_version) diff --git a/jupyterhub/apihandlers/base.py b/jupyterhub/apihandlers/base.py index 769b9259..5a4d34a9 100644 --- a/jupyterhub/apihandlers/base.py +++ b/jupyterhub/apihandlers/base.py @@ -13,6 +13,14 @@ from ..utils import url_path_join class APIHandler(BaseHandler): + @property + def content_security_policy(self): + return '; '.join([super().content_security_policy, "default-src 'none'"]) + + def set_default_headers(self): + self.set_header('Content-Type', 'application/json') + super().set_default_headers() + def check_referer(self): """Check Origin for cross-site API requests. @@ -80,7 +88,6 @@ class APIHandler(BaseHandler): reason = getattr(exception, 'reason', '') if reason: status_message = reason - self.set_header('Content-Type', 'application/json') self.write(json.dumps({ 'status': status_code, 'message': message or status_message, diff --git a/jupyterhub/app.py b/jupyterhub/app.py index dbc82415..a3b351a9 100644 --- a/jupyterhub/app.py +++ b/jupyterhub/app.py @@ -1450,7 +1450,6 @@ class JupyterHub(Application): self.exit(1) else: self.log.info("Not starting proxy") - yield self.proxy.add_hub_route(self.hub) # start the service(s) for service_name, service in self._service_map.items(): diff --git a/jupyterhub/auth.py b/jupyterhub/auth.py index 5b1568eb..95eb1018 100644 --- a/jupyterhub/auth.py +++ b/jupyterhub/auth.py @@ -3,9 +3,7 @@ # Copyright (c) IPython Development Team. # Distributed under the terms of the Modified BSD License. 
-from grp import getgrnam import pipes -import pwd import re from shutil import which import sys @@ -26,6 +24,15 @@ from .utils import url_path_join from .traitlets import Command + +def getgrnam(name): + """Wrapper function to protect against `grp` not being available + on Windows + """ + import grp + return grp.getgrnam(name) + + class Authenticator(LoggingConfigurable): """Base class for implementing an authentication provider for JupyterHub""" @@ -461,6 +468,7 @@ class LocalAuthenticator(Authenticator): @staticmethod def system_user_exists(user): """Check if the user exists on the system""" + import pwd try: pwd.getpwnam(user.name) except KeyError: diff --git a/jupyterhub/handlers/base.py b/jupyterhub/handlers/base.py index 6eacb6c5..c27ae4a4 100644 --- a/jupyterhub/handlers/base.py +++ b/jupyterhub/handlers/base.py @@ -130,11 +130,13 @@ class BaseHandler(RequestHandler): """ headers = self.settings.get('headers', {}) headers.setdefault("X-JupyterHub-Version", __version__) - headers.setdefault("Content-Security-Policy", self.content_security_policy) for header_name, header_content in headers.items(): self.set_header(header_name, header_content) + if 'Content-Security-Policy' not in headers: + self.set_header('Content-Security-Policy', self.content_security_policy) + #--------------------------------------------------------------- # Login and cookie-related #--------------------------------------------------------------- @@ -326,6 +328,7 @@ class BaseHandler(RequestHandler): f = user.spawn(server_name, options) spawner = user.spawners[server_name] + user.proxy_pending = True @gen.coroutine def finish_user_spawn(f=None): @@ -340,8 +343,16 @@ class BaseHandler(RequestHandler): toc = IOLoop.current().time() self.log.info("User %s server took %.3f seconds to start", user.name, toc-tic) self.statsd.timing('spawner.success', (toc - tic) * 1000) - yield self.proxy.add_user(user) - spawner.add_poll_callback(self.user_stopped, user) + try: + yield 
self.proxy.add_user(user, server_name) + except Exception: + self.log.exception("Failed to add user %s to proxy!", user) + self.log.error("Stopping user %s to avoid inconsistent state", user) + yield user.stop() + else: + user.spawner.add_poll_callback(self.user_stopped, user) + finally: + user.proxy_pending = False try: yield gen.with_timeout(timedelta(seconds=self.slow_spawn_timeout), f) @@ -537,7 +548,7 @@ class UserSpawnHandler(BaseHandler): # logged in as correct user, spawn the server spawner = current_user.spawner - if spawner._spawn_pending: + if spawner._spawn_pending or spawner._proxy_pending: # spawn has started, but not finished self.statsd.incr('redirects.user_spawn_pending', 1) html = self.render_template("spawn_pending.html", user=current_user) diff --git a/jupyterhub/proxy.py b/jupyterhub/proxy.py index d5201b37..00279ad5 100644 --- a/jupyterhub/proxy.py +++ b/jupyterhub/proxy.py @@ -313,8 +313,7 @@ class Proxy(LoggingConfigurable): # check service routes service_routes = {r['data']['service'] for r in routes.values() if 'service' in r['data']} - for orm_service in db.query(Service).filter( - Service.server is not None): + for orm_service in db.query(Service).filter(Service.server != None): service = service_dict[orm_service.name] if service.server is None: # This should never be True, but seems to be on rare occasion. 
@@ -430,8 +429,9 @@ class ConfigurableHTTPProxy(Proxy): " I hope there is SSL termination happening somewhere else...") self.log.info("Starting proxy @ %s", public_server.bind_url) self.log.debug("Proxy cmd: %s", cmd) + shell = os.name == 'nt' try: - self.proxy_process = Popen(cmd, env=env, start_new_session=True) + self.proxy_process = Popen(cmd, env=env, start_new_session=True, shell=shell) except FileNotFoundError as e: self.log.error( "Failed to find proxy %r\n" diff --git a/jupyterhub/singleuser.py b/jupyterhub/singleuser.py index d52c0a12..5cc86c9f 100755 --- a/jupyterhub/singleuser.py +++ b/jupyterhub/singleuser.py @@ -4,9 +4,7 @@ # Copyright (c) Jupyter Development Team. # Distributed under the terms of the Modified BSD License. -from distutils.version import LooseVersion as V import os -import re from textwrap import dedent from urllib.parse import urlparse @@ -15,7 +13,7 @@ from jinja2 import ChoiceLoader, FunctionLoader from tornado.httpclient import AsyncHTTPClient from tornado import gen from tornado import ioloop -from tornado.web import HTTPError +from tornado.web import HTTPError, RequestHandler try: import notebook @@ -349,10 +347,17 @@ class SingleUserNotebookApp(NotebookApp): - check version and warn on sufficient mismatch """ client = AsyncHTTPClient() - try: - resp = yield client.fetch(self.hub_api_url) - except Exception: - self.log.exception("Failed to connect to my Hub at %s. Is it running?", self.hub_api_url) + RETRIES = 5 + for i in range(1, RETRIES+1): + try: + resp = yield client.fetch(self.hub_api_url) + except Exception: + self.log.exception("Failed to connect to my Hub at %s (attempt %i/%i). 
Is it running?", + self.hub_api_url, i, RETRIES) + yield gen.sleep(min(2**i, 16)) + else: + break + else: self.exit(1) hub_version = resp.headers.get('X-JupyterHub-Version') @@ -395,8 +400,14 @@ class SingleUserNotebookApp(NotebookApp): s['hub_prefix'] = self.hub_prefix s['hub_host'] = self.hub_host s['hub_auth'] = self.hub_auth - s['csp_report_uri'] = self.hub_host + url_path_join(self.hub_prefix, 'security/csp-report') - s.setdefault('headers', {})['X-JupyterHub-Version'] = __version__ + csp_report_uri = s['csp_report_uri'] = self.hub_host + url_path_join(self.hub_prefix, 'security/csp-report') + headers = s.setdefault('headers', {}) + headers['X-JupyterHub-Version'] = __version__ + # set CSP header directly to workaround bugs in jupyter/notebook 5.0 + headers.setdefault('Content-Security-Policy', ';'.join([ + "frame-ancestors 'self'", + "report-uri " + csp_report_uri, + ])) super(SingleUserNotebookApp, self).init_webapp() # add OAuth callback @@ -404,9 +415,21 @@ class SingleUserNotebookApp(NotebookApp): urlparse(self.hub_auth.oauth_redirect_uri).path, OAuthCallbackHandler )]) - + + # apply X-JupyterHub-Version to *all* request handlers (even redirects) + self.patch_default_headers() self.patch_templates() + def patch_default_headers(self): + if hasattr(RequestHandler, '_orig_set_default_headers'): + return + RequestHandler._orig_set_default_headers = RequestHandler.set_default_headers + def set_jupyterhub_header(self): + self._orig_set_default_headers() + self.set_header('X-JupyterHub-Version', __version__) + + RequestHandler.set_default_headers = set_jupyterhub_header + def patch_templates(self): """Patch page templates to add Hub-related buttons""" diff --git a/jupyterhub/spawner.py b/jupyterhub/spawner.py index a706e85b..80e5b3d2 100644 --- a/jupyterhub/spawner.py +++ b/jupyterhub/spawner.py @@ -8,17 +8,15 @@ Contains base Spawner class & default implementation import errno import os import pipes -import pwd import shutil import signal import sys -import grp 
import warnings from subprocess import Popen from tempfile import mkdtemp from tornado import gen -from tornado.ioloop import PeriodicCallback +from tornado.ioloop import PeriodicCallback, IOLoop from traitlets.config import LoggingConfigurable from traitlets import ( @@ -28,7 +26,7 @@ from traitlets import ( from .objects import Server from .traitlets import Command, ByteSpecification -from .utils import random_port, url_path_join +from .utils import random_port, url_path_join, DT_MIN, DT_MAX, DT_SCALE class Spawner(LoggingConfigurable): @@ -367,6 +365,25 @@ class Spawner(LoggingConfigurable): """ ).tag(config=True) + pre_spawn_hook = Any( + help=""" + An optional hook function that you can implement to do some bootstrapping work before + the spawner starts. For example, create a directory for your user or load initial content. + + This can be set independent of any concrete spawner implementation. + + Example: + + from subprocess import check_call + def my_hook(spawner): + username = spawner.user.name + check_call(['./examples/bootstrap-script/bootstrap.sh', username]) + + c.Spawner.pre_spawn_hook = my_hook + + """ + ).tag(config=True) + def load_state(self, state): """Restore state of spawner from database. 
@@ -537,6 +554,11 @@ class Spawner(LoggingConfigurable): args.extend(self.args) return args + def run_pre_spawn_hook(self): + """Run the pre_spawn_hook if defined""" + if self.pre_spawn_hook: + return self.pre_spawn_hook(self) + @gen.coroutine def start(self): """Start the single-user server @@ -643,17 +665,21 @@ class Spawner(LoggingConfigurable): self.log.exception("Unhandled error in poll callback for %s", self) return status - death_interval = Float(0.1) + death_interval = Float(DT_MIN) @gen.coroutine def wait_for_death(self, timeout=10): """Wait for the single-user server to die, up to timeout seconds""" - for i in range(int(timeout / self.death_interval)): + loop = IOLoop.current() + tic = loop.time() + dt = self.death_interval + while dt > 0: status = yield self.poll() if status is not None: break else: - yield gen.sleep(self.death_interval) + yield gen.sleep(dt) + dt = min(dt * DT_SCALE, DT_MAX, timeout - (loop.time() - tic)) def _try_setcwd(path): @@ -681,6 +707,8 @@ def set_user_setuid(username, chdir=True): Returned preexec_fn will set uid/gid, and attempt to chdir to the target user's home directory. 
""" + import grp + import pwd user = pwd.getpwnam(username) uid = user.pw_uid gid = user.pw_gid @@ -821,6 +849,7 @@ class LocalProcessSpawner(Spawner): def user_env(self, env): """Augment environment of spawned process with user specific env variables.""" + import pwd env['USER'] = self.user.name home = pwd.getpwnam(self.user.name).pw_dir shell = pwd.getpwnam(self.user.name).pw_shell diff --git a/jupyterhub/tests/test_app.py b/jupyterhub/tests/test_app.py index 2e9c228e..e879efeb 100644 --- a/jupyterhub/tests/test_app.py +++ b/jupyterhub/tests/test_app.py @@ -72,7 +72,7 @@ def test_init_tokens(io_loop): assert api_token is not None user = api_token.user assert user.name == username - + # simulate second startup, reloading same tokens: app = MockHub(db_url=db_file, api_tokens=tokens) io_loop.run_sync(lambda : app.initialize([])) @@ -82,7 +82,7 @@ def test_init_tokens(io_loop): assert api_token is not None user = api_token.user assert user.name == username - + # don't allow failed token insertion to create users: tokens['short'] = 'gman' app = MockHub(db_url=db_file, api_tokens=tokens) @@ -157,3 +157,7 @@ def test_load_groups(io_loop): gold = orm.Group.find(db, name='gold') assert gold is not None assert sorted([ u.name for u in gold.users ]) == sorted(to_load['gold']) + +def test_version(): + if sys.version_info[:2] < (3, 3): + assertRaises(ValueError) diff --git a/jupyterhub/user.py b/jupyterhub/user.py index e3c9615b..f71b992c 100644 --- a/jupyterhub/user.py +++ b/jupyterhub/user.py @@ -186,7 +186,6 @@ class User(HasTraits): self.spawners[''] = spawner # pass get/setattr to ORM user - def __getattr__(self, attr): if hasattr(self.orm_user, attr): return getattr(self.orm_user, attr) @@ -207,7 +206,7 @@ class User(HasTraits): if name not in self.spawners: return False spawner = self.spawners[name] - if spawner._spawn_pending or spawner._stop_pending: + if spawner._spawn_pending or spawner._stop_pending or spawner._proxy_pending: return False # server is not running 
if spawn or stop is still pending if spawner.server is None: return False @@ -324,6 +323,8 @@ class User(HasTraits): spawner._spawn_pending = True # wait for spawner.start to return try: + # run optional preparation work to bootstrap the notebook + yield gen.maybe_future(self.spawner.run_pre_spawn_hook()) f = spawner.start() # commit any changes in spawner.start (always commit db changes before yield) db.commit() @@ -433,10 +434,13 @@ class User(HasTraits): self.db.delete(orm_token) self.db.commit() finally: - spawner._stop_pending = False # trigger post-spawner hook on authenticator auth = spawner.authenticator - if auth: - yield gen.maybe_future( - auth.post_spawn_stop(self, spawner) - ) + try: + if auth: + yield gen.maybe_future( + auth.post_spawn_stop(self, spawner) + ) + except Exception: + self.log.exception("Error in Authenticator.post_spawn_stop for %s", self) + spawner._stop_pending = False diff --git a/jupyterhub/utils.py b/jupyterhub/utils.py index b556172c..38ece304 100644 --- a/jupyterhub/utils.py +++ b/jupyterhub/utils.py @@ -48,6 +48,12 @@ def can_connect(ip, port): else: return True +# exponential falloff factors: +# start at 100ms, falloff by 2x +# never longer than 5s +DT_MIN = 0.1 +DT_SCALE = 2 +DT_MAX = 5 @gen.coroutine def wait_for_server(ip, port, timeout=10): @@ -56,11 +62,13 @@ def wait_for_server(ip, port, timeout=10): ip = '127.0.0.1' loop = ioloop.IOLoop.current() tic = loop.time() - while loop.time() - tic < timeout: + dt = DT_MIN + while dt > 0: if can_connect(ip, port): return else: - yield gen.sleep(0.1) + yield gen.sleep(dt) + dt = min(dt * DT_SCALE, DT_MAX, timeout - (loop.time() - tic)) raise TimeoutError( "Server at {ip}:{port} didn't respond in {timeout} seconds".format(**locals()) ) @@ -75,7 +83,8 @@ def wait_for_http_server(url, timeout=10): loop = ioloop.IOLoop.current() tic = loop.time() client = AsyncHTTPClient() - while loop.time() - tic < timeout: + dt = DT_MIN + while dt > 0: try: r = yield client.fetch(url, 
follow_redirects=False) except HTTPError as e: @@ -86,16 +95,17 @@ def wait_for_http_server(url, timeout=10): # but 502 or other proxy error is conceivable app_log.warning( "Server at %s responded with error: %s", url, e.code) - yield gen.sleep(0.1) + yield gen.sleep(dt) else: app_log.debug("Server at %s responded with %s", url, e.code) return e.response except (OSError, socket.error) as e: if e.errno not in {errno.ECONNABORTED, errno.ECONNREFUSED, errno.ECONNRESET}: app_log.warning("Failed to connect to %s (%s)", url, e) - yield gen.sleep(0.1) + yield gen.sleep(dt) else: return r + dt = min(dt * DT_SCALE, DT_MAX, timeout - (loop.time() - tic)) raise TimeoutError( "Server at {url} didn't respond in {timeout} seconds".format(**locals()) diff --git a/setup.py b/setup.py index 45947822..0c751cb2 100755 --- a/setup.py +++ b/setup.py @@ -15,15 +15,16 @@ import shutil import sys v = sys.version_info -if v[:2] < (3,3): - error = "ERROR: JupyterHub requires Python version 3.3 or above." +if v[:2] < (3,4): + error = "ERROR: JupyterHub requires Python version 3.4 or above." print(error, file=sys.stderr) sys.exit(1) - +shell = False if os.name in ('nt', 'dos'): - error = "ERROR: Windows is not supported" - print(error, file=sys.stderr) + shell = True + warning = "WARNING: Windows is not officially supported" + print(warning, file=sys.stderr) # At least we're on the python version we need, move on. 
@@ -48,10 +49,10 @@ is_repo = os.path.exists(pjoin(here, '.git')) def get_data_files(): """Get data files in share/jupyter""" - + data_files = [] ntrim = len(here + os.path.sep) - + for (d, dirs, filenames) in os.walk(share_jupyter): data_files.append(( d[ntrim:], @@ -99,6 +100,7 @@ setup_args = dict( license = "BSD", platforms = "Linux, Mac OS X", keywords = ['Interactive', 'Interpreter', 'Shell', 'Web'], + python_requires = ">=3.4", classifiers = [ 'Intended Audience :: Developers', 'Intended Audience :: System Administrators', @@ -119,7 +121,7 @@ from distutils.command.build_py import build_py from distutils.command.sdist import sdist -npm_path = ':'.join([ +npm_path = os.pathsep.join([ pjoin(here, 'node_modules', '.bin'), os.environ.get("PATH", os.defpath), ]) @@ -133,27 +135,27 @@ def mtime(path): class BaseCommand(Command): """Dumb empty command because Command needs subclasses to override too much""" user_options = [] - + def initialize_options(self): pass - + def finalize_options(self): pass - + def get_inputs(self): return [] - + def get_outputs(self): return [] class Bower(BaseCommand): description = "fetch static client-side components with bower" - + user_options = [] bower_dir = pjoin(static, 'components') node_modules = pjoin(here, 'node_modules') - + def should_run(self): if not os.path.exists(self.bower_dir): return True @@ -166,26 +168,22 @@ class Bower(BaseCommand): if not os.path.exists(self.node_modules): return True return mtime(self.node_modules) < mtime(pjoin(here, 'package.json')) - + def run(self): if not self.should_run(): print("bower dependencies up to date") return - + if self.should_run_npm(): print("installing build dependencies with npm") - check_call(['npm', 'install', '--progress=false'], cwd=here) + check_call(['npm', 'install', '--progress=false'], cwd=here, shell=shell) os.utime(self.node_modules) - + env = os.environ.copy() env['PATH'] = npm_path - + args = ['bower', 'install', '--allow-root', '--config.interactive=false'] try: 
- check_call( - ['bower', 'install', '--allow-root', '--config.interactive=false'], - cwd=here, - env=env, - ) + check_call(args, cwd=here, env=env, shell=shell) except OSError as e: print("Failed to run bower: %s" % e, file=sys.stderr) print("You can install js dependencies with `npm install`", file=sys.stderr) @@ -197,11 +195,11 @@ class Bower(BaseCommand): class CSS(BaseCommand): description = "compile CSS from LESS" - + def should_run(self): """Does less need to run?""" # from IPython.html.tasks.py - + css_targets = [pjoin(static, 'css', 'style.min.css')] css_maps = [t + '.map' for t in css_targets] targets = css_targets + css_maps @@ -209,7 +207,7 @@ class CSS(BaseCommand): # some generated files don't exist return True earliest_target = sorted(mtime(t) for t in targets)[0] - + # check if any .less files are newer than the generated targets for (dirpath, dirnames, filenames) in os.walk(static): for f in filenames: @@ -218,30 +216,31 @@ class CSS(BaseCommand): timestamp = mtime(path) if timestamp > earliest_target: return True - + return False - + def run(self): if not self.should_run(): print("CSS up-to-date") return - + self.run_command('js') - + style_less = pjoin(static, 'less', 'style.less') style_css = pjoin(static, 'css', 'style.min.css') sourcemap = style_css + '.map' - + env = os.environ.copy() env['PATH'] = npm_path + args = [ + 'lessc', '--clean-css', + '--source-map-basepath={}'.format(static), + '--source-map={}'.format(sourcemap), + '--source-map-rootpath=../', + style_less, style_css, + ] try: - check_call([ - 'lessc', '--clean-css', - '--source-map-basepath={}'.format(static), - '--source-map={}'.format(sourcemap), - '--source-map-rootpath=../', - style_less, style_css, - ], cwd=here, env=env) + check_call(args, cwd=here, env=env, shell=shell) except OSError as e: print("Failed to run lessc: %s" % e, file=sys.stderr) print("You can install js dependencies with `npm install`", file=sys.stderr) diff --git 
a/share/jupyter/hub/templates/spawn_pending.html b/share/jupyter/hub/templates/spawn_pending.html index a662be52..0baf75ef 100644 --- a/share/jupyter/hub/templates/spawn_pending.html +++ b/share/jupyter/hub/templates/spawn_pending.html @@ -7,6 +7,7 @@

Your server is starting up.

You will be redirected automatically when it's ready for you.

+

refresh
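
Reviewer sketch (not part of the patch): the `jupyterhub/utils.py` and `jupyterhub/spawner.py` hunks above replace the fixed 0.1 s sleep in the polling loops with an interval that doubles each attempt, is capped at `DT_MAX`, and is trimmed so the loop never sleeps past the timeout. The snippet below isolates only that update rule. `backoff_intervals` is a hypothetical helper that does not exist in JupyterHub, and it assumes each poll returns instantly, so elapsed time is just the sum of the sleeps.

```python
# Constants copied from the jupyterhub/utils.py hunk in this patch.
DT_MIN = 0.1    # first sleep, in seconds
DT_SCALE = 2    # growth factor per attempt
DT_MAX = 5      # cap on any single sleep


def backoff_intervals(timeout, dt_min=DT_MIN, dt_scale=DT_SCALE, dt_max=DT_MAX):
    """Yield the sleep intervals the patched polling loops would use.

    Hypothetical helper for illustration only: it assumes every poll
    takes no time, so elapsed time is simply the sum of the sleeps.
    """
    elapsed = 0.0
    dt = dt_min
    while dt > 0:
        yield dt
        elapsed += dt
        # Same update rule as the patch: grow, cap, never overshoot the timeout.
        dt = min(dt * dt_scale, dt_max, timeout - elapsed)


if __name__ == "__main__":
    for dt in backoff_intervals(timeout=10):
        print("sleep %.1fs" % dt)
```

Under these assumptions a 10 s timeout needs only a handful of polls rather than about a hundred, and the loop ends once the remaining time reaches zero, which is why the patched loops test `dt > 0` instead of comparing elapsed time against the timeout.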