Mirror of https://github.com/jupyter/docker-stacks.git (synced 2025-10-15 05:52:57 +00:00)
debian, miniconda, notebook version, option updates
* Upgrade to latest debian base image
* Upgrade to Notebook 4.3
* Upgrade to Miniconda 4.2.12
* Remove USE_HTTPS env var in favor of command line options for key and cert
* Add GEN_CERT env var for generating a self-signed certificate
* Remove PASSWORD env var in favor of the new Notebook 4.3 default token auth or the more secure hashed password command line option
@@ -4,7 +4,7 @@
## What it Gives You
* Jupyter Notebook 4.2.x
* Jupyter Notebook 4.3.x
* Conda Python 3.x and Python 2.7.x environments
* pyspark, pandas, matplotlib, scipy, seaborn, scikit-learn pre-installed
* Spark 2.0.2 with Hadoop 2.7 for use in local mode or to connect to a cluster of Spark workers
@@ -13,16 +13,18 @@
* [tini](https://github.com/krallin/tini) as the container entrypoint and [start-notebook.sh](../base-notebook/start-notebook.sh) as the default command
* A [start-singleuser.sh](../base-notebook/start-singleuser.sh) script useful for running a single-user instance of the Notebook server, as required by JupyterHub
* A [start.sh](../base-notebook/start.sh) script useful for running alternative commands in the container (e.g. `ipython`, `jupyter kernelgateway`, `jupyter lab`)
* Options for HTTPS, password auth, and passwordless `sudo`
* Options for a self-signed HTTPS certificate and passwordless `sudo`
## Basic Use
The following command starts a container with the Notebook server listening for HTTP connections on port 8888 without authentication configured.
The following command starts a container with the Notebook server listening for HTTP connections on port 8888 with a randomly generated authentication token configured.
```
docker run -d -p 8888:8888 jupyter/pyspark-notebook
docker run -it --rm -p 8888:8888 jupyter/pyspark-notebook
```
Take note of the authentication token included in the notebook startup log messages. Include it in the URL you visit to access the Notebook server or enter it in the Notebook login form.
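
As a rough sketch of how the token fits into the URL, the snippet below assembles the address by hand. The token value and the `localhost` host name are placeholders, not values taken from this image; the real token appears in the notebook startup log messages.

```shell
# Placeholder token; copy the real one from the notebook startup log.
TOKEN="replace-with-token-from-log"
# Assumes the container port is published on localhost:8888, as in the command above.
URL="http://localhost:8888/?token=${TOKEN}"
echo "$URL"
```

If the container was started detached with `-d`, the same log messages should be available through `docker logs` for that container.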
## Using Spark Local Mode
This configuration is nice for using Spark on small, local data.
@@ -100,7 +102,7 @@ Connection to Spark Cluster on Standalone Mode requires the following set of ste
The Docker container executes a [`start-notebook.sh` script](../base-notebook/start-notebook.sh) by default. The `start-notebook.sh` script handles the `NB_UID` and `GRANT_SUDO` features documented in the next section, and then executes `jupyter notebook`.
You can pass [Jupyter command line options](https://jupyter.readthedocs.io/en/latest/projects/jupyter-command.html) through the `start-notebook.sh` script when launching the container. For example, to secure the Notebook server with a password hashed using `IPython.lib.passwd()`, run the following:
You can pass [Jupyter command line options](https://jupyter.readthedocs.io/en/latest/projects/jupyter-command.html) through the `start-notebook.sh` script when launching the container. For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token, run the following:
```
docker run -d -p 8888:8888 jupyter/pyspark-notebook start-notebook.sh --NotebookApp.password='sha1:74ba40f8a388:c913541b7ee99d15d5ed31d4226bf7838f83a50e'
@@ -112,25 +114,45 @@ For example, to set the base URL of the notebook server, run the following:
docker run -d -p 8888:8888 jupyter/pyspark-notebook start-notebook.sh --NotebookApp.base_url=/some/path
```
For example, to disable all authentication mechanisms (not a recommended practice):
```
docker run -d -p 8888:8888 jupyter/pyspark-notebook start-notebook.sh --NotebookApp.token=''
```
You can sidestep the `start-notebook.sh` script and run your own commands in the container. See the *Alternative Commands* section later in this document for more information.
## Docker Options
You may customize the execution of the Docker container and the Notebook server it contains with the following optional arguments.
You may customize the execution of the Docker container and the command it is running with the following optional arguments.
* `-e PASSWORD="YOURPASS"` - Configures Jupyter Notebook to require the given plain-text password. Should be combined with `USE_HTTPS` on untrusted networks. **Note** that this option is not as secure as passing a pre-hashed password on the command line as shown above.
* `-e USE_HTTPS=yes` - Configures Jupyter Notebook to accept encrypted HTTPS connections. If a `pem` file containing a SSL certificate and key is not provided (see below), the container will generate a self-signed certificate for you.
* `-e GEN_CERT=yes` - Generates a self-signed SSL certificate and configures Jupyter Notebook to use it to accept encrypted HTTPS connections.
* `-e NB_UID=1000` - Specify the uid of the `jovyan` user. Useful to mount host volumes with specific file ownership. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su jovyan` after adjusting the user id.)
* `-e GRANT_SUDO=yes` - Gives the `jovyan` user passwordless `sudo` capability. Useful for installing OS packages. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su jovyan` after adding `jovyan` to sudoers.) **You should only enable `sudo` if you trust the user or if the container is running on an isolated host.**
* `-v /some/host/folder/for/work:/home/jovyan/work` - Mounts a host folder as the default working directory in the container, so work is preserved even when the container is destroyed and recreated (e.g., during an upgrade).
* `-v /some/host/folder/for/server.pem:/home/jovyan/.local/share/jupyter/notebook.pem` - Mounts a SSL certificate plus key for `USE_HTTPS`. Useful if you have a real certificate for the domain under which you are running the Notebook server.
* `-p 4040:4040` - Opens the default port for the [Spark Monitoring and Instrumentation UI](http://spark.apache.org/docs/latest/monitoring.html). Note that every new Spark context is put onto an incrementing port (i.e., 4040, 4041, 4042, etc.) by default, so it might be necessary to open multiple ports using a command like `docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/pyspark-notebook`. You can also control the port Spark uses for its web UI with the `spark.ui.port` config option.
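
The options above can be combined in one `docker run` invocation. The following is a minimal sketch, not a recommended configuration: the uid `1100` is a hypothetical value, and the host folder is the placeholder path used in the list above. It only builds and prints the command so it can be reviewed before running.

```shell
# Hypothetical uid: use the one that owns your host work folder.
HOST_UID=1100
# Placeholder host path, as in the option list above.
WORK_DIR="/some/host/folder/for/work"
IMAGE="jupyter/pyspark-notebook"

# --user root is required for NB_UID to take effect.
CMD="docker run -d -p 8888:8888 --user root -e NB_UID=${HOST_UID} -e GEN_CERT=yes -v ${WORK_DIR}:/home/jovyan/work ${IMAGE}"

# Print the assembled command for review.
echo "$CMD"
```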
## SSL Certificates
The notebook server configuration in this Docker image expects the `notebook.pem` file mentioned above to contain a base64 encoded SSL key and at least one base64 encoded SSL certificate. The file may contain additional certificates (e.g., intermediate and root certificates).
You may mount SSL key and certificate files into a container and configure Jupyter Notebook to use them to accept HTTPS connections. For example, to mount a host folder containing a `notebook.key` and `notebook.crt`:
If you have your key and certificate(s) as separate files, you must concatenate them together into the single expected PEM file. Alternatively, you can build your own configuration and Docker image in which you pass the key and certificate separately.
```
docker run -d -p 8888:8888 \
-v /some/host/folder:/etc/ssl/notebook \
jupyter/pyspark-notebook start-notebook.sh \
--NotebookApp.keyfile=/etc/ssl/notebook/notebook.key \
--NotebookApp.certfile=/etc/ssl/notebook/notebook.crt
```
Alternatively, you may mount a single PEM file containing both the key and certificate. For example:
```
docker run -d -p 8888:8888 \
-v /some/host/folder/notebook.pem:/etc/ssl/notebook.pem \
jupyter/pyspark-notebook start-notebook.sh \
--NotebookApp.certfile=/etc/ssl/notebook.pem
```
In either case, Jupyter Notebook expects the key and certificate to be base64 encoded text files. The certificate file or PEM may contain one or more certificates (e.g., server, intermediate, and root).
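
If the key and certificate exist as separate files, a single PEM can be produced by concatenating them. The helper below is an illustrative sketch; the function name `make_pem` and the file names in the usage note are assumptions, not part of this image.

```shell
# Concatenate a private key and a certificate (or certificate chain)
# into one PEM file. All three paths are supplied by the caller.
make_pem() {
    key_file="$1"
    cert_file="$2"
    pem_file="$3"
    cat "$key_file" "$cert_file" > "$pem_file"
}
```

For example, `make_pem notebook.key notebook.crt notebook.pem` would write a `notebook.pem` that can then be mounted as in the single-file command above.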
For additional information about using SSL, see the following:
@@ -138,6 +160,9 @@ For additional information about using SSL, see the following:
* The [jupyter_notebook_config.py](jupyter_notebook_config.py) file for how this Docker image generates a self-signed certificate.
* The [Jupyter Notebook documentation](https://jupyter-notebook.readthedocs.io/en/latest/public_server.html#using-ssl-for-encrypted-communication) for best practices about running a public notebook server in general, most of which are encoded in this image.
## Conda Environments
The default Python 3.x [Conda environment](http://conda.pydata.org/docs/using/envs.html) resides in `/opt/conda`. A second Python 2.x Conda environment exists in `/opt/conda/envs/python2`. You can [switch to the python2 environment](http://conda.pydata.org/docs/using/envs.html#change-environments-activate-deactivate) in a shell by entering the following: