Limit markdown line length to 200

Ayaz Salikhov
2021-06-21 12:59:21 +03:00
parent 34992fea4c
commit 334381dfe2
22 changed files with 454 additions and 293 deletions


@@ -1,12 +1,15 @@
# Common Features
A container launched from any Jupyter Docker Stacks image runs a Jupyter Notebook server by default. The container does so by executing a `start-notebook.sh` script. This script configures the internal container environment and then runs `jupyter notebook`, passing it any command line arguments received.
A container launched from any Jupyter Docker Stacks image runs a Jupyter Notebook server by default.
The container does so by executing a `start-notebook.sh` script.
This script configures the internal container environment and then runs `jupyter notebook`, passing it any command line arguments received.
This page describes the options supported by the startup script as well as how to bypass it to run alternative commands.
## Notebook Options
You can pass [Jupyter command line options](https://jupyter-notebook.readthedocs.io/en/stable/config.html#options) to the `start-notebook.sh` script when launching the container. For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token, you can run the following:
You can pass [Jupyter command line options](https://jupyter-notebook.readthedocs.io/en/stable/config.html#options) to the `start-notebook.sh` script when launching the container.
For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token, you can run the following:
```bash
docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp.password='sha1:74ba40f8a388:c913541b7ee99d15d5ed31d4226bf7838f83a50e'
@@ -21,21 +24,60 @@ docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp
## Docker Options
You may instruct the `start-notebook.sh` script to customize the container environment before launching
the notebook server. You do so by passing arguments to the `docker run` command.
the notebook server.
You do so by passing arguments to the `docker run` command.
- `-e NB_USER=jovyan` - Instructs the startup script to change the default container username from `jovyan` to the provided value. Causes the script to rename the `jovyan` user home folder. For this option to take effect, you must run the container with `--user root`, set the working directory `-w /home/${NB_USER}` and set the environment variable `-e CHOWN_HOME=yes` (see below for detail). This feature is useful when mounting host volumes with a specific home folder.
- `-e NB_UID=1000` - Instructs the startup script to switch the numeric user ID of `${NB_USER}` to the given value. This feature is useful when mounting host volumes with specific owner permissions. For this option to take effect, you must run the container with `--user root`. (The startup script will `su ${NB_USER}` after adjusting the user ID.) You might consider using modern Docker options `--user` and `--group-add` instead. See the last bullet below for details.
- `-e NB_GID=100` - Instructs the startup script to change the primary group of `${NB_USER}` to `${NB_GID}` (the new group is added with a name of `${NB_GROUP}` if it is defined, otherwise the group is named `${NB_USER}`). This feature is useful when mounting host volumes with specific group permissions. For this option to take effect, you must run the container with `--user root`. (The startup script will `su ${NB_USER}` after adjusting the group ID.) You might consider using modern Docker options `--user` and `--group-add` instead. See the last bullet below for details. The user is added to supplemental group `users` (gid 100) in order to allow write access to the home directory and `/opt/conda`. If you override the user/group logic, ensure the user stays in group `users` if you want them to be able to modify files in the image.
- `-e NB_GROUP=<name>` - The name used for `${NB_GID}`, which defaults to `${NB_USER}`. This is only used if `${NB_GID}` is specified and completely optional: there is only cosmetic effect.
- `-e NB_UMASK=<umask>` - Configures Jupyter to use a different umask value from default, i.e. `022`. For example, if setting umask to `002`, new files will be readable and writable by group members instead of just writable by the owner. Wikipedia has a good article about [umask](https://en.wikipedia.org/wiki/Umask). Feel free to read it in order to choose the value that better fits your needs. The default value should fit most situations. Note that `NB_UMASK` when set only applies to the Jupyter process itself - you cannot use it to set a umask for additional files created during run-hooks e.g. via `pip` or `conda` - if you need to set a umask for these you must set `umask` for each command.
- `-e CHOWN_HOME=yes` - Instructs the startup script to change the `${NB_USER}` home directory owner and group to the current value of `${NB_UID}` and `${NB_GID}`. This change will take effect even if the user home directory is mounted from the host using `-v` as described below. The change is **not** applied recursively by default. You can modify the `chown` behavior by setting `CHOWN_HOME_OPTS` (e.g., `-e CHOWN_HOME_OPTS='-R'`).
- `-e CHOWN_EXTRA="<some dir>,<some other dir>"` - Instructs the startup script to change the owner and group of each comma-separated container directory to the current value of `${NB_UID}` and `${NB_GID}`. The change is **not** applied recursively by default. You can modify the `chown` behavior by setting `CHOWN_EXTRA_OPTS` (e.g., `-e CHOWN_EXTRA_OPTS='-R'`).
- `-e GRANT_SUDO=yes` - Instructs the startup script to grant the `NB_USER` user passwordless `sudo` capability. You do **not** need this option to allow the user to `conda` or `pip` install additional packages. This option is useful, however, when you wish to give `${NB_USER}` the ability to install OS packages with `apt` or modify other root-owned files in the container. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su ${NB_USER}` after adding `${NB_USER}` to sudoers.) **You should only enable `sudo` if you trust the user or if the container is running on an isolated host.**
- `-e NB_USER=jovyan` - Instructs the startup script to change the default container username from `jovyan` to the provided value.
Causes the script to rename the `jovyan` user home folder.
For this option to take effect, you must run the container with `--user root`, set the working directory `-w /home/${NB_USER}` and set the environment variable `-e CHOWN_HOME=yes` (see below for detail).
This feature is useful when mounting host volumes with a specific home folder.
- `-e NB_UID=1000` - Instructs the startup script to switch the numeric user ID of `${NB_USER}` to the given value.
This feature is useful when mounting host volumes with specific owner permissions.
For this option to take effect, you must run the container with `--user root`.
(The startup script will `su ${NB_USER}` after adjusting the user ID.)
You might consider using modern Docker options `--user` and `--group-add` instead.
See the last bullet below for details.
- `-e NB_GID=100` - Instructs the startup script to change the primary group of `${NB_USER}` to `${NB_GID}`
(the new group is added with a name of `${NB_GROUP}` if it is defined, otherwise the group is named `${NB_USER}`).
This feature is useful when mounting host volumes with specific group permissions.
For this option to take effect, you must run the container with `--user root`.
(The startup script will `su ${NB_USER}` after adjusting the group ID.)
You might consider using modern Docker options `--user` and `--group-add` instead.
See the last bullet below for details.
The user is added to supplemental group `users` (gid 100) in order to allow write access to the home directory and `/opt/conda`.
If you override the user/group logic, ensure the user stays in group `users` if you want them to be able to modify files in the image.
- `-e NB_GROUP=<name>` - The name used for `${NB_GID}`, which defaults to `${NB_USER}`.
This is only used if `${NB_GID}` is specified and completely optional: there is only cosmetic effect.
- `-e NB_UMASK=<umask>` - Configures Jupyter to use a different umask value from default, i.e. `022`.
For example, if setting umask to `002`, new files will be readable and writable by group members instead of just writable by the owner.
Wikipedia has a good article about [umask](https://en.wikipedia.org/wiki/Umask).
Feel free to read it in order to choose the value that better fits your needs.
The default value should fit most situations.
Note that `NB_UMASK` when set only applies to the Jupyter process itself - you cannot use it to set a umask for additional files created during run-hooks
e.g. via `pip` or `conda` - if you need to set a umask for these you must set `umask` for each command.
- `-e CHOWN_HOME=yes` - Instructs the startup script to change the `${NB_USER}` home directory owner and group to the current value of `${NB_UID}` and `${NB_GID}`.
This change will take effect even if the user home directory is mounted from the host using `-v` as described below.
The change is **not** applied recursively by default.
You can modify the `chown` behavior by setting `CHOWN_HOME_OPTS` (e.g., `-e CHOWN_HOME_OPTS='-R'`).
- `-e CHOWN_EXTRA="<some dir>,<some other dir>"` - Instructs the startup script to change the owner and group of each comma-separated container directory to the current value of `${NB_UID}` and `${NB_GID}`.
The change is **not** applied recursively by default.
You can modify the `chown` behavior by setting `CHOWN_EXTRA_OPTS` (e.g., `-e CHOWN_EXTRA_OPTS='-R'`).
- `-e GRANT_SUDO=yes` - Instructs the startup script to grant the `NB_USER` user passwordless `sudo` capability.
You do **not** need this option to allow the user to `conda` or `pip` install additional packages.
This option is useful, however, when you wish to give `${NB_USER}` the ability to install OS packages with `apt` or modify other root-owned files in the container.
For this option to take effect, you must run the container with `--user root`.
(The `start-notebook.sh` script will `su ${NB_USER}` after adding `${NB_USER}` to sudoers.)
**You should only enable `sudo` if you trust the user or if the container is running on an isolated host.**
- `-e GEN_CERT=yes` - Instructs the startup script to generate a self-signed SSL certificate and configure Jupyter Notebook to use it to accept encrypted HTTPS connections.
- `-e JUPYTER_ENABLE_LAB=yes` - Instructs the startup script to run `jupyter lab` instead of the default `jupyter notebook` command. Useful in container orchestration environments where setting environment variables is easier than changing command line parameters.
- `-e RESTARTABLE=yes` - Runs Jupyter in a loop so that quitting Jupyter does not cause the container to exit. This may be useful when you need to install extensions that require restarting Jupyter.
- `-v /some/host/folder/for/work:/home/jovyan/work` - Mounts a host machine directory as a folder in the container. Useful when you want to preserve notebooks and other work even after the container is destroyed. **You must grant the within-container notebook user or group (`NB_UID` or `NB_GID`) write access to the host directory (e.g., `sudo chown 1000 /some/host/folder/for/work`).**
- `--user 5000 --group-add users` - Launches the container with a specific user ID and adds that user to the `users` group so that it can modify files in the default home directory and `/opt/conda`. You can use these arguments as alternatives to setting `${NB_UID}` and `${NB_GID}`.
- `-e JUPYTER_ENABLE_LAB=yes` - Instructs the startup script to run `jupyter lab` instead of the default `jupyter notebook` command.
Useful in container orchestration environments where setting environment variables is easier than changing command line parameters.
- `-e RESTARTABLE=yes` - Runs Jupyter in a loop so that quitting Jupyter does not cause the container to exit.
This may be useful when you need to install extensions that require restarting Jupyter.
- `-v /some/host/folder/for/work:/home/jovyan/work` - Mounts a host machine directory as a folder in the container.
Useful when you want to preserve notebooks and other work even after the container is destroyed.
**You must grant the within-container notebook user or group (`NB_UID` or `NB_GID`) write access to the host directory (e.g., `sudo chown 1000 /some/host/folder/for/work`).**
- `--user 5000 --group-add users` - Launches the container with a specific user ID and adds that user to the `users` group so that it can modify files in the default home directory and `/opt/conda`.
You can use these arguments as alternatives to setting `${NB_UID}` and `${NB_GID}`.
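
As a rough sketch of how several of these options combine, the function below starts a container whose notebook user takes on the numeric UID and GID of the host user who calls it, so that files written to a mounted folder keep host ownership.
The function name `run_notebook_as_host_user` and the default `./work` path are illustrative assumptions and not part of the images; the flags themselves are the ones described in the list above.

```bash
# Hypothetical helper (not part of docker-stacks): start a notebook container
# whose NB_USER has the same numeric UID/GID as the host user calling it.
run_notebook_as_host_user() {
    # Host directory to mount; defaults to ./work (an illustrative choice).
    local host_dir="${1:-${PWD}/work}"

    # --user root is required so the startup script can adjust NB_UID/NB_GID
    # before it switches back to the notebook user.
    docker run -d -p 8888:8888 \
        --user root \
        -e NB_UID="$(id -u)" \
        -e NB_GID="$(id -g)" \
        -v "${host_dir}":/home/jovyan/work \
        jupyter/base-notebook
}
```

Call it as `run_notebook_as_host_user /some/host/folder/for/work`.
If you would rather not start the container as `root`, the `--user` and `--group-add users` arguments from the last bullet may be a simpler fit.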
## Startup Hooks
@@ -52,7 +94,8 @@ script for execution details.
## SSL Certificates
You may mount SSL key and certificate files into a container and configure Jupyter Notebook to use them to accept HTTPS connections. For example, to mount a host folder containing a `notebook.key` and `notebook.crt` and use them, you might run the following:
You may mount SSL key and certificate files into a container and configure Jupyter Notebook to use them to accept HTTPS connections.
For example, to mount a host folder containing a `notebook.key` and `notebook.crt` and use them, you might run the following:
```bash
docker run -d -p 8888:8888 \
@@ -62,7 +105,8 @@ docker run -d -p 8888:8888 \
--NotebookApp.certfile=/etc/ssl/notebook/notebook.crt
```
Alternatively, you may mount a single PEM file containing both the key and certificate. For example:
Alternatively, you may mount a single PEM file containing both the key and certificate.
For example:
```bash
docker run -d -p 8888:8888 \
@@ -71,11 +115,13 @@ docker run -d -p 8888:8888 \
--NotebookApp.certfile=/etc/ssl/notebook.pem
```
In either case, Jupyter Notebook expects the key and certificate to be a base64 encoded text file. The certificate file or PEM may contain one or more certificates (e.g., server, intermediate, and root).
In either case, Jupyter Notebook expects the key and certificate to be a base64 encoded text file.
The certificate file or PEM may contain one or more certificates (e.g., server, intermediate, and root).
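
If you already have separate key and certificate files, one way to produce the single PEM file is to concatenate them.
The sketch below assumes both inputs are already PEM text; the helper name `make_notebook_pem` and the file names in the usage note are hypothetical.
It places the key first, which is the order Python's `ssl` module documents for a combined file.

```bash
# Hypothetical helper (not part of docker-stacks): combine an existing key and
# certificate into one PEM file.
make_notebook_pem() {
    local key_file="$1"
    local cert_file="$2"
    local pem_file="$3"

    # PEM files are base64 text wrapped in BEGIN/END marker lines, so reject
    # inputs that do not contain such a marker.
    grep -q -- '-----BEGIN' "${key_file}" || { echo "not PEM text: ${key_file}" >&2; return 1; }
    grep -q -- '-----BEGIN' "${cert_file}" || { echo "not PEM text: ${cert_file}" >&2; return 1; }

    # Tighten the umask in a subshell first, because the output holds the private key.
    ( umask 077 && cat "${key_file}" "${cert_file}" > "${pem_file}" )
}
```

For example, `make_notebook_pem notebook.key notebook.crt notebook.pem` writes a file that can then be mounted and passed to `--NotebookApp.certfile`.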
For additional information about using SSL, see the following:
- The [docker-stacks/examples](https://github.com/jupyter/docker-stacks/tree/master/examples) for information about how to use [Let's Encrypt](https://letsencrypt.org/) certificates when you run these stacks on a publicly visible domain.
- The [docker-stacks/examples](https://github.com/jupyter/docker-stacks/tree/master/examples) for information about how to use
[Let's Encrypt](https://letsencrypt.org/) certificates when you run these stacks on a publicly visible domain.
- The [jupyter_notebook_config.py](https://github.com/jupyter/docker-stacks/blob/master/base-notebook/jupyter_notebook_config.py) file for how this Docker image generates a self-signed certificate.
- The [Jupyter Notebook documentation](https://jupyter-notebook.readthedocs.io/en/latest/public_server.html#securing-a-notebook-server) for best practices about securing a public notebook server in general.
@@ -83,7 +129,9 @@ For additional information about using SSL, see the following:
### start.sh
The `start-notebook.sh` script actually inherits most of its option handling capability from a more generic `start.sh` script. The `start.sh` script supports all of the features described above, but allows you to specify an arbitrary command to execute. For example, to run the text-based `ipython` console in a container, do the following:
The `start-notebook.sh` script actually inherits most of its option handling capability from a more generic `start.sh` script.
The `start.sh` script supports all of the features described above, but allows you to specify an arbitrary command to execute.
For example, to run the text-based `ipython` console in a container, do the following:
```bash
docker run -it --rm jupyter/base-notebook start.sh ipython
@@ -99,13 +147,17 @@ This script is particularly useful when you derive a new Dockerfile from this im
### Others
You can bypass the provided scripts and specify an arbitrary start command. If you do, keep in mind that features supported by the `start.sh` script and its kin will not function (e.g., `GRANT_SUDO`).
You can bypass the provided scripts and specify an arbitrary start command.
If you do, keep in mind that features supported by the `start.sh` script and its kin will not function (e.g., `GRANT_SUDO`).
## Conda Environments
The default Python 3.x [Conda environment](https://conda.io/projects/conda/en/latest/user-guide/concepts/environments.html) resides in `/opt/conda`. The `/opt/conda/bin` directory is part of the default `jovyan` user's `${PATH}`. That directory is also whitelisted for use in `sudo` commands by the `start.sh` script.
The default Python 3.x [Conda environment](https://conda.io/projects/conda/en/latest/user-guide/concepts/environments.html) resides in `/opt/conda`.
The `/opt/conda/bin` directory is part of the default `jovyan` user's `${PATH}`.
That directory is also whitelisted for use in `sudo` commands by the `start.sh` script.
The `jovyan` user has full read/write access to the `/opt/conda` directory. You can use either `pip`, `conda` or `mamba` to install new packages without any additional permissions.
The `jovyan` user has full read/write access to the `/opt/conda` directory.
You can use either `pip`, `conda` or `mamba` to install new packages without any additional permissions.
```bash
# install a package into the default (python 3.x) environment and cleanup after the installation


@@ -1,19 +1,16 @@
# Contributed Recipes
Users sometimes share interesting ways of using the Jupyter Docker Stacks. We encourage users to
[contribute these recipes](../contributing/recipes.md) to the documentation in case they prove
Users sometimes share interesting ways of using the Jupyter Docker Stacks.
We encourage users to [contribute these recipes](../contributing/recipes.md) to the documentation in case they prove
useful to other members of the community by submitting a pull request to `docs/using/recipes.md`.
The sections below capture this knowledge.
## Using `sudo` within a container
Password authentication is disabled for the `NB_USER` (e.g., `jovyan`). This choice was made to
avoid distributing images with a weak default password that users ~might~ will forget to change
before running a container on a publicly accessible host.
Password authentication is disabled for the `NB_USER` (e.g., `jovyan`).
This choice was made to avoid distributing images with a weak default password that users ~might~ will forget to change before running a container on a publicly accessible host.
You can grant the within-container `NB_USER` passwordless `sudo` access by adding
`-e GRANT_SUDO=yes` and `--user root` to your Docker command line or appropriate container
orchestrator config.
You can grant the within-container `NB_USER` passwordless `sudo` access by adding `-e GRANT_SUDO=yes` and `--user root` to your Docker command line or appropriate container orchestrator config.
For example:
@@ -21,8 +18,8 @@ For example:
docker run -it -e GRANT_SUDO=yes --user root jupyter/minimal-notebook
```
**You should only enable `sudo` if you trust the user and/or if the container is running on an
isolated host.** See [Docker security documentation](https://docs.docker.com/engine/security/userns-remap/) for more information about running containers as `root`.
**You should only enable `sudo` if you trust the user and/or if the container is running on an isolated host.**
See [Docker security documentation](https://docs.docker.com/engine/security/userns-remap/) for more information about running containers as `root`.
## Using `pip install` or `conda install` in a Child Docker image
@@ -44,7 +41,8 @@ docker build --rm -t jupyter/my-datascience-notebook .
```
To use a requirements.txt file, first create your `requirements.txt` file with the listing of
packages desired. Next, create a new Dockerfile like the one shown below.
packages desired.
Next, create a new Dockerfile like the one shown below.
```dockerfile
# Start from a core stack version
@@ -73,9 +71,8 @@ Ref: [docker-stacks/commit/79169618d571506304934a7b29039085e77db78c](https://git
## Add a Python 2.x environment
Python 2.x was removed from all images on August 10th, 2017, starting in tag `cc9feab481f7`. You can
add a Python 2.x environment by defining your own Dockerfile inheriting from one of the images like
so:
Python 2.x was removed from all images on August 10th, 2017, starting in tag `cc9feab481f7`.
You can add a Python 2.x environment by defining your own Dockerfile inheriting from one of the images like so:
```dockerfile
# Choose your desired base image
@@ -150,7 +147,8 @@ Run jupyterlab using a command such as
## Dask JupyterLab Extension
[Dask JupyterLab Extension](https://github.com/dask/dask-labextension) provides a JupyterLab extension to manage Dask clusters, as well as embed Dask's dashboard plots directly into JupyterLab panes. Create the Dockerfile as:
[Dask JupyterLab Extension](https://github.com/dask/dask-labextension) provides a JupyterLab extension to manage Dask clusters, as well as embed Dask's dashboard plots directly into JupyterLab panes.
Create the Dockerfile as:
```dockerfile
# Start from a core stack version
@@ -208,8 +206,8 @@ Credit: [Paolo D.](https://github.com/pdonorio) based on
## xgboost
You need to install conda's gcc for Python xgboost to work properly. Otherwise, you'll get an
exception about libgomp.so.1 missing GOMP_4.0.
You need to install conda's gcc for Python xgboost to work properly.
Otherwise, you'll get an exception about libgomp.so.1 missing GOMP_4.0.
```bash
conda install --quiet --yes gcc && \
@@ -233,25 +231,25 @@ Sometimes it is useful to run the Jupyter instance behind a nginx proxy, for ins
- you may have many different services in addition to Jupyter running on the same server, and want
nginx to help improve server performance by managing the connections
Here is a [quick example NGINX configuration](https://gist.github.com/cboettig/8643341bd3c93b62b5c2)
to get started. You'll need a server, a `.crt` and `.key` file for your server, and `docker` &
`docker-compose` installed. Then just download the files at that gist and run `docker-compose up -d`
to test it out. Customize the `nginx.conf` file to set the desired paths and add other services.
Here is a [quick example NGINX configuration](https://gist.github.com/cboettig/8643341bd3c93b62b5c2) to get started.
You'll need a server, a `.crt` and `.key` file for your server, and `docker` & `docker-compose` installed.
Then just download the files at that gist and run `docker-compose up -d` to test it out.
Customize the `nginx.conf` file to set the desired paths and add other services.
## Host volume mounts and notebook errors
If you are mounting a host directory as `/home/jovyan/work` in your container and you receive
permission errors or connection errors when you create a notebook, be sure that the `jovyan` user
(UID=1000 by default) has read/write access to the directory on the host. Alternatively, specify the
UID of the `jovyan` user on container startup using the `-e NB_UID` option described in the
(UID=1000 by default) has read/write access to the directory on the host.
Alternatively, specify the UID of the `jovyan` user on container startup using the `-e NB_UID` option described in the
[Common Features, Docker Options section](../using/common.html#Docker-Options).
Ref: <https://github.com/jupyter/docker-stacks/issues/199>
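
Before mounting, it can help to check which UID owns the host directory.
The function below is a minimal sketch of such a pre-flight check; the name `check_mount_owner` is hypothetical, and ownership is only one of several ways to grant access (group or world permissions may also allow writes).

```bash
# Hypothetical pre-flight check (not part of docker-stacks): report whether a
# host directory is owned by the UID the container's notebook user runs as.
check_mount_owner() {
    local host_dir="$1"
    local expected_uid="${2:-1000}"   # jovyan's default UID in these images
    local owner_uid

    # GNU stat syntax; on macOS/BSD the equivalent is `stat -f '%u'`.
    owner_uid="$(stat -c '%u' "${host_dir}")" || return 2

    if [ "${owner_uid}" = "${expected_uid}" ]; then
        echo "ok: ${host_dir} is owned by UID ${expected_uid}"
    else
        echo "mismatch: ${host_dir} is owned by UID ${owner_uid}, expected ${expected_uid}"
        return 1
    fi
}
```

Running `check_mount_owner /some/host/folder/for/work` before `docker run` shows whether you need to `chown` the directory or pass `-e NB_UID`.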
## Manpage installation
Most containers, including our Ubuntu base image, ship without manpages installed to save space. You
can use the following dockerfile to inherit from one of our images to enable manpages:
Most containers, including our Ubuntu base image, ship without manpages installed to save space.
You can use the following dockerfile to inherit from one of our images to enable manpages:
```dockerfile
# Choose your desired base image
@@ -467,21 +465,29 @@ RUN pip install --quiet --no-cache-dir jupyter_dashboards faker && \
USER root
# Ensure we overwrite the kernel config so that toree connects to cluster
RUN jupyter toree install --sys-prefix --spark_opts="--master yarn --deploy-mode client --driver-memory 512m --executor-memory 512m --executor-cores 1 --driver-java-options -Dhdp.version=2.5.3.0-37 --conf spark.hadoop.yarn.timeline-service.enabled=false"
RUN jupyter toree install --sys-prefix --spark_opts="\
    --master yarn \
    --deploy-mode client \
    --driver-memory 512m \
    --executor-memory 512m \
    --executor-cores 1 \
    --driver-java-options \
    -Dhdp.version=2.5.3.0-37 \
    --conf spark.hadoop.yarn.timeline-service.enabled=false \
"
USER ${NB_UID}
```
Credit: [britishbadger](https://github.com/britishbadger) from
[docker-stacks/issues/369](https://github.com/jupyter/docker-stacks/issues/369)
Credit: [britishbadger](https://github.com/britishbadger) from [docker-stacks/issues/369](https://github.com/jupyter/docker-stacks/issues/369)
## Run Jupyter Notebook/Lab inside an already secured environment (i.e., with no token)
(Adapted from [issue 728](https://github.com/jupyter/docker-stacks/issues/728))
The default security is very good. There are use cases, encouraged by containers, where the jupyter
container and the system it runs within lie inside the security boundary. In these use cases it is
convenient to launch the server without a password or token. In this case, you should use the
`start.sh` script to launch the server with no token:
The default security is very good.
There are use cases, encouraged by containers, where the Jupyter container and the system it runs within lie inside the security boundary.
In these use cases it is convenient to launch the server without a password or token.
In this case, you should use the `start.sh` script to launch the server with no token:
For jupyterlab:
@@ -517,7 +523,8 @@ Ref: <https://github.com/jupyter/docker-stacks/issues/675>
## Enable auto-sklearn notebooks
Using `auto-sklearn` requires `swig`, which the other notebook images lack, so it can't be experimented with. Also, there is no Conda package for `auto-sklearn`.
Using `auto-sklearn` requires `swig`, which the other notebook images lack, so it can't be experimented with.
Also, there is no Conda package for `auto-sklearn`.
```dockerfile
ARG BASE_CONTAINER=jupyter/scipy-notebook
@@ -539,7 +546,8 @@ RUN pip install --quiet --no-cache-dir auto-sklearn && \
## Enable Delta Lake in Spark notebooks
Please note that the [Delta Lake](https://delta.io/) packages are only available for Spark version > `3.0`. By adding the properties to `spark-defaults.conf`, the user no longer needs to enable Delta support in each notebook.
Please note that the [Delta Lake](https://delta.io/) packages are only available for Spark version > `3.0`.
By adding the properties to `spark-defaults.conf`, the user no longer needs to enable Delta support in each notebook.
```dockerfile
FROM jupyter/pyspark-notebook:latest


@@ -9,9 +9,13 @@ This section provides details about the second.
## Using the Docker CLI
You can launch a local Docker container from the Jupyter Docker Stacks using the [Docker command line interface](https://docs.docker.com/engine/reference/commandline/cli/). There are numerous ways to configure containers using the CLI. The following are some common patterns.
You can launch a local Docker container from the Jupyter Docker Stacks using the [Docker command line interface](https://docs.docker.com/engine/reference/commandline/cli/).
There are numerous ways to configure containers using the CLI.
The following are some common patterns.
**Example 1** This command pulls the `jupyter/scipy-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host. It then starts a container running a Jupyter Notebook server and exposes the server on host port 8888. The server logs appear in the terminal and include a URL to the notebook server.
**Example 1** This command pulls the `jupyter/scipy-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host.
It then starts a container running a Jupyter Notebook server and exposes the server on host port 8888.
The server logs appear in the terminal and include a URL to the notebook server.
```bash
$ docker run -p 8888:8888 jupyter/scipy-notebook:33add21fab64
@@ -52,7 +56,9 @@ $ docker rm d67fe77f1a84
d67fe77f1a84
```
**Example 2** This command pulls the `jupyter/r-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host. It then starts a container running a Jupyter Notebook server and exposes the server on host port 10000. The server logs appear in the terminal and include a URL to the notebook server, but with the internal container port (8888) instead of the correct host port (10000).
**Example 2** This command pulls the `jupyter/r-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host.
It then starts a container running a Jupyter Notebook server and exposes the server on host port 10000.
The server logs appear in the terminal and include a URL to the notebook server, but with the internal container port (8888) instead of the correct host port (10000).
```bash
$ docker run --rm -p 10000:8888 -v "${PWD}":/home/jovyan/work jupyter/r-notebook:33add21fab64
@@ -74,9 +80,12 @@ Executing the command: jupyter notebook
http://localhost:8888/?token=3b8dce890cb65570fb0d9c4a41ae067f7604873bd604f5ac
```
Pressing `Ctrl-C` shuts down the notebook server and immediately destroys the Docker container. Files written to `~/work` in the container remain intact. Any other changes made in the container are lost.
Pressing `Ctrl-C` shuts down the notebook server and immediately destroys the Docker container.
Files written to `~/work` in the container remain intact.
Any other changes made in the container are lost.
**Example 3** This command pulls the `jupyter/all-spark-notebook` image currently tagged `latest` from Docker Hub if an image tagged `latest` is not already present on the local host. It then starts a container named `notebook` running a JupyterLab server and exposes the server on a randomly selected port.
**Example 3** This command pulls the `jupyter/all-spark-notebook` image currently tagged `latest` from Docker Hub if an image tagged `latest` is not already present on the local host.
It then starts a container named `notebook` running a JupyterLab server and exposes the server on a randomly selected port.
```bash
docker run -d -P --name notebook jupyter/all-spark-notebook
@@ -112,12 +121,23 @@ notebook
## Using Binder
[Binder](https://mybinder.org/) is a service that allows you to create and share custom computing environments for projects in version control. You can use any of the Jupyter Docker Stacks images as a basis for a Binder-compatible Dockerfile. See the [docker-stacks example](https://mybinder.readthedocs.io/en/latest/sample_repos.html#using-a-docker-image-from-the-jupyter-docker-stacks-repository) and [Using a Dockerfile](https://mybinder.readthedocs.io/en/latest/tutorials/dockerfile.html) sections in the [Binder documentation](https://mybinder.readthedocs.io/en/latest/index.html) for instructions.
[Binder](https://mybinder.org/) is a service that allows you to create and share custom computing environments for projects in version control.
You can use any of the Jupyter Docker Stacks images as a basis for a Binder-compatible Dockerfile.
See the
[docker-stacks example](https://mybinder.readthedocs.io/en/latest/sample_repos.html#using-a-docker-image-from-the-jupyter-docker-stacks-repository) and
[Using a Dockerfile](https://mybinder.readthedocs.io/en/latest/tutorials/dockerfile.html) sections in the
[Binder documentation](https://mybinder.readthedocs.io/en/latest/index.html) for instructions.
## Using JupyterHub
You can configure JupyterHub to launch Docker containers from the Jupyter Docker Stacks images. If you've been following the [Zero to JupyterHub with Kubernetes](https://zero-to-jupyterhub.readthedocs.io/en/latest/) guide, see the [Use an existing Docker image](https://zero-to-jupyterhub.readthedocs.io/en/latest/jupyterhub/customizing/user-environment.html#choose-and-use-an-existing-docker-image) section for details. If you have a custom JupyterHub deployment, see the [Picking or building a Docker image](https://github.com/jupyterhub/dockerspawner#picking-or-building-a-docker-image) instructions for the [dockerspawner](https://github.com/jupyterhub/dockerspawner) instead.
You can configure JupyterHub to launch Docker containers from the Jupyter Docker Stacks images.
If you've been following the [Zero to JupyterHub with Kubernetes](https://zero-to-jupyterhub.readthedocs.io/en/latest/) guide,
see the [Use an existing Docker image](https://zero-to-jupyterhub.readthedocs.io/en/latest/jupyterhub/customizing/user-environment.html#choose-and-use-an-existing-docker-image) section for details.
If you have a custom JupyterHub deployment, see the [Picking or building a Docker image](https://github.com/jupyterhub/dockerspawner#picking-or-building-a-docker-image)
instructions for the [dockerspawner](https://github.com/jupyterhub/dockerspawner) instead.
## Using Other Tools and Services
You can use the Jupyter Docker Stacks with any Docker-compatible technology (e.g., [Docker Compose](https://docs.docker.com/compose/), [docker-py](https://github.com/docker/docker-py), your favorite cloud container service). See the documentation of the tool, library, or service for details about how to reference, configure, and launch containers from these images.
You can use the Jupyter Docker Stacks with any Docker-compatible technology
(e.g., [Docker Compose](https://docs.docker.com/compose/), [docker-py](https://github.com/docker/docker-py), your favorite cloud container service).
See the documentation of the tool, library, or service for details about how to reference, configure, and launch containers from these images.


@@ -13,10 +13,8 @@ This section provides details about the first.
## Core Stacks
The Jupyter team maintains a set of Docker image definitions in the
<https://github.com/jupyter/docker-stacks> GitHub
repository. The following sections describe these images including their contents, relationships,
and versioning strategy.
The Jupyter team maintains a set of Docker image definitions in the <https://github.com/jupyter/docker-stacks> GitHub repository.
The following sections describe these images including their contents, relationships, and versioning strategy.
### jupyter/base-notebook
@@ -24,8 +22,8 @@ and versioning strategy.
[Dockerfile commit history](https://github.com/jupyter/docker-stacks/commits/master/base-notebook/Dockerfile) |
[Docker Hub image tags](https://hub.docker.com/r/jupyter/base-notebook/tags/)
`jupyter/base-notebook` is a small image supporting the
[options common across all core stacks](common.md). It is the basis for all other stacks.
`jupyter/base-notebook` is a small image supporting the [options common across all core stacks](common.md).
It is the basis for all other stacks.
- Minimally-functional Jupyter Notebook server (e.g., no LaTeX support for saving notebooks as PDFs)
- [Miniforge](https://github.com/conda-forge/miniforge) Python 3.x in `/opt/conda` with two package managers
@@ -37,8 +35,7 @@ and versioning strategy.
with ownership over the `/home/jovyan` and `/opt/conda` paths
- `tini` as the container entrypoint and a `start-notebook.sh` script as the default command
- A `start-singleuser.sh` script useful for launching containers in JupyterHub
- A `start.sh` script useful for running alternative commands in the container (e.g. `ipython`,
`jupyter kernelgateway`, `jupyter lab`)
- A `start.sh` script useful for running alternative commands in the container (e.g. `ipython`, `jupyter kernelgateway`, `jupyter lab`)
- Options for a self-signed HTTPS certificate and passwordless sudo
### jupyter/minimal-notebook
@@ -188,29 +185,26 @@ communities.
### Image Relationships
The following diagram depicts the build dependency tree of the core images (i.e., the `FROM`
statements in their Dockerfiles). Any given image inherits the complete content of all ancestor
images pointing to it.
The following diagram depicts the build dependency tree of the core images (i.e., the `FROM` statements in their Dockerfiles).
Any given image inherits the complete content of all ancestor images pointing to it.
[![Image inheritance
diagram](../images/inherit.svg)](http://interactive.blockdiag.com/?compression=deflate&src=eJyFzTEPgjAQhuHdX9Gws5sQjGzujsaYKxzmQrlr2msMGv-71K0srO_3XGud9NNA8DSfgzESCFlBSdi0xkvQAKTNugw4QnL6GIU10hvX-Zh7Z24OLLq2SjaxpvP10lX35vCf6pOxELFmUbQiUz4oQhYzMc3gCrRt2cWe_FKosmSjyFHC6OS1AwdQWCtyj7sfh523_BI9hKlQ25YdOFdv5fcH0kiEMA)
### Builds
Pull requests to the `jupyter/docker-stacks` repository trigger builds of all images on GitHub
Actions. These images are for testing purposes only and are not saved for further use. When pull requests
merge to master, all images rebuild on Docker Hub and become available to `docker pull` from
Docker Hub.
Pull requests to the `jupyter/docker-stacks` repository trigger builds of all images on GitHub Actions.
These images are for testing purposes only and are not saved for further use.
When pull requests merge to master, all images rebuild on Docker Hub and become available to `docker pull` from Docker Hub.
### Versioning
The `latest` tag in each Docker Hub repository tracks the master branch `HEAD` reference on GitHub.
`latest` is a moving target, by definition, and will have backward-incompatible changes regularly.
Every image on Docker Hub also receives a 12-character tag which corresponds with the git commit SHA
that triggered the image build. You can inspect the state of the `jupyter/docker-stacks` repository
for that commit to review the definition of the image (e.g., images with tag `33add21fab64` were built
from <https://github.com/jupyter/docker-stacks/tree/33add21fab64>).
Every image on Docker Hub also receives a 12-character tag which corresponds with the git commit SHA that triggered the image build.
You can inspect the state of the `jupyter/docker-stacks` repository for that commit to review the definition of the image
(e.g., images with tag `33add21fab64` were built from <https://github.com/jupyter/docker-stacks/tree/33add21fab64>).
You must refer to git-SHA image tags when stability and reproducibility are important in your work
(e.g., `FROM jupyter/scipy-notebook:33add21fab64`, `docker run -it --rm jupyter/scipy-notebook:33add21fab64`).
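
As a rough illustration of pinning, the sketch below builds an image reference and refuses anything that is not a git-SHA tag.
The helper name `pinned_image` is hypothetical, and the pattern assumes tags are 12 lowercase hexadecimal characters, which the text above implies but does not state outright.

```python
import re

# Assumption: a git-SHA image tag is exactly 12 lowercase hex characters.
SHA_TAG = re.compile(r"[0-9a-f]{12}")


def pinned_image(repository: str, tag: str) -> str:
    """Return 'repository:tag', rejecting moving tags such as 'latest'."""
    if not SHA_TAG.fullmatch(tag):
        raise ValueError(f"{tag!r} is not a 12-character git-SHA tag")
    return f"{repository}:{tag}"


# Repository and tag are taken from the example in the text above.
print(pinned_image("jupyter/scipy-notebook", "33add21fab64"))
```

A check like this could run in a CI step before a `FROM` line or `docker run` command is generated, so that `latest` never reaches a reproducible workflow by accident.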
@@ -220,18 +214,17 @@ You should only use `latest` when a one-off container instance is acceptable
## Community Stacks
The core stacks are just a tiny sample of what's possible when combining Jupyter with other
technologies. We encourage members of the Jupyter community to create their own stacks based on the
technologies.
We encourage members of the Jupyter community to create their own stacks based on the
core images and link them below.
- [csharp-notebook is a community Jupyter Docker Stack image. Try C# in Jupyter Notebooks](https://github.com/tlinnet/csharp-notebook).
The image includes more than 200 Jupyter Notebooks with example C# code and can readily be tried
online via mybinder.org. Click here to launch
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/tlinnet/csharp-notebook/master).
The image includes more than 200 Jupyter Notebooks with example C# code and can readily be tried online via mybinder.org.
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/tlinnet/csharp-notebook/master).
- [education-notebook is a community Jupyter Docker Stack image](https://github.com/umsi-mads/education-notebook).
The image includes nbgrader and RISE on top of the datascience-notebook image. Click here to
launch it on
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/umsi-mads/education-notebook/master).
The image includes nbgrader and RISE on top of the datascience-notebook image.
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/umsi-mads/education-notebook/master).
- **crosscompass/ihaskell-notebook**
@@ -242,36 +235,34 @@ core images and link them below.
`crosscompass/ihaskell-notebook` is based on [IHaskell](https://github.com/gibiansky/IHaskell).
Includes popular packages and example notebooks.
Try it on
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/jamesdbrock/learn-you-a-haskell-notebook/master?urlpath=lab/tree/ihaskell_examples/ihaskell/IHaskell.ipynb)
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/jamesdbrock/learn-you-a-haskell-notebook/master?urlpath=lab/tree/ihaskell_examples/ihaskell/IHaskell.ipynb)
- [java-notebook is a community Jupyter Docker Stack image](https://github.com/jbindinga/java-notebook).
The image includes [IJava](https://github.com/SpencerPark/IJava) kernel on top of the
minimal-notebook image. Click here to launch it on
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/jbindinga/java-notebook/master).
The image includes [IJava](https://github.com/SpencerPark/IJava) kernel on top of the minimal-notebook image.
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/jbindinga/java-notebook/master).
- [sage-notebook](https://github.com/sharpTrick/sage-notebook) is a community Jupyter Docker Stack
image with the [sagemath](https://www.sagemath.org) kernel on top of the minimal-notebook image. Click
here to launch it on
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sharpTrick/sage-notebook/master).
- [sage-notebook](https://github.com/sharpTrick/sage-notebook)
is a community Jupyter Docker Stack image with the [sagemath](https://www.sagemath.org) kernel on top of the minimal-notebook image.
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sharpTrick/sage-notebook/master).
- [GPU-Jupyter](https://github.com/iot-salzburg/gpu-jupyter/): Leverage Jupyter Notebooks with the
power of your NVIDIA GPU and perform GPU calculations using Tensorflow and Pytorch in
collaborative notebooks. This is done by generating a Dockerfile, that consists of the
**nvidia/cuda** base image, the well-maintained **docker-stacks** that is integrated as submodule
and GPU-able libraries like **Tensorflow**, **Keras** and **PyTorch** on top of it.
power of your NVIDIA GPU and perform GPU calculations using TensorFlow and PyTorch in collaborative notebooks.
This is done by generating a Dockerfile that consists of the **nvidia/cuda** base image,
the well-maintained **docker-stacks** integrated as a submodule, and
GPU-enabled libraries like **TensorFlow**, **Keras** and **PyTorch** on top of it.
- [PRP GPU Jupyter repo](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/-/tree/prp) and [Registry](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/container_registry): PRP (Pacific Research Platform) maintained registry for jupyter stack based on NVIDIA CUDA-enabled image. Added the PRP image with Pytorch and some other python packages, and GUI Desktop notebook based on <https://github.com/jupyterhub/jupyter-remote-desktop-proxy>.
- [PRP GPU Jupyter repo](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/-/tree/prp) and [Registry](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/container_registry):
A registry maintained by PRP (Pacific Research Platform) for Jupyter stacks based on an NVIDIA CUDA-enabled image.
It adds a PRP image with PyTorch and some other Python packages, and a GUI Desktop notebook based on <https://github.com/jupyterhub/jupyter-remote-desktop-proxy>.
- [cgspatial-notebook](https://github.com/SCiO-systems/cgspatial-notebook) is a community Jupyter
Docker Stack image. The image includes major geospatial Python & R libraries on top of the
datascience-notebook image. Try it on
binder:[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/SCiO-systems/cgspatial-notebook/master)
- [cgspatial-notebook](https://github.com/SCiO-systems/cgspatial-notebook) is a community Jupyter Docker Stack image.
The image includes major geospatial Python & R libraries on top of the datascience-notebook image.
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/SCiO-systems/cgspatial-notebook/master)
- [kotlin-notebook](https://github.com/knonm/kotlin-notebook) is a community Jupyter
Docker Stack image. The image includes [Kotlin kernel for Jupyter/IPython](https://github.com/Kotlin/kotlin-jupyter) on top of the
`base-notebook` image. Try it on
Binder: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/knonm/kotlin-notebook/main)
- [kotlin-notebook](https://github.com/knonm/kotlin-notebook) is a community Jupyter Docker Stack image.
The image includes [Kotlin kernel for Jupyter/IPython](https://github.com/Kotlin/kotlin-jupyter) on top of the
`base-notebook` image.
Try it on [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/knonm/kotlin-notebook/main)
See the [contributing guide](../contributing/stacks.md) for information about how to create your own
Jupyter Docker Stack.


@@ -6,13 +6,18 @@ This page provides details about features specific to one or more images.
### Specific Docker Image Options
- `-p 4040:4040` - The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images open [SparkUI (Spark Monitoring and Instrumentation UI)](https://spark.apache.org/docs/latest/monitoring.html) at default port `4040`; this option maps port `4040` inside the Docker container to port `4040` on the host machine. Note that every new Spark context is assigned an incrementing port (i.e., 4040, 4041, 4042, etc.), so it might be necessary to open multiple ports. For example: `docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/pyspark-notebook`.
- `-p 4040:4040` - The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images open
[SparkUI (Spark Monitoring and Instrumentation UI)](https://spark.apache.org/docs/latest/monitoring.html) at default port `4040`;
this option maps port `4040` inside the Docker container to port `4040` on the host machine.
Note that every new Spark context is assigned an incrementing port (i.e., 4040, 4041, 4042, etc.), so it might be necessary to open multiple ports.
For example: `docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/pyspark-notebook`.
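
Because each additional Spark context takes the next port, the list of `-p` flags grows with the number of contexts you expect.
The following sketch generates those flags and prints the resulting command without running it.
The helper `spark_ui_port_flags` is a hypothetical name, and the sketch assumes the same port number is used on the host and in the container.

```python
import shlex


def spark_ui_port_flags(contexts: int, first_port: int = 4040) -> list[str]:
    """Build one `-p host:container` pair per expected Spark context."""
    flags: list[str] = []
    for port in range(first_port, first_port + contexts):
        flags += ["-p", f"{port}:{port}"]
    return flags


# Two contexts reproduce the example command given above.
command = [
    "docker", "run", "-d",
    "-p", "8888:8888",
    *spark_ui_port_flags(2),
    "jupyter/pyspark-notebook",
]
print(shlex.join(command))
```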
### Build an Image with a Different Version of Spark
You can build a `pyspark-notebook` image (and also the downstream `all-spark-notebook` image) with a different version of Spark by overriding the default value of the following arguments at build time.
- The Spark distribution is defined by the combination of the Spark and Hadoop versions and verified by the package checksum. See [Download Apache Spark](https://spark.apache.org/downloads.html) and the [archive repo](https://archive.apache.org/dist/spark/) for more information.
- The Spark distribution is defined by the combination of the Spark and Hadoop versions and verified by the package checksum.
See [Download Apache Spark](https://spark.apache.org/downloads.html) and the [archive repo](https://archive.apache.org/dist/spark/) for more information.
- `spark_version`: The Spark version to install (`3.0.0`).
- `hadoop_version`: The Hadoop version (`3.2`).
- `spark_checksum`: The package checksum (`BFE4540...`).
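
The arguments listed above could be assembled into a `docker build` invocation along the following lines.
This is only a sketch: the function name, the tag naming scheme, and the `./pyspark-notebook` build context (which assumes you run the command from a checkout of the `jupyter/docker-stacks` repository) are assumptions, and the checksum is left as a placeholder to be copied from the Apache archive.

```python
import shlex


def spark_build_command(spark_version: str, hadoop_version: str, spark_checksum: str) -> list[str]:
    """Assemble a `docker build` command that overrides the Spark build arguments."""
    return [
        "docker", "build",
        # Hypothetical tag naming scheme; pick whatever suits your registry.
        "-t", f"jupyter/pyspark-notebook:spark-{spark_version}",
        "--build-arg", f"spark_version={spark_version}",
        "--build-arg", f"hadoop_version={hadoop_version}",
        "--build-arg", f"spark_checksum={spark_checksum}",
        # Assumed build context: the pyspark-notebook directory of the repository.
        "./pyspark-notebook",
    ]


# The checksum below is a placeholder, not a real value.
print(shlex.join(spark_build_command("3.0.0", "3.2", "CHECKSUM_FROM_APACHE_ARCHIVE")))
```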
@@ -46,7 +51,8 @@ docker run -it --rm jupyter/pyspark-notebook:spark-2.4.7 pyspark --version
### Usage Examples
The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks. The following sections provide some examples of how to get started using them.
The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks.
The following sections provide some examples of how to get started using them.
#### Using Spark Local Mode
@@ -133,16 +139,18 @@ Connection to Spark Cluster on **[Standalone Mode](https://spark.apache.org/docs
deployed, run the same version of Spark.
1. [Deploy Spark in Standalone Mode](https://spark.apache.org/docs/latest/spark-standalone.html).
2. Run the Docker container with `--net=host` in a location that is network addressable by all of
your Spark workers. (This is a [Spark networking
requirement](https://spark.apache.org/docs/latest/cluster-overview.html#components).)
- NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`. See <https://github.com/jupyter/docker-stacks/issues/64> for details.
your Spark workers.
(This is a [Spark networking requirement](https://spark.apache.org/docs/latest/cluster-overview.html#components).)
- NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`.
See <https://github.com/jupyter/docker-stacks/issues/64> for details.
**Note**: The following examples use the Spark master URL `spark://master:7077`, which you should replace with the URL of your own Spark master.
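
The flags required alongside `--net=host` in step 2 are easy to forget, so it can help to keep them together.
The sketch below only assembles and prints the command; the helper name `host_network_run` and the `-d` flag are assumptions, not part of the instructions above.

```python
import shlex

# Flags that the note in step 2 says must accompany --net=host.
HOST_NETWORK_FLAGS = ["--net=host", "--pid=host", "-e", "TINI_SUBREAPER=true"]


def host_network_run(image: str) -> list[str]:
    """Return a `docker run` command that uses host networking for the given image."""
    return ["docker", "run", "-d", *HOST_NETWORK_FLAGS, image]


print(shlex.join(host_network_run("jupyter/pyspark-notebook")))
```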
##### Standalone Mode in Python
The **same Python version** needs to be used on the notebook (where the driver is located) and on the Spark workers.
The Python version used on the driver and worker side can be adjusted by setting the environment variables `PYSPARK_PYTHON` and / or `PYSPARK_DRIVER_PYTHON`. See [Spark Configuration][spark-conf] for more information.
The Python version used on the driver and worker side can be adjusted by setting the environment variables `PYSPARK_PYTHON` and / or `PYSPARK_DRIVER_PYTHON`.
See [Spark Configuration][spark-conf] for more information.
```python
from pyspark.sql import SparkSession