Mirror of https://github.com/jupyter/docker-stacks.git (synced 2025-10-17 15:02:57 +00:00)
Limit markdown line length to 200
@@ -1,12 +1,15 @@
|
||||
# Common Features
|
||||
|
||||
A container launched from any Jupyter Docker Stacks image runs a Jupyter Notebook server by default. The container does so by executing a `start-notebook.sh` script. This script configures the internal container environment and then runs `jupyter notebook`, passing it any command line arguments received.
|
||||
A container launched from any Jupyter Docker Stacks image runs a Jupyter Notebook server by default.
|
||||
The container does so by executing a `start-notebook.sh` script.
|
||||
This script configures the internal container environment and then runs `jupyter notebook`, passing it any command line arguments received.
|
||||
|
||||
This page describes the options supported by the startup script as well as how to bypass it to run alternative commands.
|
||||
|
||||
## Notebook Options
|
||||
|
||||
You can pass [Jupyter command line options](https://jupyter-notebook.readthedocs.io/en/stable/config.html#options) to the `start-notebook.sh` script when launching the container. For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token, you can run the following:
|
||||
You can pass [Jupyter command line options](https://jupyter-notebook.readthedocs.io/en/stable/config.html#options) to the `start-notebook.sh` script when launching the container.
|
||||
For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token, you can run the following:
|
||||
|
||||
```bash
|
||||
docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp.password='sha1:74ba40f8a388:c913541b7ee99d15d5ed31d4226bf7838f83a50e'
|
||||
@@ -21,21 +24,60 @@ docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp
|
||||
## Docker Options
|
||||
|
||||
You may instruct the `start-notebook.sh` script to customize the container environment before launching
|
||||
the notebook server. You do so by passing arguments to the `docker run` command.
|
||||
the notebook server.
|
||||
You do so by passing arguments to the `docker run` command.
|
||||
|
||||
- `-e NB_USER=jovyan` - Instructs the startup script to change the default container username from `jovyan` to the provided value. Causes the script to rename the `jovyan` user home folder. For this option to take effect, you must run the container with `--user root`, set the working directory `-w /home/${NB_USER}` and set the environment variable `-e CHOWN_HOME=yes` (see below for detail). This feature is useful when mounting host volumes with a specific home folder.
|
||||
- `-e NB_UID=1000` - Instructs the startup script to switch the numeric user ID of `${NB_USER}` to the given value. This feature is useful when mounting host volumes with specific owner permissions. For this option to take effect, you must run the container with `--user root`. (The startup script will `su ${NB_USER}` after adjusting the user ID.) You might consider using modern Docker options `--user` and `--group-add` instead. See the last bullet below for details.
|
||||
- `-e NB_GID=100` - Instructs the startup script to change the primary group of `${NB_USER}` to `${NB_GID}` (the new group is added with a name of `${NB_GROUP}` if it is defined, otherwise the group is named `${NB_USER}`). This feature is useful when mounting host volumes with specific group permissions. For this option to take effect, you must run the container with `--user root`. (The startup script will `su ${NB_USER}` after adjusting the group ID.) You might consider using modern Docker options `--user` and `--group-add` instead. See the last bullet below for details. The user is added to supplemental group `users` (gid 100) in order to allow write access to the home directory and `/opt/conda`. If you override the user/group logic, ensure the user stays in group `users` if you want them to be able to modify files in the image.
|
||||
- `-e NB_GROUP=<name>` - The name used for `${NB_GID}`, which defaults to `${NB_USER}`. This is only used if `${NB_GID}` is specified and completely optional: there is only cosmetic effect.
|
||||
- `-e NB_UMASK=<umask>` - Configures Jupyter to use a different umask value from default, i.e. `022`. For example, if setting umask to `002`, new files will be readable and writable by group members instead of just writable by the owner. Wikipedia has a good article about [umask](https://en.wikipedia.org/wiki/Umask). Feel free to read it in order to choose the value that better fits your needs. Default value should fit most situations. Note that `NB_UMASK` when set only applies to the Jupyter process itself - you cannot use it to set a umask for additional files created during run-hooks e.g. via `pip` or `conda` - if you need to set a umask for these you must set `umask` for each command.
|
||||
- `-e CHOWN_HOME=yes` - Instructs the startup script to change the `${NB_USER}` home directory owner and group to the current value of `${NB_UID}` and `${NB_GID}`. This change will take effect even if the user home directory is mounted from the host using `-v` as described below. The change is **not** applied recursively by default. You can modify the `chown` behavior by setting `CHOWN_HOME_OPTS` (e.g., `-e CHOWN_HOME_OPTS='-R'`).
|
||||
- `-e CHOWN_EXTRA="<some dir>,<some other dir>"` - Instructs the startup script to change the owner and group of each comma-separated container directory to the current value of `${NB_UID}` and `${NB_GID}`. The change is **not** applied recursively by default. You can modify the `chown` behavior by setting `CHOWN_EXTRA_OPTS` (e.g., `-e CHOWN_EXTRA_OPTS='-R'`).
|
||||
- `-e GRANT_SUDO=yes` - Instructs the startup script to grant the `NB_USER` user passwordless `sudo` capability. You do **not** need this option to allow the user to `conda` or `pip` install additional packages. This option is useful, however, when you wish to give `${NB_USER}` the ability to install OS packages with `apt` or modify other root-owned files in the container. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su ${NB_USER}` after adding `${NB_USER}` to sudoers.) **You should only enable `sudo` if you trust the user or if the container is running on an isolated host.**
|
||||
- `-e NB_USER=jovyan` - Instructs the startup script to change the default container username from `jovyan` to the provided value.
|
||||
Causes the script to rename the `jovyan` user home folder.
|
||||
For this option to take effect, you must run the container with `--user root`, set the working directory `-w /home/${NB_USER}` and set the environment variable `-e CHOWN_HOME=yes` (see below for detail).
|
||||
This feature is useful when mounting host volumes with a specific home folder.
|
||||
- `-e NB_UID=1000` - Instructs the startup script to switch the numeric user ID of `${NB_USER}` to the given value.
|
||||
This feature is useful when mounting host volumes with specific owner permissions.
|
||||
For this option to take effect, you must run the container with `--user root`.
|
||||
(The startup script will `su ${NB_USER}` after adjusting the user ID.)
|
||||
You might consider using modern Docker options `--user` and `--group-add` instead.
|
||||
See the last bullet below for details.
|
||||
- `-e NB_GID=100` - Instructs the startup script to change the primary group of `${NB_USER}` to `${NB_GID}`
|
||||
(the new group is added with a name of `${NB_GROUP}` if it is defined, otherwise the group is named `${NB_USER}`).
|
||||
This feature is useful when mounting host volumes with specific group permissions.
|
||||
For this option to take effect, you must run the container with `--user root`.
|
||||
(The startup script will `su ${NB_USER}` after adjusting the group ID.)
|
||||
You might consider using modern Docker options `--user` and `--group-add` instead.
|
||||
See the last bullet below for details.
|
||||
The user is added to supplemental group `users` (gid 100) in order to allow write access to the home directory and `/opt/conda`.
|
||||
If you override the user/group logic, ensure the user stays in group `users` if you want them to be able to modify files in the image.
|
||||
- `-e NB_GROUP=<name>` - The name used for `${NB_GID}`, which defaults to `${NB_USER}`.
|
||||
This is only used if `${NB_GID}` is specified and completely optional: there is only cosmetic effect.
|
||||
- `-e NB_UMASK=<umask>` - Configures Jupyter to use a different umask value from default, i.e. `022`.
|
||||
For example, if setting umask to `002`, new files will be readable and writable by group members instead of just writable by the owner.
|
||||
Wikipedia has a good article about [umask](https://en.wikipedia.org/wiki/Umask).
|
||||
Feel free to read it in order to choose the value that better fits your needs.
|
||||
Default value should fit most situations.
|
||||
Note that `NB_UMASK` when set only applies to the Jupyter process itself - you cannot use it to set a umask for additional files created during run-hooks
|
||||
e.g. via `pip` or `conda` - if you need to set a umask for these you must set `umask` for each command.
|
||||
- `-e CHOWN_HOME=yes` - Instructs the startup script to change the `${NB_USER}` home directory owner and group to the current value of `${NB_UID}` and `${NB_GID}`.
|
||||
This change will take effect even if the user home directory is mounted from the host using `-v` as described below.
|
||||
The change is **not** applied recursively by default.
|
||||
You can modify the `chown` behavior by setting `CHOWN_HOME_OPTS` (e.g., `-e CHOWN_HOME_OPTS='-R'`).
|
||||
- `-e CHOWN_EXTRA="<some dir>,<some other dir>"` - Instructs the startup script to change the owner and group of each comma-separated container directory to the current value of `${NB_UID}` and `${NB_GID}`.
|
||||
The change is **not** applied recursively by default.
|
||||
You can modify the `chown` behavior by setting `CHOWN_EXTRA_OPTS` (e.g., `-e CHOWN_EXTRA_OPTS='-R'`).
|
||||
- `-e GRANT_SUDO=yes` - Instructs the startup script to grant the `NB_USER` user passwordless `sudo` capability.
|
||||
You do **not** need this option to allow the user to `conda` or `pip` install additional packages.
|
||||
This option is useful, however, when you wish to give `${NB_USER}` the ability to install OS packages with `apt` or modify other root-owned files in the container.
|
||||
For this option to take effect, you must run the container with `--user root`.
|
||||
(The `start-notebook.sh` script will `su ${NB_USER}` after adding `${NB_USER}` to sudoers.)
|
||||
**You should only enable `sudo` if you trust the user or if the container is running on an isolated host.**
|
||||
- `-e GEN_CERT=yes` - Instructs the startup script to generate a self-signed SSL certificate and configure Jupyter Notebook to use it to accept encrypted HTTPS connections.
|
||||
- `-e JUPYTER_ENABLE_LAB=yes` - Instructs the startup script to run `jupyter lab` instead of the default `jupyter notebook` command. Useful in container orchestration environments where setting environment variables is easier than changing command line parameters.
|
||||
- `-e RESTARTABLE=yes` - Runs Jupyter in a loop so that quitting Jupyter does not cause the container to exit. This may be useful when you need to install extensions that require restarting Jupyter.
|
||||
- `-v /some/host/folder/for/work:/home/jovyan/work` - Mounts a host machine directory as a folder in the container. Useful when you want to preserve notebooks and other work even after the container is destroyed. **You must grant the within-container notebook user or group (`NB_UID` or `NB_GID`) write access to the host directory (e.g., `sudo chown 1000 /some/host/folder/for/work`).**
|
||||
- `--user 5000 --group-add users` - Launches the container with a specific user ID and adds that user to the `users` group so that it can modify files in the default home directory and `/opt/conda`. You can use these arguments as alternatives to setting `${NB_UID}` and `${NB_GID}`.
|
||||
- `-e JUPYTER_ENABLE_LAB=yes` - Instructs the startup script to run `jupyter lab` instead of the default `jupyter notebook` command.
|
||||
Useful in container orchestration environments where setting environment variables is easier than changing command line parameters.
|
||||
- `-e RESTARTABLE=yes` - Runs Jupyter in a loop so that quitting Jupyter does not cause the container to exit.
|
||||
This may be useful when you need to install extensions that require restarting Jupyter.
|
||||
- `-v /some/host/folder/for/work:/home/jovyan/work` - Mounts a host machine directory as a folder in the container.
|
||||
Useful when you want to preserve notebooks and other work even after the container is destroyed.
|
||||
**You must grant the within-container notebook user or group (`NB_UID` or `NB_GID`) write access to the host directory (e.g., `sudo chown 1000 /some/host/folder/for/work`).**
|
||||
- `--user 5000 --group-add users` - Launches the container with a specific user ID and adds that user to the `users` group so that it can modify files in the default home directory and `/opt/conda`.
|
||||
You can use these arguments as alternatives to setting `${NB_UID}` and `${NB_GID}`.
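
As a rough sketch of how the user-related options above combine, the following prints (rather than runs) a `docker run` command that renames the container user and matches its IDs to the invoking host user. The username `alice` and the helper name `print_run_command` are illustrative assumptions; the flags themselves are the ones described in the list above.

```bash
# Hypothetical helper: prints a docker run command that renames the container
# user and gives it the UID/GID of the current host user.
print_run_command() {
    nb_user="$1"    # desired username inside the container (illustrative)
    echo "docker run -it --rm -p 8888:8888" \
        "--user root" \
        "-e NB_USER=${nb_user}" \
        "-e NB_UID=$(id -u)" \
        "-e NB_GID=$(id -g)" \
        "-e CHOWN_HOME=yes" \
        "-w /home/${nb_user}" \
        "jupyter/base-notebook"
}

# Review the printed command first, then paste it into your shell to run it.
print_run_command alice
```

Printing the command instead of executing it makes it easy to check that `--user root`, `-w`, and `CHOWN_HOME` are all present, since `NB_USER` has no effect without them.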
|
||||
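
To make the effect of the `NB_UMASK` values mentioned above concrete, this small sketch creates one file under each umask on any POSIX shell, without Docker. The directory and file names are illustrative; only the values `022` and `002` come from the option description.

```bash
# Create one file under each umask and print the resulting permission bits.
demo_dir="$(mktemp -d)"

# Subshells keep the umask change from leaking into the current shell.
( umask 022 && touch "${demo_dir}/default" )         # the default value
( umask 002 && touch "${demo_dir}/group-writable" )  # e.g. NB_UMASK=002

# Show the mode string and file name, skipping the "total" line.
ls -l "${demo_dir}" | awk 'NR > 1 { print $1, $NF }'
```

The mode for `default` should begin with `-rw-r--r--` and the one for `group-writable` with `-rw-rw-r--`, which is the group-write difference the option description refers to.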
|
||||
## Startup Hooks
|
||||
|
||||
@@ -52,7 +94,8 @@ script for execution details.
|
||||
|
||||
## SSL Certificates
|
||||
|
||||
You may mount SSL key and certificate files into a container and configure Jupyter Notebook to use them to accept HTTPS connections. For example, to mount a host folder containing a `notebook.key` and `notebook.crt` and use them, you might run the following:
|
||||
You may mount SSL key and certificate files into a container and configure Jupyter Notebook to use them to accept HTTPS connections.
|
||||
For example, to mount a host folder containing a `notebook.key` and `notebook.crt` and use them, you might run the following:
|
||||
|
||||
```bash
|
||||
docker run -d -p 8888:8888 \
|
||||
@@ -62,7 +105,8 @@ docker run -d -p 8888:8888 \
|
||||
--NotebookApp.certfile=/etc/ssl/notebook/notebook.crt
|
||||
```
|
||||
|
||||
Alternatively, you may mount a single PEM file containing both the key and certificate. For example:
|
||||
Alternatively, you may mount a single PEM file containing both the key and certificate.
|
||||
For example:
|
||||
|
||||
```bash
|
||||
docker run -d -p 8888:8888 \
|
||||
@@ -71,11 +115,13 @@ docker run -d -p 8888:8888 \
|
||||
--NotebookApp.certfile=/etc/ssl/notebook.pem
|
||||
```
|
||||
|
||||
In either case, Jupyter Notebook expects the key and certificate to be a base64 encoded text file. The certificate file or PEM may contain one or more certificates (e.g., server, intermediate, and root).
|
||||
In either case, Jupyter Notebook expects the key and certificate to be a base64 encoded text file.
|
||||
The certificate file or PEM may contain one or more certificates (e.g., server, intermediate, and root).
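
For local testing, a key and certificate in this text format can be produced with `openssl`. The following is a minimal sketch, assuming `openssl` is installed on the host; the file names, the 365-day validity, and the `localhost` subject are illustrative choices, not requirements of the images.

```bash
# Work in a scratch directory (illustrative; any directory works).
cert_dir="$(mktemp -d)"

# Generate an unencrypted 2048-bit RSA key and a self-signed certificate.
openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj "/CN=localhost" \
    -keyout "${cert_dir}/notebook.key" \
    -out "${cert_dir}/notebook.crt" 2>/dev/null

# Optionally combine both into a single PEM file.
cat "${cert_dir}/notebook.key" "${cert_dir}/notebook.crt" > "${cert_dir}/notebook.pem"
```

Either the key and certificate pair or the combined PEM can then be mounted into the container as shown in the commands above. Browsers will warn about a self-signed certificate, so this is only a stand-in for a certificate issued by a trusted authority.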
|
||||
|
||||
For additional information about using SSL, see the following:
|
||||
|
||||
- The [docker-stacks/examples](https://github.com/jupyter/docker-stacks/tree/master/examples) for information about how to use [Let's Encrypt](https://letsencrypt.org/) certificates when you run these stacks on a publicly visible domain.
|
||||
- The [docker-stacks/examples](https://github.com/jupyter/docker-stacks/tree/master/examples) for information about how to use
|
||||
[Let's Encrypt](https://letsencrypt.org/) certificates when you run these stacks on a publicly visible domain.
|
||||
- The [jupyter_notebook_config.py](https://github.com/jupyter/docker-stacks/blob/master/base-notebook/jupyter_notebook_config.py) file for how this Docker image generates a self-signed certificate.
|
||||
- The [Jupyter Notebook documentation](https://jupyter-notebook.readthedocs.io/en/latest/public_server.html#securing-a-notebook-server) for best practices about securing a public notebook server in general.
|
||||
|
||||
@@ -83,7 +129,9 @@ For additional information about using SSL, see the following:
|
||||
|
||||
### start.sh
|
||||
|
||||
The `start-notebook.sh` script actually inherits most of its option handling capability from a more generic `start.sh` script. The `start.sh` script supports all of the features described above, but allows you to specify an arbitrary command to execute. For example, to run the text-based `ipython` console in a container, do the following:
|
||||
The `start-notebook.sh` script actually inherits most of its option handling capability from a more generic `start.sh` script.
|
||||
The `start.sh` script supports all of the features described above, but allows you to specify an arbitrary command to execute.
|
||||
For example, to run the text-based `ipython` console in a container, do the following:
|
||||
|
||||
```bash
|
||||
docker run -it --rm jupyter/base-notebook start.sh ipython
|
||||
@@ -99,13 +147,17 @@ This script is particularly useful when you derive a new Dockerfile from this im
|
||||
|
||||
### Others
|
||||
|
||||
You can bypass the provided scripts and specify an arbitrary start command. If you do, keep in mind that features supported by the `start.sh` script and its kin will not function (e.g., `GRANT_SUDO`).
|
||||
You can bypass the provided scripts and specify an arbitrary start command.
|
||||
If you do, keep in mind that features supported by the `start.sh` script and its kin will not function (e.g., `GRANT_SUDO`).
|
||||
|
||||
## Conda Environments
|
||||
|
||||
The default Python 3.x [Conda environment](https://conda.io/projects/conda/en/latest/user-guide/concepts/environments.html) resides in `/opt/conda`. The `/opt/conda/bin` directory is part of the default `jovyan` user's `${PATH}`. That directory is also whitelisted for use in `sudo` commands by the `start.sh` script.
|
||||
The default Python 3.x [Conda environment](https://conda.io/projects/conda/en/latest/user-guide/concepts/environments.html) resides in `/opt/conda`.
|
||||
The `/opt/conda/bin` directory is part of the default `jovyan` user's `${PATH}`.
|
||||
That directory is also whitelisted for use in `sudo` commands by the `start.sh` script.
|
||||
|
||||
The `jovyan` user has full read/write access to the `/opt/conda` directory. You can use either `pip`, `conda` or `mamba` to install new packages without any additional permissions.
|
||||
The `jovyan` user has full read/write access to the `/opt/conda` directory.
|
||||
You can use either `pip`, `conda` or `mamba` to install new packages without any additional permissions.
|
||||
|
||||
```bash
|
||||
# install a package into the default (python 3.x) environment and cleanup after the installation
|
||||
|
@@ -1,19 +1,16 @@
|
||||
# Contributed Recipes
|
||||
|
||||
Users sometimes share interesting ways of using the Jupyter Docker Stacks. We encourage users to
|
||||
[contribute these recipes](../contributing/recipes.md) to the documentation in case they prove
|
||||
Users sometimes share interesting ways of using the Jupyter Docker Stacks.
|
||||
We encourage users to [contribute these recipes](../contributing/recipes.md) to the documentation in case they prove
|
||||
useful to other members of the community by submitting a pull request to `docs/using/recipes.md`.
|
||||
The sections below capture this knowledge.
|
||||
|
||||
## Using `sudo` within a container
|
||||
|
||||
Password authentication is disabled for the `NB_USER` (e.g., `jovyan`). This choice was made to
|
||||
avoid distributing images with a weak default password that users ~might~ will forget to change
|
||||
before running a container on a publicly accessible host.
|
||||
Password authentication is disabled for the `NB_USER` (e.g., `jovyan`).
|
||||
This choice was made to avoid distributing images with a weak default password that users ~might~ will forget to change before running a container on a publicly accessible host.
|
||||
|
||||
You can grant the within-container `NB_USER` passwordless `sudo` access by adding
|
||||
`-e GRANT_SUDO=yes` and `--user root` to your Docker command line or appropriate container
|
||||
orchestrator config.
|
||||
You can grant the within-container `NB_USER` passwordless `sudo` access by adding `-e GRANT_SUDO=yes` and `--user root` to your Docker command line or appropriate container orchestrator config.
|
||||
|
||||
For example:
|
||||
|
||||
@@ -21,8 +18,8 @@ For example:
|
||||
docker run -it -e GRANT_SUDO=yes --user root jupyter/minimal-notebook
|
||||
```
|
||||
|
||||
**You should only enable `sudo` if you trust the user and/or if the container is running on an
|
||||
isolated host.** See [Docker security documentation](https://docs.docker.com/engine/security/userns-remap/) for more information about running containers as `root`.
|
||||
**You should only enable `sudo` if you trust the user and/or if the container is running on an isolated host.**
|
||||
See [Docker security documentation](https://docs.docker.com/engine/security/userns-remap/) for more information about running containers as `root`.
|
||||
|
||||
## Using `pip install` or `conda install` in a Child Docker image
|
||||
|
||||
@@ -44,7 +41,8 @@ docker build --rm -t jupyter/my-datascience-notebook .
|
||||
```
|
||||
|
||||
To use a requirements.txt file, first create your `requirements.txt` file with the listing of
|
||||
packages desired. Next, create a new Dockerfile like the one shown below.
|
||||
packages desired.
|
||||
Next, create a new Dockerfile like the one shown below.
|
||||
|
||||
```dockerfile
|
||||
# Start from a core stack version
|
||||
@@ -73,9 +71,8 @@ Ref: [docker-stacks/commit/79169618d571506304934a7b29039085e77db78c](https://git
|
||||
|
||||
## Add a Python 2.x environment
|
||||
|
||||
Python 2.x was removed from all images on August 10th, 2017, starting in tag `cc9feab481f7`. You can
|
||||
add a Python 2.x environment by defining your own Dockerfile inheriting from one of the images like
|
||||
so:
|
||||
Python 2.x was removed from all images on August 10th, 2017, starting in tag `cc9feab481f7`.
|
||||
You can add a Python 2.x environment by defining your own Dockerfile inheriting from one of the images like so:
|
||||
|
||||
```dockerfile
|
||||
# Choose your desired base image
|
||||
@@ -150,7 +147,8 @@ Run jupyterlab using a command such as
|
||||
|
||||
## Dask JupyterLab Extension
|
||||
|
||||
[Dask JupyterLab Extension](https://github.com/dask/dask-labextension) provides a JupyterLab extension to manage Dask clusters, as well as embed Dask's dashboard plots directly into JupyterLab panes. Create the Dockerfile as:
|
||||
[Dask JupyterLab Extension](https://github.com/dask/dask-labextension) provides a JupyterLab extension to manage Dask clusters, as well as embed Dask's dashboard plots directly into JupyterLab panes.
|
||||
Create the Dockerfile as:
|
||||
|
||||
```dockerfile
|
||||
# Start from a core stack version
|
||||
@@ -208,8 +206,8 @@ Credit: [Paolo D.](https://github.com/pdonorio) based on
|
||||
|
||||
## xgboost
|
||||
|
||||
You need to install conda's gcc for Python xgboost to work properly. Otherwise, you'll get an
|
||||
exception about libgomp.so.1 missing GOMP_4.0.
|
||||
You need to install conda's gcc for Python xgboost to work properly.
|
||||
Otherwise, you'll get an exception about libgomp.so.1 missing GOMP_4.0.
|
||||
|
||||
```bash
|
||||
conda install --quiet --yes gcc && \
|
||||
@@ -233,25 +231,25 @@ Sometimes it is useful to run the Jupyter instance behind a nginx proxy, for ins
|
||||
- you may have many different services in addition to Jupyter running on the same server, and want
|
||||
nginx to help improve server performance by managing the connections
|
||||
|
||||
Here is a [quick example NGINX configuration](https://gist.github.com/cboettig/8643341bd3c93b62b5c2)
|
||||
to get started. You'll need a server, a `.crt` and `.key` file for your server, and `docker` &
|
||||
`docker-compose` installed. Then just download the files at that gist and run `docker-compose up -d`
|
||||
to test it out. Customize the `nginx.conf` file to set the desired paths and add other services.
|
||||
Here is a [quick example NGINX configuration](https://gist.github.com/cboettig/8643341bd3c93b62b5c2) to get started.
|
||||
You'll need a server, a `.crt` and `.key` file for your server, and `docker` & `docker-compose` installed.
|
||||
Then just download the files at that gist and run `docker-compose up -d` to test it out.
|
||||
Customize the `nginx.conf` file to set the desired paths and add other services.
|
||||
|
||||
## Host volume mounts and notebook errors
|
||||
|
||||
If you are mounting a host directory as `/home/jovyan/work` in your container and you receive
|
||||
permission errors or connection errors when you create a notebook, be sure that the `jovyan` user
|
||||
(UID=1000 by default) has read/write access to the directory on the host. Alternatively, specify the
|
||||
UID of the `jovyan` user on container startup using the `-e NB_UID` option described in the
|
||||
(UID=1000 by default) has read/write access to the directory on the host.
|
||||
Alternatively, specify the UID of the `jovyan` user on container startup using the `-e NB_UID` option described in the
|
||||
[Common Features, Docker Options section](../using/common.html#Docker-Options)
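
One way to check which of the two remedies applies is to look at the numeric owner of the host directory before mounting it. This is a sketch only: the helper name `dir_owner_uid` and the use of the current directory are assumptions for illustration, and ownership is just one part of the picture (group and other permissions can also grant access).

```bash
# Hypothetical helper: prints the numeric UID that owns a directory.
dir_owner_uid() {
    ls -ldn "$1" | awk '{ print $3 }'
}

host_dir="${PWD}"   # directory you plan to mount (illustrative)
owner_uid="$(dir_owner_uid "${host_dir}")"

if [ "${owner_uid}" = "1000" ]; then
    echo "${host_dir} is owned by UID 1000, the default jovyan UID"
else
    echo "${host_dir} is owned by UID ${owner_uid}; consider --user root -e NB_UID=${owner_uid}"
fi
```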
|
||||
|
||||
Ref: <https://github.com/jupyter/docker-stacks/issues/199>
|
||||
|
||||
## Manpage installation
|
||||
|
||||
Most containers, including our Ubuntu base image, ship without manpages installed to save space. You
|
||||
can use the following dockerfile to inherit from one of our images to enable manpages:
|
||||
Most containers, including our Ubuntu base image, ship without manpages installed to save space.
|
||||
You can use the following dockerfile to inherit from one of our images to enable manpages:
|
||||
|
||||
```dockerfile
|
||||
# Choose your desired base image
|
||||
@@ -467,21 +465,29 @@ RUN pip install --quiet --no-cache-dir jupyter_dashboards faker && \
|
||||
|
||||
USER root
|
||||
# Ensure we overwrite the kernel config so that toree connects to cluster
|
||||
RUN jupyter toree install --sys-prefix --spark_opts="--master yarn --deploy-mode client --driver-memory 512m --executor-memory 512m --executor-cores 1 --driver-java-options -Dhdp.version=2.5.3.0-37 --conf spark.hadoop.yarn.timeline-service.enabled=false"
|
||||
RUN jupyter toree install --sys-prefix --spark_opts="\
    --master yarn \
    --deploy-mode client \
    --driver-memory 512m \
    --executor-memory 512m \
    --executor-cores 1 \
    --driver-java-options \
    -Dhdp.version=2.5.3.0-37 \
    --conf spark.hadoop.yarn.timeline-service.enabled=false \
    "
|
||||
USER ${NB_UID}
|
||||
```
|
||||
|
||||
Credit: [britishbadger](https://github.com/britishbadger) from
|
||||
[docker-stacks/issues/369](https://github.com/jupyter/docker-stacks/issues/369)
|
||||
Credit: [britishbadger](https://github.com/britishbadger) from [docker-stacks/issues/369](https://github.com/jupyter/docker-stacks/issues/369)
|
||||
|
||||
## Run Jupyter Notebook/Lab inside an already secured environment (i.e., with no token)
|
||||
|
||||
(Adapted from [issue 728](https://github.com/jupyter/docker-stacks/issues/728))
|
||||
|
||||
The default security is very good. There are use cases, encouraged by containers, where the jupyter
|
||||
container and the system it runs within lie inside the security boundary. In these use cases it is
|
||||
convenient to launch the server without a password or token. In this case, you should use the
|
||||
`start.sh` script to launch the server with no token:
|
||||
The default security is very good.
|
||||
There are use cases, encouraged by containers, where the Jupyter container and the system it runs within lie inside the security boundary.
|
||||
In these use cases it is convenient to launch the server without a password or token.
|
||||
In this case, you should use the `start.sh` script to launch the server with no token:
|
||||
|
||||
For jupyterlab:
|
||||
|
||||
@@ -517,7 +523,8 @@ Ref: <https://github.com/jupyter/docker-stacks/issues/675>
|
||||
|
||||
## Enable auto-sklearn notebooks
|
||||
|
||||
Using `auto-sklearn` requires `swig`, which the other notebook images lack, so it can't be experimented with. Also, there is no Conda package for `auto-sklearn`.
|
||||
Using `auto-sklearn` requires `swig`, which the other notebook images lack, so it can't be experimented with.
|
||||
Also, there is no Conda package for `auto-sklearn`.
|
||||
|
||||
```dockerfile
|
||||
ARG BASE_CONTAINER=jupyter/scipy-notebook
|
||||
@@ -539,7 +546,8 @@ RUN pip install --quiet --no-cache-dir auto-sklearn && \
|
||||
|
||||
## Enable Delta Lake in Spark notebooks
|
||||
|
||||
Please note that the [Delta Lake](https://delta.io/) packages are only available for Spark version > `3.0`. By adding the properties to `spark-defaults.conf`, the user no longer needs to enable Delta support in each notebook.
|
||||
Please note that the [Delta Lake](https://delta.io/) packages are only available for Spark version > `3.0`.
|
||||
By adding the properties to `spark-defaults.conf`, the user no longer needs to enable Delta support in each notebook.
|
||||
|
||||
```dockerfile
|
||||
FROM jupyter/pyspark-notebook:latest
|
||||
|
@@ -9,9 +9,13 @@ This section provides details about the second.
|
||||
|
||||
## Using the Docker CLI
|
||||
|
||||
You can launch a local Docker container from the Jupyter Docker Stacks using the [Docker command line interface](https://docs.docker.com/engine/reference/commandline/cli/). There are numerous ways to configure containers using the CLI. The following are some common patterns.
|
||||
You can launch a local Docker container from the Jupyter Docker Stacks using the [Docker command line interface](https://docs.docker.com/engine/reference/commandline/cli/).
|
||||
There are numerous ways to configure containers using the CLI.
|
||||
The following are some common patterns.
|
||||
|
||||
**Example 1** This command pulls the `jupyter/scipy-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host. It then starts a container running a Jupyter Notebook server and exposes the server on host port 8888. The server logs appear in the terminal and include a URL to the notebook server.
|
||||
**Example 1** This command pulls the `jupyter/scipy-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host.
|
||||
It then starts a container running a Jupyter Notebook server and exposes the server on host port 8888.
|
||||
The server logs appear in the terminal and include a URL to the notebook server.
|
||||
|
||||
```bash
|
||||
$ docker run -p 8888:8888 jupyter/scipy-notebook:33add21fab64
|
||||
@@ -52,7 +56,9 @@ $ docker rm d67fe77f1a84
|
||||
d67fe77f1a84
|
||||
```
|
||||
|
||||
**Example 2** This command pulls the `jupyter/r-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host. It then starts a container running a Jupyter Notebook server and exposes the server on host port 10000. The server logs appear in the terminal and include a URL to the notebook server, but with the internal container port (8888) instead of the correct host port (10000).
|
||||
**Example 2** This command pulls the `jupyter/r-notebook` image tagged `33add21fab64` from Docker Hub if it is not already present on the local host.
|
||||
It then starts a container running a Jupyter Notebook server and exposes the server on host port 10000.
|
||||
The server logs appear in the terminal and include a URL to the notebook server, but with the internal container port (8888) instead of the correct host port (10000).
|
||||
|
||||
```bash
|
||||
$ docker run --rm -p 10000:8888 -v "${PWD}":/home/jovyan/work jupyter/r-notebook:33add21fab64
|
||||
@@ -74,9 +80,12 @@ Executing the command: jupyter notebook
|
||||
http://localhost:8888/?token=3b8dce890cb65570fb0d9c4a41ae067f7604873bd604f5ac
|
||||
```
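As a minimal sketch of correcting the logged URL by hand, the following Python helper swaps the port in a URL copied from the logs for the published host port. The function name `rewrite_port` is hypothetical and not part of the Docker Stacks tooling; it relies only on the standard library.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_port(url: str, host_port: int) -> str:
    """Return `url` with its port replaced by `host_port`.

    Hypothetical helper for illustration; it assumes a plain hostname
    (an IPv6 literal would need its brackets restored).
    """
    parts = urlsplit(url)
    # Rebuild the network location with the published host port.
    netloc = f"{parts.hostname}:{host_port}"
    return urlunsplit(parts._replace(netloc=netloc))


# URL as printed in the container logs (container port 8888).
logged = "http://localhost:8888/?token=3b8dce890cb65570fb0d9c4a41ae067f7604873bd604f5ac"
print(rewrite_port(logged, 10000))
```

The token is left untouched, so the rewritten URL can be opened directly in a browser on the host.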
|
||||
|
||||
Pressing `Ctrl-C` shuts down the notebook server and immediately destroys the Docker container. Files written to `~/work` in the container remain intact in the mounted host directory. Any other changes made in the container are lost.
|
||||
Pressing `Ctrl-C` shuts down the notebook server and immediately destroys the Docker container.
|
||||
Files written to `~/work` in the container remain intact in the mounted host directory.
|
||||
Any other changes made in the container are lost.
|
||||
|
||||
**Example 3** This command pulls the `jupyter/all-spark-notebook` image currently tagged `latest` from Docker Hub if an image tagged `latest` is not already present on the local host. It then starts a container named `notebook` running a JupyterLab server and exposes the server on a randomly selected port.
|
||||
**Example 3** This command pulls the `jupyter/all-spark-notebook` image currently tagged `latest` from Docker Hub if an image tagged `latest` is not already present on the local host.
|
||||
It then starts a container named `notebook` running a JupyterLab server and exposes the server on a randomly selected port.
|
||||
|
||||
```bash
|
||||
docker run -d -P --name notebook jupyter/all-spark-notebook
|
||||
@@ -112,12 +121,23 @@ notebook
|
||||
|
||||
## Using Binder
|
||||
|
||||
[Binder](https://mybinder.org/) is a service that allows you to create and share custom computing environments for projects in version control. You can use any of the Jupyter Docker Stacks images as a basis for a Binder-compatible Dockerfile. See the [docker-stacks example](https://mybinder.readthedocs.io/en/latest/sample_repos.html#using-a-docker-image-from-the-jupyter-docker-stacks-repository) and [Using a Dockerfile](https://mybinder.readthedocs.io/en/latest/tutorials/dockerfile.html) sections in the [Binder documentation](https://mybinder.readthedocs.io/en/latest/index.html) for instructions.
|
||||
[Binder](https://mybinder.org/) is a service that allows you to create and share custom computing environments for projects in version control.
|
||||
You can use any of the Jupyter Docker Stacks images as a basis for a Binder-compatible Dockerfile.
|
||||
See the
|
||||
[docker-stacks example](https://mybinder.readthedocs.io/en/latest/sample_repos.html#using-a-docker-image-from-the-jupyter-docker-stacks-repository) and
|
||||
[Using a Dockerfile](https://mybinder.readthedocs.io/en/latest/tutorials/dockerfile.html) sections in the
|
||||
[Binder documentation](https://mybinder.readthedocs.io/en/latest/index.html) for instructions.
|
||||
|
||||
## Using JupyterHub
|
||||
|
||||
You can configure JupyterHub to launch Docker containers from the Jupyter Docker Stacks images. If you've been following the [Zero to JupyterHub with Kubernetes](https://zero-to-jupyterhub.readthedocs.io/en/latest/) guide, see the [Use an existing Docker image](https://zero-to-jupyterhub.readthedocs.io/en/latest/jupyterhub/customizing/user-environment.html#choose-and-use-an-existing-docker-image) section for details. If you have a custom JupyterHub deployment, see the [Picking or building a Docker image](https://github.com/jupyterhub/dockerspawner#picking-or-building-a-docker-image) instructions for the [dockerspawner](https://github.com/jupyterhub/dockerspawner) instead.
|
||||
You can configure JupyterHub to launch Docker containers from the Jupyter Docker Stacks images.
|
||||
If you've been following the [Zero to JupyterHub with Kubernetes](https://zero-to-jupyterhub.readthedocs.io/en/latest/) guide,
|
||||
see the [Use an existing Docker image](https://zero-to-jupyterhub.readthedocs.io/en/latest/jupyterhub/customizing/user-environment.html#choose-and-use-an-existing-docker-image) section for details.
|
||||
If you have a custom JupyterHub deployment, see the [Picking or building a Docker image](https://github.com/jupyterhub/dockerspawner#picking-or-building-a-docker-image)
|
||||
instructions for the [dockerspawner](https://github.com/jupyterhub/dockerspawner) instead.
|
||||
|
||||
## Using Other Tools and Services
|
||||
|
||||
You can use the Jupyter Docker Stacks with any Docker-compatible technology (e.g., [Docker Compose](https://docs.docker.com/compose/), [docker-py](https://github.com/docker/docker-py), your favorite cloud container service). See the documentation of the tool, library, or service for details about how to reference, configure, and launch containers from these images.
|
||||
You can use the Jupyter Docker Stacks with any Docker-compatible technology
|
||||
(e.g., [Docker Compose](https://docs.docker.com/compose/), [docker-py](https://github.com/docker/docker-py), your favorite cloud container service).
|
||||
See the documentation of the tool, library, or service for details about how to reference, configure, and launch containers from these images.
|
||||
|
@@ -13,10 +13,8 @@ This section provides details about the first.
|
||||
|
||||
## Core Stacks
|
||||
|
||||
The Jupyter team maintains a set of Docker image definitions in the
|
||||
<https://github.com/jupyter/docker-stacks> GitHub
|
||||
repository. The following sections describe these images including their contents, relationships,
|
||||
and versioning strategy.
|
||||
The Jupyter team maintains a set of Docker image definitions in the <https://github.com/jupyter/docker-stacks> GitHub repository.
|
||||
The following sections describe these images including their contents, relationships, and versioning strategy.
|
||||
|
||||
### jupyter/base-notebook
|
||||
|
||||
@@ -24,8 +22,8 @@ and versioning strategy.
|
||||
[Dockerfile commit history](https://github.com/jupyter/docker-stacks/commits/master/base-notebook/Dockerfile) |
|
||||
[Docker Hub image tags](https://hub.docker.com/r/jupyter/base-notebook/tags/)
|
||||
|
||||
`jupyter/base-notebook` is a small image supporting the
|
||||
[options common across all core stacks](common.md). It is the basis for all other stacks.
|
||||
`jupyter/base-notebook` is a small image supporting the [options common across all core stacks](common.md).
|
||||
It is the basis for all other stacks.
|
||||
|
||||
- Minimally-functional Jupyter Notebook server (e.g., no LaTeX support for saving notebooks as PDFs)
|
||||
- [Miniforge](https://github.com/conda-forge/miniforge) Python 3.x in `/opt/conda` with two package managers
|
||||
@@ -37,8 +35,7 @@ and versioning strategy.
|
||||
with ownership over the `/home/jovyan` and `/opt/conda` paths
|
||||
- `tini` as the container entrypoint and a `start-notebook.sh` script as the default command
|
||||
- A `start-singleuser.sh` script useful for launching containers in JupyterHub
|
||||
- A `start.sh` script useful for running alternative commands in the container (e.g. `ipython`,
|
||||
`jupyter kernelgateway`, `jupyter lab`)
|
||||
- A `start.sh` script useful for running alternative commands in the container (e.g. `ipython`, `jupyter kernelgateway`, `jupyter lab`)
|
||||
- Options for a self-signed HTTPS certificate and passwordless sudo
|
||||
|
||||
### jupyter/minimal-notebook
|
||||
@@ -188,29 +185,26 @@ communities.
|
||||
|
||||
### Image Relationships
|
||||
|
||||
The following diagram depicts the build dependency tree of the core images (i.e., the `FROM`
|
||||
statements in their Dockerfiles). Any given image inherits the complete content of all ancestor
|
||||
images pointing to it.
|
||||
The following diagram depicts the build dependency tree of the core images (i.e., the `FROM` statements in their Dockerfiles).
|
||||
Any given image inherits the complete content of all ancestor images pointing to it.
|
||||
|
||||
[](http://interactive.blockdiag.com/?compression=deflate&src=eJyFzTEPgjAQhuHdX9Gws5sQjGzujsaYKxzmQrlr2msMGv-71K0srO_3XGud9NNA8DSfgzESCFlBSdi0xkvQAKTNugw4QnL6GIU10hvX-Zh7Z24OLLq2SjaxpvP10lX35vCf6pOxELFmUbQiUz4oQhYzMc3gCrRt2cWe_FKosmSjyFHC6OS1AwdQWCtyj7sfh523_BI9hKlQ25YdOFdv5fcH0kiEMA)
|
||||
|
||||
### Builds
|
||||
|
||||
Pull requests to the `jupyter/docker-stacks` repository trigger builds of all images on GitHub
|
||||
Actions. These images are for testing purposes only and are not saved for further use. When pull requests
|
||||
merge to master, all images rebuild on Docker Hub and become available to `docker pull` from
|
||||
Docker Hub.
|
||||
Pull requests to the `jupyter/docker-stacks` repository trigger builds of all images on GitHub Actions.
|
||||
These images are for testing purposes only and are not saved for further use.
|
||||
When pull requests merge to master, all images rebuild on Docker Hub and become available to `docker pull` from Docker Hub.
|
||||
|
||||
### Versioning
|
||||
|
||||
The `latest` tag in each Docker Hub repository tracks the master branch `HEAD` reference on GitHub.
|
||||
`latest` is a moving target, by definition, and will have backward-incompatible changes regularly.
|
||||
|
||||
Every image on Docker Hub also receives a 12-character tag which corresponds with the git commit SHA
|
||||
that triggered the image build. You can inspect the state of the `jupyter/docker-stacks` repository
|
||||
for that commit to review the definition of the image (e.g., images with tag `33add21fab64` were built
|
||||
from <https://github.com/jupyter/docker-stacks/tree/33add21fab64>).
|
||||
Every image on Docker Hub also receives a 12-character tag which corresponds with the git commit SHA that triggered the image build.
|
||||
You can inspect the state of the `jupyter/docker-stacks` repository for that commit to review the definition of the image
|
||||
(e.g., images with tag `33add21fab64` were built from <https://github.com/jupyter/docker-stacks/tree/33add21fab64>).
|
||||
|
||||
You must refer to git-SHA image tags when stability and reproducibility are important in your work.
|
||||
(e.g. `FROM jupyter/scipy-notebook:33add21fab64`, `docker run -it --rm jupyter/scipy-notebook:33add21fab64`).
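The tagging scheme above can be sketched in a few lines of Python. This is an illustration only: the helper names are hypothetical, and it assumes the 12-character tag is the leading 12 characters of the commit SHA.

```python
def image_tag(commit_sha: str) -> str:
    """Return the 12-character image tag for a git commit SHA.

    Assumption for this sketch: the tag is the SHA's first 12 characters.
    """
    if len(commit_sha) < 12:
        raise ValueError("need at least 12 characters of the commit SHA")
    return commit_sha[:12]


def pinned_image(repository: str, commit_sha: str) -> str:
    """Build a fully pinned image reference such as repo:tag."""
    return f"{repository}:{image_tag(commit_sha)}"


def definition_url(commit_sha: str) -> str:
    """Build the GitHub tree URL holding the image definition for that tag."""
    return f"https://github.com/jupyter/docker-stacks/tree/{image_tag(commit_sha)}"


print(pinned_image("jupyter/scipy-notebook", "33add21fab64"))
print(definition_url("33add21fab64"))
```

Pinning the reference this way keeps a `FROM` line or a `docker run` command pointing at the same image content even after `latest` moves on.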
|
||||
@@ -220,18 +214,17 @@ You should only use `latest` when a one-off container instance is acceptable
|
||||
## Community Stacks
|
||||
|
||||
The core stacks are just a tiny sample of what's possible when combining Jupyter with other
|
||||
technologies. We encourage members of the Jupyter community to create their own stacks based on the
|
||||
technologies.
|
||||
We encourage members of the Jupyter community to create their own stacks based on the
|
||||
core images and link them below.
|
||||
|
||||
- [csharp-notebook is a community Jupyter Docker Stack image. Try C# in Jupyter Notebooks](https://github.com/tlinnet/csharp-notebook).
|
||||
The image includes more than 200 Jupyter Notebooks with example C# code and can readily be tried
|
||||
online via mybinder.org. Click here to launch
|
||||
[Binder](https://mybinder.org/v2/gh/tlinnet/csharp-notebook/master).
|
||||
The image includes more than 200 Jupyter Notebooks with example C# code and can readily be tried online via mybinder.org.
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/tlinnet/csharp-notebook/master).
|
||||
|
||||
- [education-notebook is a community Jupyter Docker Stack image](https://github.com/umsi-mads/education-notebook).
|
||||
The image includes nbgrader and RISE on top of the datascience-notebook image. Click here to
|
||||
launch it on
|
||||
[Binder](https://mybinder.org/v2/gh/umsi-mads/education-notebook/master).
|
||||
The image includes nbgrader and RISE on top of the datascience-notebook image.
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/umsi-mads/education-notebook/master).
|
||||
|
||||
- **crosscompass/ihaskell-notebook**
|
||||
|
||||
@@ -242,36 +235,34 @@ core images and link them below.
|
||||
`crosscompass/ihaskell-notebook` is based on [IHaskell](https://github.com/gibiansky/IHaskell).
|
||||
Includes popular packages and example notebooks.
|
||||
|
||||
Try it on
|
||||
[Binder](https://mybinder.org/v2/gh/jamesdbrock/learn-you-a-haskell-notebook/master?urlpath=lab/tree/ihaskell_examples/ihaskell/IHaskell.ipynb)
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/jamesdbrock/learn-you-a-haskell-notebook/master?urlpath=lab/tree/ihaskell_examples/ihaskell/IHaskell.ipynb)
|
||||
|
||||
- [java-notebook is a community Jupyter Docker Stack image](https://github.com/jbindinga/java-notebook).
|
||||
The image includes the [IJava](https://github.com/SpencerPark/IJava) kernel on top of the
|
||||
minimal-notebook image. Click here to launch it on
|
||||
[Binder](https://mybinder.org/v2/gh/jbindinga/java-notebook/master).
|
||||
The image includes the [IJava](https://github.com/SpencerPark/IJava) kernel on top of the minimal-notebook image.
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/jbindinga/java-notebook/master).
|
||||
|
||||
- [sage-notebook](https://github.com/sharpTrick/sage-notebook) is a community Jupyter Docker Stack
|
||||
image with the [sagemath](https://www.sagemath.org) kernel on top of the minimal-notebook image. Click
|
||||
here to launch it on
|
||||
[Binder](https://mybinder.org/v2/gh/sharpTrick/sage-notebook/master).
|
||||
- [sage-notebook](https://github.com/sharpTrick/sage-notebook)
|
||||
is a community Jupyter Docker Stack image with the [sagemath](https://www.sagemath.org) kernel on top of the minimal-notebook image.
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/sharpTrick/sage-notebook/master).
|
||||
|
||||
- [GPU-Jupyter](https://github.com/iot-salzburg/gpu-jupyter/): Leverage Jupyter Notebooks with the
|
||||
power of your NVIDIA GPU and perform GPU calculations using TensorFlow and PyTorch in
|
||||
collaborative notebooks. This is done by generating a Dockerfile that consists of the
|
||||
**nvidia/cuda** base image, the well-maintained **docker-stacks** integrated as a submodule,
|
||||
and GPU-capable libraries like **TensorFlow**, **Keras**, and **PyTorch** on top of it.
|
||||
power of your NVIDIA GPU and perform GPU calculations using TensorFlow and PyTorch in collaborative notebooks.
|
||||
This is done by generating a Dockerfile that consists of the **nvidia/cuda** base image,
|
||||
the well-maintained **docker-stacks** integrated as a submodule, and
|
||||
GPU-capable libraries like **TensorFlow**, **Keras**, and **PyTorch** on top of it.
|
||||
|
||||
- [PRP GPU Jupyter repo](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/-/tree/prp) and [Registry](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/container_registry): A registry maintained by PRP (Pacific Research Platform) for Jupyter stacks based on an NVIDIA CUDA-enabled image. It adds a PRP image with PyTorch and some other Python packages, and a GUI Desktop notebook based on <https://github.com/jupyterhub/jupyter-remote-desktop-proxy>.
|
||||
- [PRP GPU Jupyter repo](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/-/tree/prp) and [Registry](https://gitlab.nautilus.optiputer.net/prp/jupyter-stack/container_registry)
|
||||
A registry maintained by PRP (Pacific Research Platform) for Jupyter stacks based on an NVIDIA CUDA-enabled image.
|
||||
It adds a PRP image with PyTorch and some other Python packages, and a GUI Desktop notebook based on <https://github.com/jupyterhub/jupyter-remote-desktop-proxy>.
|
||||
|
||||
- [cgspatial-notebook](https://github.com/SCiO-systems/cgspatial-notebook) is a community Jupyter
|
||||
Docker Stack image. The image includes major geospatial Python & R libraries on top of the
|
||||
datascience-notebook image. Try it on
|
||||
[Binder](https://mybinder.org/v2/gh/SCiO-systems/cgspatial-notebook/master)
|
||||
- [cgspatial-notebook](https://github.com/SCiO-systems/cgspatial-notebook) is a community Jupyter Docker Stack image.
|
||||
The image includes major geospatial Python & R libraries on top of the datascience-notebook image.
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/SCiO-systems/cgspatial-notebook/master)
|
||||
|
||||
- [kotlin-notebook](https://github.com/knonm/kotlin-notebook) is a community Jupyter
|
||||
Docker Stack image. The image includes [Kotlin kernel for Jupyter/IPython](https://github.com/Kotlin/kotlin-jupyter) on top of the
|
||||
`base-notebook` image. Try it on
|
||||
[Binder](https://mybinder.org/v2/gh/knonm/kotlin-notebook/main)
|
||||
- [kotlin-notebook](https://github.com/knonm/kotlin-notebook) is a community Jupyter Docker Stack image.
|
||||
The image includes [Kotlin kernel for Jupyter/IPython](https://github.com/Kotlin/kotlin-jupyter) on top of the
|
||||
`base-notebook` image.
|
||||
Try it on [Binder](https://mybinder.org/v2/gh/knonm/kotlin-notebook/main)
|
||||
|
||||
See the [contributing guide](../contributing/stacks.md) for information about how to create your own
|
||||
Jupyter Docker Stack.
|
||||
|
@@ -6,13 +6,18 @@ This page provides details about features specific to one or more images.
|
||||
|
||||
### Specific Docker Image Options
|
||||
|
||||
- `-p 4040:4040` - The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images open [SparkUI (Spark Monitoring and Instrumentation UI)](https://spark.apache.org/docs/latest/monitoring.html) at the default port `4040`. This option maps port `4040` inside the Docker container to port `4040` on the host machine. Note that every new Spark context is put onto an incrementing port (i.e., 4040, 4041, 4042, etc.), so it might be necessary to open multiple ports. For example: `docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/pyspark-notebook`.
|
||||
- `-p 4040:4040` - The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images open
|
||||
[SparkUI (Spark Monitoring and Instrumentation UI)](https://spark.apache.org/docs/latest/monitoring.html) at the default port `4040`.
|
||||
This option maps port `4040` inside the Docker container to port `4040` on the host machine.
|
||||
Note that every new Spark context is put onto an incrementing port (i.e., 4040, 4041, 4042, etc.), so it might be necessary to open multiple ports.
|
||||
For example: `docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/pyspark-notebook`.
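As a rough sketch of the incrementing-port pattern, the Python snippet below generates one `-p` mapping per expected Spark context and assembles the resulting `docker run` command line. The helper name `spark_ui_port_flags` is hypothetical, and the snippet only builds the command as text; it does not invoke Docker.

```python
import shlex


def spark_ui_port_flags(n_contexts: int, first_port: int = 4040) -> list[str]:
    """Return `-p` flags publishing one Spark UI port per Spark context.

    Hypothetical helper: assumes ports are assigned consecutively
    starting at `first_port`, as described above.
    """
    flags: list[str] = []
    for port in range(first_port, first_port + n_contexts):
        # Map the same port number on the host and inside the container.
        flags += ["-p", f"{port}:{port}"]
    return flags


command = [
    "docker", "run", "-d",
    "-p", "8888:8888",          # notebook server
    *spark_ui_port_flags(2),    # Spark UI for two contexts
    "jupyter/pyspark-notebook",
]
print(shlex.join(command))
```

With two contexts, the printed command matches the example above; raise `n_contexts` if more Spark contexts will be open at the same time.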
|
||||
|
||||
### Build an Image with a Different Version of Spark
|
||||
|
||||
You can build a `pyspark-notebook` image (and also the downstream `all-spark-notebook` image) with a different version of Spark by overriding the default value of the following arguments at build time.
|
||||
|
||||
- Spark distribution is defined by the combination of the Spark and the Hadoop version and verified by the package checksum, see [Download Apache Spark](https://spark.apache.org/downloads.html) and the [archive repo](https://archive.apache.org/dist/spark/) for more information.
|
||||
- Spark distribution is defined by the combination of the Spark and the Hadoop version and verified by the package checksum,
|
||||
see [Download Apache Spark](https://spark.apache.org/downloads.html) and the [archive repo](https://archive.apache.org/dist/spark/) for more information.
|
||||
- `spark_version`: The Spark version to install (`3.0.0`).
|
||||
- `hadoop_version`: The Hadoop version (`3.2`).
|
||||
- `spark_checksum`: The package checksum (`BFE4540...`).
|
||||
@@ -46,7 +51,8 @@ docker run -it --rm jupyter/pyspark-notebook:spark-2.4.7 pyspark --version
|
||||
|
||||
### Usage Examples
|
||||
|
||||
The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks. The following sections provide some examples of how to get started using them.
|
||||
The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks.
|
||||
The following sections provide some examples of how to get started using them.
|
||||
|
||||
#### Using Spark Local Mode
|
||||
|
||||
@@ -133,16 +139,18 @@ Connection to Spark Cluster on **[Standalone Mode](https://spark.apache.org/docs
|
||||
deployed, run the same version of Spark.
|
||||
1. [Deploy Spark in Standalone Mode](https://spark.apache.org/docs/latest/spark-standalone.html).
|
||||
2. Run the Docker container with `--net=host` in a location that is network addressable by all of
|
||||
your Spark workers. (This is a [Spark networking
|
||||
requirement](https://spark.apache.org/docs/latest/cluster-overview.html#components).)
|
||||
- NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`. See <https://github.com/jupyter/docker-stacks/issues/64> for details.
|
||||
your Spark workers.
|
||||
(This is a [Spark networking requirement](https://spark.apache.org/docs/latest/cluster-overview.html#components).)
|
||||
- NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`.
|
||||
See <https://github.com/jupyter/docker-stacks/issues/64> for details.
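To make the flag requirement in the note above concrete, here is a small, hypothetical Python check that reports which companion flags are missing from a `docker run` argument list using `--net=host`. It is only a sketch: it matches the exact spellings shown above and does not recognise equivalent forms.

```python
def missing_host_mode_flags(args: list[str]) -> list[str]:
    """Return the companion flags that `--net=host` needs but `args` lacks.

    Hypothetical helper for illustration; it looks for the literal
    strings `--net=host`, `--pid=host` and `TINI_SUBREAPER=true`.
    """
    if "--net=host" not in args:
        return []  # the requirement only applies to host networking
    missing = []
    if "--pid=host" not in args:
        missing.append("--pid=host")
    if "TINI_SUBREAPER=true" not in args:
        missing.append("-e TINI_SUBREAPER=true")
    return missing


incomplete = ["docker", "run", "--net=host", "jupyter/pyspark-notebook"]
complete = [
    "docker", "run", "--net=host", "--pid=host",
    "-e", "TINI_SUBREAPER=true", "jupyter/pyspark-notebook",
]
print(missing_host_mode_flags(incomplete))
print(missing_host_mode_flags(complete))
```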
|
||||
|
||||
**Note**: The following examples use the Spark master URL `spark://master:7077`, which you should replace with the URL of your own Spark master.
|
||||
|
||||
##### Standalone Mode in Python
|
||||
|
||||
The **same Python version** needs to be used on the notebook (where the driver is located) and on the Spark workers.
|
||||
The Python version used on the driver and worker sides can be adjusted by setting the environment variables `PYSPARK_PYTHON` and/or `PYSPARK_DRIVER_PYTHON`; see [Spark Configuration][spark-conf] for more information.
|
||||
The Python version used on the driver and worker sides can be adjusted by setting the environment variables `PYSPARK_PYTHON` and/or `PYSPARK_DRIVER_PYTHON`,
|
||||
see [Spark Configuration][spark-conf] for more information.
|
||||
|
||||
```python
|
||||
from pyspark.sql import SparkSession
|
||||
|