Merge branch 'master' into asalikhov/rename_cloud_to_hub

This commit is contained in:
Ayaz Salikhov
2021-05-07 01:44:12 +03:00
61 changed files with 1857 additions and 1462 deletions

View File

@@ -166,11 +166,13 @@ ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0", "--allow-root"]
```
And build the image as:
```bash
docker build -t jupyter/scipy-dasklabextension:latest .
```
Once built, run using the command:
```bash
docker run -it --rm -p 8888:8888 -p 8787:8787 jupyter/scipy-dasklabextension:latest
```
@@ -273,6 +275,7 @@ ARG BASE_CONTAINER=ubuntu:focal-20200423@sha256:238e696992ba9913d24cfc3727034985
```
For Ubuntu 18.04 (bionic) and earlier, you may also require to workaround for a mandb bug, which was fixed in mandb >= 2.8.6.1:
```dockerfile
# https://git.savannah.gnu.org/cgit/man-db.git/commit/?id=8197d7824f814c5d4b992b4c8730b5b0f7ec589a
# http://launchpadlibrarian.net/435841763/man-db_2.8.5-2_2.8.6-1.diff.gz

View File

@@ -13,8 +13,8 @@ You can launch a local Docker container from the Jupyter Docker Stacks using the
**Example 1** This command pulls the `jupyter/scipy-notebook` image tagged `2c80cf3537ca` from Docker Hub if it is not already present on the local host. It then starts a container running a Jupyter Notebook server and exposes the server on host port 8888. The server logs appear in the terminal and include a URL to the notebook server.
```
docker run -p 8888:8888 jupyter/scipy-notebook:2c80cf3537ca
```bash
$ docker run -p 8888:8888 jupyter/scipy-notebook:2c80cf3537ca
Executing the command: jupyter notebook
[I 15:33:00.567 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
@@ -35,27 +35,27 @@ Executing the command: jupyter notebook
Pressing `Ctrl-C` shuts down the notebook server but leaves the container intact on disk for later restart or permanent deletion using commands like the following:
```
```bash
# list containers
docker ps -a
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d67fe77f1a84 jupyter/base-notebook "tini -- start-noteb…" 44 seconds ago Exited (0) 39 seconds ago cocky_mirzakhani
# start the stopped container
docker start -a d67fe77f1a84
$ docker start -a d67fe77f1a84
Executing the command: jupyter notebook
[W 16:45:02.020 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
...
# remove the stopped container
docker rm d67fe77f1a84
$ docker rm d67fe77f1a84
d67fe77f1a84
```
**Example 2** This command pulls the `jupyter/r-notebook` image tagged `e5c5a7d3e52d` from Docker Hub if it is not already present on the local host. It then starts a container running a Jupyter Notebook server and exposes the server on host port 10000. The server logs appear in the terminal and include a URL to the notebook server, but with the internal container port (8888) instead of the the correct host port (10000).
```
docker run --rm -p 10000:8888 -v "$PWD":/home/jovyan/work jupyter/r-notebook:e5c5a7d3e52d
```bash
$ docker run --rm -p 10000:8888 -v "$PWD":/home/jovyan/work jupyter/r-notebook:e5c5a7d3e52d
Executing the command: jupyter notebook
[I 19:31:09.573 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
@@ -78,29 +78,29 @@ Pressing `Ctrl-C` shuts down the notebook server and immediately destroys the Do
**Example 3** This command pulls the `jupyter/all-spark-notebook` image currently tagged `latest` from Docker Hub if an image tagged `latest` is not already present on the local host. It then starts a container named `notebook` running a JupyterLab server and exposes the server on a randomly selected port.
```
```bash
docker run -d -P --name notebook jupyter/all-spark-notebook
```
The assigned port and notebook server token are visible using other Docker commands.
```
```bash
# get the random host port assigned to the container port 8888
docker port notebook 8888
$ docker port notebook 8888
0.0.0.0:32769
# get the notebook token from the logs
docker logs --tail 3 notebook
$ docker logs --tail 3 notebook
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://localhost:8888/?token=15914ca95f495075c0aa7d0e060f1a78b6d94f70ea373b00
```
Together, the URL to visit on the host machine to access the server in this case is http://localhost:32769?token=15914ca95f495075c0aa7d0e060f1a78b6d94f70ea373b00.
Together, the URL to visit on the host machine to access the server in this case is <http://localhost:32769?token=15914ca95f495075c0aa7d0e060f1a78b6d94f70ea373b00>.
The container runs in the background until stopped and/or removed by additional Docker commands.
```
```bash
# stop the container
docker stop notebook
notebook

View File

@@ -12,7 +12,7 @@ This page provides details about features specific to one or more images.
You can build a `pyspark-notebook` image (and also the downstream `all-spark-notebook` image) with a different version of Spark by overriding the default value of the following arguments at build time.
* Spark distribution is defined by the combination of the Spark and the Hadoop version and verified by the package checksum, see [Download Apache Spark](https://spark.apache.org/downloads.html) for more information. At this time the build will only work with the set of versions available on the Apache Spark download page, so it will not work with the archived versions.
* Spark distribution is defined by the combination of the Spark and the Hadoop version and verified by the package checksum, see [Download Apache Spark](https://spark.apache.org/downloads.html) and the [archive repo](https://archive.apache.org/dist/spark/) for more information.
* `spark_version`: The Spark version to install (`3.0.0`).
* `hadoop_version`: The Hadoop version (`3.2`).
* `spark_checksum`: The package checksum (`BFE4540...`).
@@ -52,7 +52,7 @@ The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support t
Spark **local mode** is useful for experimentation on small data when you do not have a Spark cluster available.
##### In Python
##### Local Mode in Python
In a Python notebook.
@@ -69,7 +69,7 @@ rdd.sum()
# 5050
```
##### In R
##### Local Mode in R
In a R notebook with [SparkR][sparkr].
@@ -107,7 +107,7 @@ sdf_len(sc, 100, repartition = 1) %>%
# 5050
```
##### In Scala
##### Local Mode in Scala
Spylon kernel instantiates a `SparkContext` for you in variable `sc` after you configure Spark
options in a `%%init_spark` magic cell.
@@ -136,11 +136,11 @@ Connection to Spark Cluster on **[Standalone Mode](https://spark.apache.org/docs
your Spark workers. (This is a [Spark networking
requirement](http://spark.apache.org/docs/latest/cluster-overview.html#components).)
* NOTE: When using `--net=host`, you must also use the flags `--pid=host -e
TINI_SUBREAPER=true`. See https://github.com/jupyter/docker-stacks/issues/64 for details.
TINI_SUBREAPER=true`. See <https://github.com/jupyter/docker-stacks/issues/64> for details.
**Note**: In the following examples we are using the Spark master URL `spark://master:7077` that shall be replaced by the URL of the Spark master.
##### In Python
##### Standalone Mode in Python
The **same Python version** need to be used on the notebook (where the driver is located) and on the Spark workers.
The python version used at driver and worker side can be adjusted by setting the environment variables `PYSPARK_PYTHON` and / or `PYSPARK_DRIVER_PYTHON`, see [Spark Configuration][spark-conf] for more information.
@@ -158,7 +158,7 @@ rdd.sum()
# 5050
```
##### In R
##### Standalone Mode in R
In a R notebook with [SparkR][sparkr].
@@ -195,7 +195,7 @@ sdf_len(sc, 100, repartition = 1) %>%
# 5050
```
##### In Scala
##### Standalone Mode in Scala
Spylon kernel instantiates a `SparkContext` for you in variable `sc` after you configure Spark
options in a `%%init_spark` magic cell.