Merge branch 'jupyter:master' into master

Authored by Darek on 2022-07-04 06:32:56 -04:00; committed by GitHub.
10 changed files with 10 additions and 106 deletions

View File

@@ -28,7 +28,7 @@ repos:
   # Autoformat: Python code
   - repo: https://github.com/psf/black
-    rev: 22.3.0
+    rev: 22.6.0
     hooks:
       - id: black
         args: [--target-version=py39]
@@ -52,7 +52,7 @@ repos:
   # Autoformat: YAML, JSON, Markdown, etc.
   - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: v2.6.2
+    rev: v2.7.1
     hooks:
       - id: prettier

View File

@@ -42,11 +42,3 @@ RUN arch=$(uname -m) && \
     mamba clean --all -f -y && \
     fix-permissions "${CONDA_DIR}" && \
     fix-permissions "/home/${NB_USER}"
-
-# Spylon-kernel
-RUN mamba install --quiet --yes 'spylon-kernel' && \
-    mamba clean --all -f -y && \
-    python -m spylon_kernel install --sys-prefix && \
-    rm -rf "/home/${NB_USER}/.local" && \
-    fix-permissions "${CONDA_DIR}" && \
-    fix-permissions "/home/${NB_USER}"

View File

@@ -1,4 +1,4 @@
-# Jupyter Notebook Python, Scala, R, Spark Stack
+# Jupyter Notebook Python, R, Spark Stack

 [![docker pulls](https://img.shields.io/docker/pulls/jupyter/all-spark-notebook.svg)](https://hub.docker.com/r/jupyter/all-spark-notebook/)
 [![docker stars](https://img.shields.io/docker/stars/jupyter/all-spark-notebook.svg)](https://hub.docker.com/r/jupyter/all-spark-notebook/)

View File

@@ -10,12 +10,13 @@ This page describes the options supported by the startup script and how to bypas
 You can pass [Jupyter server options](https://jupyter-server.readthedocs.io/en/latest/operators/public-server.html) to the `start-notebook.sh` script when launching the container.

-1. For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token,
+1. For example, to secure the Notebook server with a [custom password](https://jupyter-server.readthedocs.io/en/latest/operators/public-server.html#preparing-a-hashed-password)
+   hashed using `jupyter_server.auth.security.passwd()` instead of the default token,
    you can run the following (this hash was generated for `my-password` password):

    ```bash
    docker run -it --rm -p 8888:8888 jupyter/base-notebook \
-       start-notebook.sh --NotebookApp.password='sha1:7cca89c48283:e3c1f9fbc06d1d2aa59555dfd5beed925e30dd2c'
+       start-notebook.sh --NotebookApp.password='argon2:$argon2id$v=19$m=10240,t=10,p=8$JdAN3fe9J45NvK/EPuGCvA$O/tbxglbwRpOFuBNTYrymAEH6370Q2z+eS1eF4GM6Do'
    ```

 2. To set the [base URL](https://jupyter-server.readthedocs.io/en/latest/operators/public-server.html#running-the-notebook-with-a-customized-url-prefix) of the notebook server, you can run the following:
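
The replacement hash above can be regenerated in any Python session inside the image; a minimal sketch, using the same `jupyter_server.auth.security.passwd()` helper the updated text references:

```python
# Generate a hashed password for --NotebookApp.password.
# Recent jupyter_server versions produce an argon2 hash by default.
from jupyter_server.auth.security import passwd

print(passwd("my-password"))
# e.g. argon2:$argon2id$v=19$m=10240,t=10,p=8$...
```

The hash is salted, so each run prints a different value than the literal in this diff, but any of them works with `--NotebookApp.password`.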

View File

@@ -513,7 +513,7 @@ By adding the properties to `spark-defaults.conf`, the user no longer needs to e
 ```dockerfile
 FROM jupyter/pyspark-notebook:latest

-ARG DELTA_CORE_VERSION="1.2.0"
+ARG DELTA_CORE_VERSION="1.2.1"
 RUN pip install --quiet --no-cache-dir delta-spark==${DELTA_CORE_VERSION} && \
     fix-permissions "${HOME}" && \
     fix-permissions "${CONDA_DIR}"
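
A quick way to confirm the bumped `delta-spark` pin works in a derived image is a small read/write round trip; a minimal sketch, assuming the Delta Lake properties are already set in `spark-defaults.conf` as this recipe describes (the `/tmp/delta-test` path is illustrative):

```python
# Write and read back a tiny Delta table to smoke-test the installation.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-smoke-test").getOrCreate()
spark.range(5).write.format("delta").mode("overwrite").save("/tmp/delta-test")
spark.read.format("delta").load("/tmp/delta-test").show()
```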

View File

@@ -175,7 +175,7 @@ communities.
 [Dockerfile commit history](https://github.com/jupyter/docker-stacks/commits/master/all-spark-notebook/Dockerfile) |
 [Docker Hub image tags](https://hub.docker.com/r/jupyter/all-spark-notebook/tags/)

-`jupyter/all-spark-notebook` includes Python, R, and Scala support for Apache Spark.
+`jupyter/all-spark-notebook` includes Python and R support for Apache Spark.

 - Everything in `jupyter/pyspark-notebook` and its ancestor images
 - [IRKernel](https://irkernel.github.io/) to support R code in Jupyter notebooks
@@ -183,7 +183,6 @@ communities.
   [sparklyr](https://spark.rstudio.com),
   [ggplot2](https://ggplot2.tidyverse.org)
   packages
-- [spylon-kernel](https://github.com/vericast/spylon-kernel) to support Scala code in Jupyter notebooks

 ### Image Relationships

View File

@@ -76,7 +76,7 @@ docker run -it --rm jupyter/pyspark-notebook:spark-2.4.7 pyspark --version
 ### Usage Examples

-The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks.
+The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python and R notebooks.
 The following sections provide some examples of how to get started using them.

 #### Using Spark Local Mode

@@ -144,24 +144,6 @@ sdf_len(sc, 100, repartition = 1) %>%
 # 5050
 ```

-##### Local Mode in Scala
-
-Spylon kernel instantiates a `SparkContext` for you in variable `sc` after you configure Spark
-options in a `%%init_spark` magic cell.
-
-```python
-%%init_spark
-# Configure Spark to use a local master
-launcher.master = "local"
-```
-
-```scala
-// Sum of the first 100 whole numbers
-val rdd = sc.parallelize(0 to 100)
-rdd.sum()
-// 5050
-```
-
 #### Connecting to a Spark Cluster in Standalone Mode

 Connection to Spark Cluster on **[Standalone Mode](https://spark.apache.org/docs/latest/spark-standalone.html)** requires the following set of steps:

@@ -235,24 +217,6 @@ sdf_len(sc, 100, repartition = 1) %>%
 # 5050
 ```

-##### Standalone Mode in Scala
-
-Spylon kernel instantiates a `SparkContext` for you in variable `sc` after you configure Spark
-options in a `%%init_spark` magic cell.
-
-```python
-%%init_spark
-# Configure Spark to use a local master
-launcher.master = "spark://master:7077"
-```
-
-```scala
-// Sum of the first 100 whole numbers
-val rdd = sc.parallelize(0 to 100)
-rdd.sum()
-// 5050
-```
-
 ### Define Spark Dependencies

 ```{note}
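
With the Scala sections gone, the same sum-of-0-to-100 example still exists in the Python and R sections these hunks leave in place. A minimal PySpark local-mode sketch of that computation, assuming only the image's bundled `pyspark`:

```python
# Sum of the first 100 whole numbers in PySpark local mode.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local").getOrCreate()
rdd = spark.sparkContext.parallelize(range(101))
print(rdd.sum())  # 5050
```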

View File

@@ -1,51 +0,0 @@
-{
-  "cells": [
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "%%init_spark\n",
-        "# Spark session & context\n",
-        "launcher.master = \"local\"\n",
-        "launcher.conf.spark.executor.cores = 1"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
-      "metadata": {},
-      "outputs": [],
-      "source": [
-        "// Sum of the first 100 whole numbers\n",
-        "val rdd = sc.parallelize(0 to 100)\n",
-        "rdd.sum()\n",
-        "// 5050"
-      ]
-    }
-  ],
-  "metadata": {
-    "kernelspec": {
-      "display_name": "spylon-kernel",
-      "language": "scala",
-      "name": "spylon-kernel"
-    },
-    "language_info": {
-      "codemirror_mode": "text/x-scala",
-      "file_extension": ".scala",
-      "help_links": [
-        {
-          "text": "MetaKernel Magics",
-          "url": "https://metakernel.readthedocs.io/en/latest/source/README.html"
-        }
-      ],
-      "mimetype": "text/x-scala",
-      "name": "scala",
-      "pygments_lexer": "scala",
-      "version": "0.4.1"
-    }
-  },
-  "nbformat": 4,
-  "nbformat_minor": 4
-}

View File

@@ -15,7 +15,7 @@ THIS_DIR = Path(__file__).parent.resolve()
 @pytest.mark.parametrize(
     "test_file",
     # TODO: add local_sparklyr
-    ["local_pyspark", "local_spylon", "local_sparkR", "issue_1168"],
+    ["local_pyspark", "local_sparkR", "issue_1168"],
 )
 def test_nbconvert(container: TrackedContainer, test_file: str) -> None:
     """Check if Spark notebooks can be executed"""

View File

@@ -55,7 +55,6 @@ PACKAGE_MAPPING = {
     "pytables": "tables",
     "scikit-image": "skimage",
     "scikit-learn": "sklearn",
-    "spylon-kernel": "spylon_kernel",
     # R
     "randomforest": "randomForest",
     "rcurl": "RCurl",