mirror of
https://github.com/jupyter/docker-stacks.git
synced 2025-10-12 04:22:58 +00:00
Merge branch 'jupyter:master' into master
@@ -28,7 +28,7 @@ repos:
 
   # Autoformat: Python code
   - repo: https://github.com/psf/black
-    rev: 22.3.0
+    rev: 22.6.0
     hooks:
       - id: black
         args: [--target-version=py39]
@@ -52,7 +52,7 @@ repos:
 
   # Autoformat: YAML, JSON, Markdown, etc.
   - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: v2.6.2
+    rev: v2.7.1
     hooks:
       - id: prettier
 
@@ -42,11 +42,3 @@ RUN arch=$(uname -m) && \
     mamba clean --all -f -y && \
     fix-permissions "${CONDA_DIR}" && \
     fix-permissions "/home/${NB_USER}"
-
-# Spylon-kernel
-RUN mamba install --quiet --yes 'spylon-kernel' && \
-    mamba clean --all -f -y && \
-    python -m spylon_kernel install --sys-prefix && \
-    rm -rf "/home/${NB_USER}/.local" && \
-    fix-permissions "${CONDA_DIR}" && \
-    fix-permissions "/home/${NB_USER}"

@@ -1,4 +1,4 @@
-# Jupyter Notebook Python, Scala, R, Spark Stack
+# Jupyter Notebook Python, R, Spark Stack
 
 [](https://hub.docker.com/r/jupyter/all-spark-notebook/)
 [](https://hub.docker.com/r/jupyter/all-spark-notebook/)

@@ -10,12 +10,13 @@ This page describes the options supported by the startup script and how to bypas
 
 You can pass [Jupyter server options](https://jupyter-server.readthedocs.io/en/latest/operators/public-server.html) to the `start-notebook.sh` script when launching the container.
 
-1. For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token,
+1. For example, to secure the Notebook server with a [custom password](https://jupyter-server.readthedocs.io/en/latest/operators/public-server.html#preparing-a-hashed-password)
+   hashed using `jupyter_server.auth.security.passwd()` instead of the default token,
    you can run the following (this hash was generated for `my-password` password):
 
    ```bash
    docker run -it --rm -p 8888:8888 jupyter/base-notebook \
-       start-notebook.sh --NotebookApp.password='sha1:7cca89c48283:e3c1f9fbc06d1d2aa59555dfd5beed925e30dd2c'
+       start-notebook.sh --NotebookApp.password='argon2:$argon2id$v=19$m=10240,t=10,p=8$JdAN3fe9J45NvK/EPuGCvA$O/tbxglbwRpOFuBNTYrymAEH6370Q2z+eS1eF4GM6Do'
    ```
 
 2. To set the [base URL](https://jupyter-server.readthedocs.io/en/latest/operators/public-server.html#running-the-notebook-with-a-customized-url-prefix) of the notebook server, you can run the following:

@@ -513,7 +513,7 @@ By adding the properties to `spark-defaults.conf`, the user no longer needs to e
 ```dockerfile
 FROM jupyter/pyspark-notebook:latest
 
-ARG DELTA_CORE_VERSION="1.2.0"
+ARG DELTA_CORE_VERSION="1.2.1"
 RUN pip install --quiet --no-cache-dir delta-spark==${DELTA_CORE_VERSION} && \
     fix-permissions "${HOME}" && \
     fix-permissions "${CONDA_DIR}"

@@ -175,7 +175,7 @@ communities.
 [Dockerfile commit history](https://github.com/jupyter/docker-stacks/commits/master/all-spark-notebook/Dockerfile) |
 [Docker Hub image tags](https://hub.docker.com/r/jupyter/all-spark-notebook/tags/)
 
-`jupyter/all-spark-notebook` includes Python, R, and Scala support for Apache Spark.
+`jupyter/all-spark-notebook` includes Python and R support for Apache Spark.
 
 - Everything in `jupyter/pyspark-notebook` and its ancestor images
 - [IRKernel](https://irkernel.github.io/) to support R code in Jupyter notebooks
@@ -183,7 +183,6 @@ communities.
   [sparklyr](https://spark.rstudio.com),
   [ggplot2](https://ggplot2.tidyverse.org)
   packages
-- [spylon-kernel](https://github.com/vericast/spylon-kernel) to support Scala code in Jupyter notebooks
 
 ### Image Relationships
 

@@ -76,7 +76,7 @@ docker run -it --rm jupyter/pyspark-notebook:spark-2.4.7 pyspark --version
 
 ### Usage Examples
 
-The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python, R, and Scala notebooks.
+The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support the use of [Apache Spark](https://spark.apache.org/) in Python and R notebooks.
 The following sections provide some examples of how to get started using them.
 
 #### Using Spark Local Mode
@@ -144,24 +144,6 @@ sdf_len(sc, 100, repartition = 1) %>%
 # 5050
 ```
 
-##### Local Mode in Scala
-
-Spylon kernel instantiates a `SparkContext` for you in variable `sc` after you configure Spark
-options in a `%%init_spark` magic cell.
-
-```python
-%%init_spark
-# Configure Spark to use a local master
-launcher.master = "local"
-```
-
-```scala
-// Sum of the first 100 whole numbers
-val rdd = sc.parallelize(0 to 100)
-rdd.sum()
-// 5050
-```
-
 #### Connecting to a Spark Cluster in Standalone Mode
 
 Connection to Spark Cluster on **[Standalone Mode](https://spark.apache.org/docs/latest/spark-standalone.html)** requires the following set of steps:
@@ -235,24 +217,6 @@ sdf_len(sc, 100, repartition = 1) %>%
 # 5050
 ```
 
-##### Standalone Mode in Scala
-
-Spylon kernel instantiates a `SparkContext` for you in variable `sc` after you configure Spark
-options in a `%%init_spark` magic cell.
-
-```python
-%%init_spark
-# Configure Spark to use a local master
-launcher.master = "spark://master:7077"
-```
-
-```scala
-// Sum of the first 100 whole numbers
-val rdd = sc.parallelize(0 to 100)
-rdd.sum()
-// 5050
-```
-
 ### Define Spark Dependencies
 
 ```{note}

@@ -1,51 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "%%init_spark\n",
-    "# Spark session & context\n",
-    "launcher.master = \"local\"\n",
-    "launcher.conf.spark.executor.cores = 1"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "// Sum of the first 100 whole numbers\n",
-    "val rdd = sc.parallelize(0 to 100)\n",
-    "rdd.sum()\n",
-    "// 5050"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "spylon-kernel",
-   "language": "scala",
-   "name": "spylon-kernel"
-  },
-  "language_info": {
-   "codemirror_mode": "text/x-scala",
-   "file_extension": ".scala",
-   "help_links": [
-    {
-     "text": "MetaKernel Magics",
-     "url": "https://metakernel.readthedocs.io/en/latest/source/README.html"
-    }
-   ],
-   "mimetype": "text/x-scala",
-   "name": "scala",
-   "pygments_lexer": "scala",
-   "version": "0.4.1"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}

@@ -15,7 +15,7 @@ THIS_DIR = Path(__file__).parent.resolve()
 @pytest.mark.parametrize(
     "test_file",
     # TODO: add local_sparklyr
-    ["local_pyspark", "local_spylon", "local_sparkR", "issue_1168"],
+    ["local_pyspark", "local_sparkR", "issue_1168"],
 )
 def test_nbconvert(container: TrackedContainer, test_file: str) -> None:
     """Check if Spark notebooks can be executed"""

@@ -55,7 +55,6 @@ PACKAGE_MAPPING = {
     "pytables": "tables",
     "scikit-image": "skimage",
     "scikit-learn": "sklearn",
-    "spylon-kernel": "spylon_kernel",
     # R
     "randomforest": "randomForest",
     "rcurl": "RCurl",