Outline, include some info from READMEs

2025-10-09 02:54:05 +00:00 · 2017-12-30 22:13:32 -05:00
parent 118198ac65
commit f70d52da58
6 changed files with 312 additions and 31 deletions
--- a/9
+++ b/9
@@ -1,8 +1,8 @@
 # Copyright (c) Jupyter Development Team.
 # Distributed under the terms of the Modified BSD License.
-.PHONY: help test
+.PHONY: docs help test

-# Use bash for inline if-statements in test target
+# Use bash for inline if-statements in arch_patch target
 SHELL:=bash
 OWNER:=jupyter
 ARCH:=$(shell uname -m)
@@ -54,9 +54,12 @@ dev/%: PORT?=8888
 dev/%: ## run a foreground container for a stack
 	docker run -it --rm -p $(PORT):8888 $(DARGS) $(OWNER)/$(notdir $@) $(ARGS)

-dev-env: # install libraries required to build docs and run tests
+dev-env: ## install libraries required to build docs and run tests
 	conda env create -f environment.yml

+docs: ## build HTML documentation
+	make -C docs html
+
 test/%: ## run tests against a stack
 	@TEST_IMAGE="$(OWNER)/$(notdir $@)" pytest test

--- a/docs/conf.py
+++ b/docs/conf.py
@@ -21,17 +21,23 @@
 # import sys
 # sys.path.insert(0, os.path.abspath('.'))

+# For conversion from markdown to html
+import recommonmark.parser
+from recommonmark.transform import AutoStructify
+

 # -- General configuration ------------------------------------------------

 # If your documentation needs a minimal Sphinx version, state it here.
 #
-# needs_sphinx = '1.0'
+needs_sphinx = '1.4'

 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = []
+extensions = [
+    'jupyter_alabaster_theme'
+]

 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -78,12 +84,20 @@ pygments_style = 'sphinx'
 todo_include_todos = False


+# -- Source -------------------------------------------------------------
+
+source_parsers = {
+    '.md': 'recommonmark.parser.CommonMarkParser',
+}
+
+source_suffix = ['.rst', '.md']
+
 # -- Options for HTML output ----------------------------------------------

 # The theme to use for HTML and HTML Help pages.  See the documentation for
 # a list of builtin themes.
 #
-html_theme = 'alabaster'
+html_theme = 'jupyter_alabaster_theme'

 # Theme options are theme-specific and customize the look and feel of a theme
 # further.  For a list of options available for each theme, see the
@@ -96,18 +110,6 @@ html_theme = 'alabaster'
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ['_static']

-# Custom sidebar templates, must be a dictionary that maps document names
-# to template names.
-#
-# This is required for the alabaster theme
-# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
-html_sidebars = {
-    '**': [
-        'relations.html',  # needs 'show_related': True theme option to display
-        'searchbox.html',
-    ]
-}
-

 # -- Options for HTMLHelp output ------------------------------------------

--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -0,0 +1,177 @@
+# Options and Configuration
+
+## Notebook Options
+
+The Docker container executes the [`start-notebook.sh` script](./start-notebook.sh) by default. The `start-notebook.sh` script handles the `NB_UID`, `NB_GID` and `GRANT_SUDO` features documented in the next section, and then executes `jupyter notebook`.
+
+You can pass [Jupyter command line options](https://jupyter.readthedocs.io/en/latest/projects/jupyter-command.html) through the `start-notebook.sh` script when launching the container. For example, to secure the Notebook server with a custom password hashed using `IPython.lib.passwd()` instead of the default token, run the following:
+
+```
+docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp.password='sha1:74ba40f8a388:c913541b7ee99d15d5ed31d4226bf7838f83a50e'
+```
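
The password value in the command above has three colon-separated fields. The following is a minimal sketch of how such a string can be produced and checked with the standard library alone. It assumes the layout is `algorithm:salt:hexdigest`, with the digest computed over the passphrase followed by the salt. The function names are hypothetical and are not part of the notebook API; for real deployments, generate the hash with `IPython.lib.passwd()` as described above.

```python
import hashlib
import secrets


def make_password_hash(passphrase, algorithm="sha1", salt_len=12):
    """Return 'algorithm:salt:hexdigest' (hypothetical helper, assumed layout)."""
    salt = secrets.token_hex(salt_len // 2)  # salt_len hexadecimal characters
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return ":".join((algorithm, salt, digest))


def check_password_hash(hashed, passphrase):
    """Recompute the digest with the stored salt and compare it to the stored one."""
    try:
        algorithm, salt, digest = hashed.split(":", 2)
        h = hashlib.new(algorithm, (passphrase + salt).encode("utf-8"))
    except ValueError:
        # Either the string does not have three fields or the algorithm is unknown.
        return False
    return secrets.compare_digest(h.hexdigest(), digest)
```

Because the salt is random, two hashes of the same passphrase differ, which is why the check has to re-read the salt from the stored string.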
+
+Similarly, to set the base URL of the notebook server, run the following:
+
+```
+docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp.base_url=/some/path
+```
+
+To disable all authentication mechanisms (which is not a recommended practice), run the following:
+
+```
+docker run -d -p 8888:8888 jupyter/base-notebook start-notebook.sh --NotebookApp.token=''
+```
+
+You can sidestep the `start-notebook.sh` script and run your own commands in the container. See the *Alternative Commands* section later in this document for more information.
+
+## Docker Options
+
+You may customize the execution of the Docker container and the command it is running with the following optional arguments.
+
+* `-e GEN_CERT=yes` - Generates a self-signed SSL certificate and configures Jupyter Notebook to use it to accept encrypted HTTPS connections.
+* `-e NB_UID=1000` - Specify the uid of the `jovyan` user. Useful to mount host volumes with specific file ownership. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su jovyan` after adjusting the user id.)
+* `-e NB_GID=100` - Specify the gid of the `jovyan` user. Useful to mount host volumes with specific file ownership. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su jovyan` after adjusting the group id.)
+* `-e GRANT_SUDO=yes` - Gives the `jovyan` user passwordless `sudo` capability. Useful for installing OS packages. For this option to take effect, you must run the container with `--user root`. (The `start-notebook.sh` script will `su jovyan` after adding `jovyan` to sudoers.) **You should only enable `sudo` if you trust the user or if the container is running on an isolated host.**
+* `-v /some/host/folder/for/work:/home/jovyan/work` - Mounts a host machine directory as a folder in the container. Useful when you want to preserve notebooks and other work even after the container is destroyed. **You must grant the within-container notebook user or group (`NB_UID` or `NB_GID`) write access to the host directory (e.g., `sudo chown 1000 /some/host/folder/for/work`).**
+* `--group-add users` - Use this argument if you launch the container with a specific user id (`-u 5000`) rather than launching it as root and relying on `NB_UID` and `NB_GID` to set the user and group.
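
The rule that `NB_UID`, `NB_GID`, and `GRANT_SUDO` only take effect with `--user root` is easy to forget when assembling a command by hand. The following is a minimal sketch of a helper that adds the flag automatically. The function `docker_run_args` and its parameters are hypothetical and exist only for illustration; the Docker flags themselves are the ones listed above.

```python
import shlex

# Environment options that only take effect when the container starts as root,
# according to the option list above.
ROOT_ONLY_ENV = {"NB_UID", "NB_GID", "GRANT_SUDO"}


def docker_run_args(image, env=None, volumes=None, port=8888):
    """Build an argument list for `docker run` (hypothetical helper)."""
    env = env or {}
    args = ["docker", "run", "-d", "-p", f"{port}:8888"]
    if ROOT_ONLY_ENV.intersection(env):
        args += ["--user", "root"]
    for name, value in sorted(env.items()):
        args += ["-e", f"{name}={value}"]
    for host_dir, container_dir in (volumes or {}).items():
        args += ["-v", f"{host_dir}:{container_dir}"]
    args.append(image)
    return args


cmd = docker_run_args(
    "jupyter/base-notebook",
    env={"NB_UID": 1100, "GRANT_SUDO": "yes"},
    volumes={"/some/host/folder/for/work": "/home/jovyan/work"},
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting problems.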
+
+## SSL Certificates
+
+You may mount SSL key and certificate files into a container and configure Jupyter Notebook to use them to accept HTTPS connections. For example, to mount a host folder containing a `notebook.key` and `notebook.crt`:
+
+```
+docker run -d -p 8888:8888 \
+    -v /some/host/folder:/etc/ssl/notebook \
+    jupyter/base-notebook start-notebook.sh \
+    --NotebookApp.keyfile=/etc/ssl/notebook/notebook.key \
+    --NotebookApp.certfile=/etc/ssl/notebook/notebook.crt
+```
+
+Alternatively, you may mount a single PEM file containing both the key and certificate. For example:
+
+```
+docker run -d -p 8888:8888 \
+    -v /some/host/folder/notebook.pem:/etc/ssl/notebook.pem \
+    jupyter/base-notebook start-notebook.sh \
+    --NotebookApp.certfile=/etc/ssl/notebook.pem
+```
+
+In either case, Jupyter Notebook expects the key and certificate to be a base64 encoded text file. The certificate file or PEM may contain one or more certificates (e.g., server, intermediate, and root).
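
Before mounting a single PEM file, it can help to confirm that it really contains both a key and a certificate. The following is a minimal sketch that only inspects the `-----BEGIN ...-----` marker lines of a PEM text. It does not validate the key or the certificate, and the function names are hypothetical.

```python
import re

# Matches the opening marker of a PEM block and captures its label,
# e.g. "CERTIFICATE" or "RSA PRIVATE KEY".
PEM_BLOCK = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_block_types(pem_text):
    """Return the labels of the PEM blocks found in the text, in order."""
    return PEM_BLOCK.findall(pem_text)


def has_key_and_certificate(pem_text):
    """Check that at least one private key block and one certificate block exist."""
    labels = pem_block_types(pem_text)
    has_key = any(label.endswith("PRIVATE KEY") for label in labels)
    has_cert = "CERTIFICATE" in labels
    return has_key and has_cert
```

A file with several `CERTIFICATE` blocks (server, intermediate, and root) passes the same check, which matches the note above about certificate chains.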
+
+For additional information about using SSL, see the following:
+
+* The [docker-stacks/examples](https://github.com/jupyter/docker-stacks/tree/master/examples) for information about how to use [Let's Encrypt](https://letsencrypt.org/) certificates when you run these stacks on a publicly visible domain.
+* The [jupyter_notebook_config.py](jupyter_notebook_config.py) file for how this Docker image generates a self-signed certificate.
+* The [Jupyter Notebook documentation](https://jupyter-notebook.readthedocs.io/en/latest/public_server.html#securing-a-notebook-server) for best practices about securing a public notebook server in general.
+
+## Conda Environments
+
+The default Python 3.x [Conda environment](http://conda.pydata.org/docs/using/envs.html) resides in `/opt/conda`. The `/opt/conda/bin` directory is part of the default `jovyan` user's `$PATH`. That directory is also whitelisted for use in `sudo` commands by the `start.sh` script.
+
+The `jovyan` user has full read/write access to the `/opt/conda` directory. You can use either `conda` or `pip` to install new packages without any additional permissions.
+
+```
+# install a package into the default (python 3.x) environment
+pip install some-package
+conda install some-package
+```
+
+## Alternative Commands
+
+### start.sh
+
+The `start.sh` script supports the same features as the default `start-notebook.sh` script (e.g., `GRANT_SUDO`), but allows you to specify an arbitrary command to execute. For example, to run the text-based `ipython` console in a container, do the following:
+
+```
+docker run -it --rm jupyter/base-notebook start.sh ipython
+```
+
+Or, to run JupyterLab instead of the classic notebook, run the following:
+
+```
+docker run -it --rm -p 8888:8888 jupyter/base-notebook start.sh jupyter lab
+```
+
+This script is particularly useful when you derive a new Dockerfile from this image and install additional Jupyter applications with subcommands like `jupyter console`, `jupyter kernelgateway`, etc.
+
+### Others
+
+You can bypass the provided scripts and specify an arbitrary start command. If you do, keep in mind that certain features documented above will not function (e.g., `GRANT_SUDO`).
+
+## Image Specifics
+
+## Spark and PySpark
+
+### Using Spark Local Mode
+
+This configuration is nice for using Spark on small, local data.
+
+0. Run the container as shown above.
+1. Open a Python 2 or 3 notebook.
+2. Create a `SparkContext` configured for local mode.
+
+For example, the first few cells in the notebook might read:
+
+```python
+import pyspark
+sc = pyspark.SparkContext('local[*]')
+
+# do something to prove it works
+rdd = sc.parallelize(range(1000))
+rdd.takeSample(False, 5)
+```
+
+### Connecting to a Spark Cluster on Mesos
+
+This configuration allows your compute cluster to scale with your data.
+
+0. [Deploy Spark on Mesos](http://spark.apache.org/docs/latest/running-on-mesos.html).
+1. Configure each slave with [the `--no-switch_user` flag](https://open.mesosphere.com/reference/mesos-slave/) or create the `jovyan` user on every slave node.
+2. Ensure Python 2.x and/or 3.x and any Python libraries you wish to use in your Spark lambda functions are installed on your Spark workers.
+3. Run the Docker container with `--net=host` in a location that is network addressable by all of your Spark workers. (This is a [Spark networking requirement](http://spark.apache.org/docs/latest/cluster-overview.html#components).)
+    * NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`. See https://github.com/jupyter/docker-stacks/issues/64 for details.
+4. Open a Python 2 or 3 notebook.
+5. Create a `SparkConf` instance in a new notebook pointing to your Mesos master node (or Zookeeper instance) and Spark binary package location.
+6. Create a `SparkContext` using this configuration.
+
+For example, the first few cells in a Python 3 notebook might read:
+
+```python
+import os
+# make sure pyspark tells workers to use python3 not 2 if both are installed
+os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3'
+
+import pyspark
+conf = pyspark.SparkConf()
+
+# point to mesos master or zookeeper entry (e.g., zk://10.10.10.10:2181/mesos)
+conf.setMaster("mesos://10.10.10.10:5050")
+# point to spark binary package in HDFS or on local filesystem on all slave
+# nodes (e.g., file:///opt/spark/spark-2.2.0-bin-hadoop2.7.tgz)
+conf.set("spark.executor.uri", "hdfs://10.122.193.209/spark/spark-2.2.0-bin-hadoop2.7.tgz")
+# set other options as desired
+conf.set("spark.executor.memory", "8g")
+conf.set("spark.core.connection.ack.wait.timeout", "1200")
+
+# create the context
+sc = pyspark.SparkContext(conf=conf)
+
+# do something to prove it works
+rdd = sc.parallelize(range(100000000))
+rdd.sumApprox(3)
+```
+
+To use Python 2 in the notebook and on the workers, change the `PYSPARK_PYTHON` environment variable to point to the location of the Python 2.x interpreter binary. If you leave this environment variable unset, it defaults to `python`.
+
+Of course, all of this can be hidden in an [IPython kernel startup script](http://ipython.org/ipython-doc/stable/development/config.html?highlight=startup#startup-files), but "explicit is better than implicit." :)
+
+### Connecting to a Spark Cluster in Standalone Mode
+
+Connecting to a Spark cluster in standalone mode requires the following steps:
+
+0. Verify that the Docker image (check the Dockerfile) and the Spark cluster being deployed run the same version of Spark.
+1. [Deploy Spark on Standalone Mode](http://spark.apache.org/docs/latest/spark-standalone.html).
+2. Run the Docker container with `--net=host` in a location that is network addressable by all of your Spark workers. (This is a [Spark networking requirement](http://spark.apache.org/docs/latest/cluster-overview.html#components).)
+    * NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`. See https://github.com/jupyter/docker-stacks/issues/64 for details.
+3. The language-specific instructions are almost the same as those described above for Mesos, except that the master URL is now something like `spark://10.10.10.10:7077`.
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -0,0 +1,9 @@
+# Contributing
+
+## Package Updates
+
+## New Packages
+
+## Tests
+
+## Community Stacks
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,20 +1,41 @@
-.. docker-stacks documentation master file, created by
-   sphinx-quickstart on Fri Dec 29 20:32:10 2017.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
+Jupyter Docker Stacks
+=====================

-Welcome to docker-stacks's documentation!
-=========================================
+Jupyter Docker Stacks are a set of ready-to-run Docker images containing Jupyter applications and interactive computing tools. You can use a stack image to start a personal Jupyter Notebook server in a local Docker container, to run JupyterLab servers for a team using JupyterHub, to write your own project Dockerfile, and so on.
+
+**Table of Contents**

 .. toctree::
-   :maxdepth: 2
-   :caption: Contents:
+   :maxdepth: 1

+   using
+   features
+   contributing

+Quick Start
+-----------

-Indices and tables
-==================
+The examples below may help you get started if you have Docker installed, know which Docker image you want to use, and want to launch a single Jupyter Notebook server in a container. The other pages in this documentation describe additional uses and features in detail. ::

-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
+    # Run a Jupyter Notebook server in a Docker container started
+    # from the jupyter/scipy-notebook image built from Git commit 27ba573.
+    # All files saved in the container are lost when the notebook server exits.
+    # -ti: pseudo-TTY+STDIN open, so the logs appear in the terminal
+    # --rm: remove the container on exit
+    # -p: publish the notebook port 8888 as port 8888 on the host
+    docker run -ti --rm -p 8888:8888 jupyter/scipy-notebook:27ba573
+
+    # Run a Jupyter Notebook server in a Docker container started from the
+    # jupyter/r-notebook image built from Git commit cf1a3aa.
+    # All files written to ~/work in the container are saved to the
+    # current working directory on the host and persist even when the
+    # notebook server exits.
+    docker run -ti --rm -p 8888:8888 -v "$PWD":/home/jovyan/work jupyter/r-notebook:cf1a3aa
+
+    # Run a Jupyter Notebook server in a background Docker container started
+    # from the latest jupyter/all-spark-notebook image available on the local
+    # machine or Docker Cloud. All files saved in the container are lost
+    # when the container is destroyed.
+    # -d: detach, run container in background.
+    # -P: Publish all exposed ports to random ports
+    docker run -d -P jupyter/all-spark-notebook:latest
--- a/docs/using.md
+++ b/docs/using.md
@@ -0,0 +1,69 @@
+# Users Guide
+
+Using one of the Jupyter Docker Stacks requires two choices:
+
+1. Which Docker image you wish to use
+2. How you wish to start Docker containers from that image
+
+This section provides details about the available images and runtimes to inform your choices.
+
+## Selecting an Image
+
+### Core Stacks
+
+The Jupyter team maintains a set of Docker image definitions in the https://github.com/jupyter/docker-stacks GitHub repository. The following table describes these images, and links to their source on GitHub and their builds on Docker Cloud.
+
+|Name                        |Description|GitHub |Image Tags|
+|----------------------------|-----------|-----------|----------|
+|jupyter/base-notebook       ||||
+|jupyter/minimal-notebook    ||||
+|jupyter/r-notebook          ||||
+|jupyter/scipy-notebook      ||||
+|jupyter/datascience-notebook||||
+|jupyter/tensorflow-notebook ||||
+|jupyter/pyspark-notebook    ||||
+|jupyter/all-spark-notebook  ||||
+
+#### Image Relationships
+
+The following diagram depicts the build dependencies between the core images (i.e., the `FROM` statements in their Dockerfiles). Any image lower in the tree inherits the contents of its ancestor images.
+
+[![Image inheritance diagram](internal/inherit-diagram.svg)](http://interactive.blockdiag.com/?compression=deflate&src=eJyFzTEPgjAQhuHdX9Gws5sQjGzujsaYKxzmQrlr2msMGv-71K0srO_3XGud9NNA8DSfgzESCFlBSdi0xkvQAKTNugw4QnL6GIU10hvX-Zh7Z24OLLq2SjaxpvP10lX35vCf6pOxELFmUbQiUz4oQhYzMc3gCrRt2cWe_FKosmSjyFHC6OS1AwdQWCtyj7sfh523_BI9hKlQ25YdOFdv5fcH0kiEMA)
+
+#### Versioning
+
+[Click here for a commented build history of each image, with references to tag/SHA values.](https://github.com/jupyter/docker-stacks/wiki/Docker-build-history)
+
+The following are quick-links to READMEs about each image and their Docker image tags on Docker Cloud:
+
+* base-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/base-notebook), [SHA list](https://hub.docker.com/r/jupyter/base-notebook/tags/)
+* minimal-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/minimal-notebook), [SHA list](https://hub.docker.com/r/jupyter/minimal-notebook/tags/)
+* scipy-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/scipy-notebook), [SHA list](https://hub.docker.com/r/jupyter/scipy-notebook/tags/)
+* r-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/r-notebook), [SHA list](https://hub.docker.com/r/jupyter/r-notebook/tags/)
+* tensorflow-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/tensorflow-notebook), [SHA list](https://hub.docker.com/r/jupyter/tensorflow-notebook/tags/)
+* datascience-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/datascience-notebook), [SHA list](https://hub.docker.com/r/jupyter/datascience-notebook/tags/)
+* pyspark-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/pyspark-notebook), [SHA list](https://hub.docker.com/r/jupyter/pyspark-notebook/tags/)
+* all-spark-notebook: [README](https://github.com/jupyter/docker-stacks/tree/master/all-spark-notebook), [SHA list](https://hub.docker.com/r/jupyter/all-spark-notebook/tags/)
+
+
+* The `latest` tag in each Docker Hub repository tracks the `master` branch `HEAD` reference on GitHub. This is a moving target and will make backward-incompatible changes regularly.
+* Any 12-character image tag on Docker Hub refers to a git commit SHA on GitHub. See the Docker build history wiki page for a table of build details.
+* Stack contents (e.g., new library versions) will be updated upon request via PRs against this project.
+* Users looking for reproducibility or stability should always refer to specific git SHA tagged images in their work, not `latest`.
+* For legacy reasons, there are two additional tags named `3.2` and `4.0` on Docker Hub which point to images prior to our versioning scheme switch.
+* If you haven't already, pin your image to a tag, e.g. `FROM jupyter/scipy-notebook:7c45ec67c8e7`. `latest` is a moving target which can change in backward-incompatible ways as packages and operating systems are updated.
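
The pinning advice above can be checked mechanically. The following is a minimal sketch that treats an image reference as pinned only when its tag is a 12-character hexadecimal string, as described above. The function names are hypothetical, and the parsing is deliberately simple: it does not handle digest references of the form `name@sha256:...`.

```python
import re

# A 12-character lowercase hexadecimal tag, i.e. an abbreviated git commit SHA.
SHA_TAG = re.compile(r"^[0-9a-f]{12}$")


def image_tag(reference):
    """Return the tag of an image reference, or 'latest' if none is given."""
    _name, sep, tag = reference.rpartition(":")
    # A colon followed by a path segment belongs to a registry port, not a tag.
    if not sep or "/" in tag:
        return "latest"
    return tag


def is_pinned(reference):
    """Return True if the reference uses a 12-character git SHA tag."""
    return bool(SHA_TAG.match(image_tag(reference)))
```

A check like this could run over the `FROM` lines of project Dockerfiles to flag references that still float on `latest` or on the legacy `3.2` and `4.0` tags.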
+
+## Community Stacks
+
+The Jupyter community maintains additional
+
+## Running a Container
+
+### Using the Docker Command Line
+
+### Using JupyterHub
+
+Every notebook stack is compatible with JupyterHub 0.5 or higher. When running with JupyterHub, you must override the Docker run command to point to the `start-singleuser.sh` script, which starts a single-user instance of the Notebook server. See each stack's README for instructions on running with JupyterHub.
+
+### Using Binder