Added instructions for connecting to Spark in standalone mode.
@@ -82,6 +82,16 @@ To use Python 2 in the notebook and on the workers, change the `PYSPARK_PYTHON`
Of course, all of this can be hidden in an [IPython kernel startup script](http://ipython.org/ipython-doc/stable/development/config.html?highlight=startup#startup-files), but "explicit is better than implicit." :)
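
For example, a startup file along these lines would create the `SparkContext` before the first cell runs. This is only a minimal sketch: the file name, the Mesos master address, and the executor memory value are illustrative assumptions, not values taken from this image.

```python
# Hypothetical startup file, e.g. ~/.ipython/profile_default/startup/00-pyspark-setup.py
# It recreates the kind of explicit SparkContext setup described above, so every
# notebook opened with this kernel starts with `sc` already defined.
import pyspark

conf = pyspark.SparkConf()
conf.setMaster("mesos://10.10.10.10:5050")   # assumed master URL; adjust for your cluster
conf.set("spark.executor.memory", "1g")      # assumed setting; tune as needed

sc = pyspark.SparkContext(conf=conf)
```
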
## Connecting to a Spark Cluster in Standalone Mode
Connecting to a Spark cluster in standalone mode requires the following steps:

0. Verify that the Docker image (check the Dockerfile) and the Spark cluster being deployed run the same version of Spark.
1. [Deploy Spark on Standalone Mode](http://spark.apache.org/docs/latest/spark-standalone.html).
2. Run the Docker container with `--net=host` in a location that is network addressable by all of your Spark workers. (This is a [Spark networking requirement](http://spark.apache.org/docs/latest/cluster-overview.html#components).)
* NOTE: When using `--net=host`, you must also use the flags `--pid=host -e TINI_SUBREAPER=true`. See https://github.com/jupyter/docker-stacks/issues/64 for details.
3. The language-specific instructions are almost the same as those given above for Mesos; the only difference is that the master URL now looks something like `spark://10.10.10.10:7077` (see the sketch below).
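
As a rough illustration (the master address, memory setting, and test job below are assumptions, not values from this repository), the Python variant of those instructions would only change the master URL:

```python
import pyspark

conf = pyspark.SparkConf()
conf.setMaster("spark://10.10.10.10:7077")  # standalone master instead of mesos://...
conf.set("spark.executor.memory", "1g")     # assumed optional tuning

sc = pyspark.SparkContext(conf=conf)

# Tiny job to confirm the standalone workers are reachable.
rdd = sc.parallelize(range(1000))
print(rdd.sum())  # 499500
```
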
## Notebook Options
You can pass [Jupyter command line options](http://jupyter.readthedocs.org/en/latest/config.html#command-line-arguments) through the [`start-notebook.sh` command](https://github.com/jupyter/docker-stacks/blob/master/minimal-notebook/start-notebook.sh#L15) when launching the container. For example, to set the base URL of the notebook server, you might do the following: