Spark installation improved by sourcing `spark-config.sh` in a `before-notebook.d` hook run by `start.sh`. This automatically adds the correct Py4J dependency version to `PYTHONPATH`, so this variable no longer needs to be set at build time.
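For illustration, here is a minimal Python sketch of what the hook effectively computes (the default `SPARK_HOME` value and the glob pattern are assumptions, not necessarily what `spark-config.sh` uses):

```python
import glob
import os

# Assumed default install location; spark-config.sh may use another path.
spark_home = os.environ.get("SPARK_HOME", "/usr/local/spark")

# Locate the Py4J archive shipped with the installed Spark, so the right
# version lands on PYTHONPATH regardless of which Spark version is installed.
py4j_archives = glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip"))
print(py4j_archives)  # e.g. ['/usr/local/spark/python/lib/py4j-0.10.7-src.zip']
```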
The documentation describing how to install a custom Spark version has been updated to remove this step, and now installs the latest `2.x` Spark version.
`test_pyspark` fixed (it previously always passed, regardless of the actual result).
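For reference, a minimal standalone sketch of the kind of check such a test should make (hypothetical code, not the actual test): it must fail when `pyspark` is not importable, instead of passing unconditionally.

```python
import subprocess
import sys

# Hypothetical standalone check: run a child interpreter and assert on its
# exit code. A test that never inspects the result would always pass,
# which was the original bug.
result = subprocess.run([sys.executable, "-c", "import pyspark"])
assert result.returncode == 0, "pyspark is not importable"
```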
Allow building the `pyspark-notebook` image with an alternative Spark version.
- Define build arguments for the Spark installation (see the sketch after this list)
- Add a note in "Image Specifics" explaining how to build an image with an alternative Spark version
- Remove the Toree documentation from "Image Specifics", since its support was dropped in #1115
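As a hedged illustration, building with an alternative Spark version could look like the following, driven from Python (the build-argument names and values here are assumptions, not necessarily the ones defined in the Dockerfile):

```python
import subprocess

# Hypothetical invocation: build pyspark-notebook with alternative Spark
# and Hadoop versions passed via --build-arg (argument names illustrative).
subprocess.run(
    [
        "docker", "build", "--rm",
        "-t", "my-pyspark-notebook",
        "--build-arg", "spark_version=2.4.6",
        "--build-arg", "hadoop_version=2.7",
        "./pyspark-notebook",
    ],
    check=True,
)
```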
* Tests added for all kernels
* Same examples as provided in the documentation (`specifics.md`)
* Used the same use case for all examples: the sum of the first 100 whole numbers (a PySpark version is sketched below)
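For instance, the PySpark variant of the shared use case looks roughly like this (a local-mode sketch; the documented notebooks may differ in details such as the app name):

```python
from pyspark.sql import SparkSession

# Local-mode session; the app name is illustrative.
spark = SparkSession.builder.appName("SumExample").master("local").getOrCreate()

# Sum the whole numbers 0..100; the expected result is 5050.
rdd = spark.sparkContext.parallelize(range(101))
print(rdd.sum())  # 5050

spark.stop()
```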
Note: I have not automatically tested `local_sparklyr.ipynb`, since by default it creates the `metastore_db` directory and the `derby.log` file in the working directory. Because I mount that directory `RO`, the test does not work. I'm struggling to configure it to write elsewhere...
Some changes to the Spark documentation, covering the local and standalone use cases for the various drivers:
* Simplify some of the examples (removing unnecessary options, etc.)
* Use the same code as much as possible across examples for consistency (only the R examples differ from the others)
* Add Sparklyr as an option for R
* Add notes about prerequisites (e.g., the same Python and R versions must be installed on the workers; see the standalone sketch below)
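For example, a standalone-cluster connection in Python might look like the following sketch (the master URL is a placeholder; as noted above, the workers must run the same Python version as the driver):

```python
from pyspark.sql import SparkSession

# Placeholder master URL; replace with the address of a real standalone master.
spark = (
    SparkSession.builder
    .appName("StandaloneSumExample")
    .master("spark://spark-master:7077")
    .getOrCreate()
)

# Same shared use case: sum of the whole numbers 0..100 -> 5050.
print(spark.sparkContext.parallelize(range(101)).sum())

spark.stop()
```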