docker-stacks/all-spark-notebook/test/data/local_pyspark.ipynb
Romain c83024c950 Add spark notebook tests and change examples
* Test added for all kernels
* Same examples as provided in the documentation (`specifics.md`)
* Used the same use case for all examples: sum of the first 100 whole numbers

Note: I have not automatically tested `local_sparklyr.ipynb`, because by default it creates the `metastore_db` directory and the `derby.log` file in the working directory. Since I mount that directory read-only (`RO`), the notebook fails. I'm still struggling to set those paths elsewhere...
2020-05-29 06:54:46 +02:00


{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
],
"source": [
"from pyspark.sql import SparkSession\n",
"\n",
"# Spark session & context\n",
"spark = SparkSession.builder.master('local').getOrCreate()\n",
"sc = spark.sparkContext\n",
"\n",
"# Sum of the first 100 whole numbers\n",
"rdd = sc.parallelize(range(100 + 1))\n",
"rdd.sum()\n",
"# 5050"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}