Fix more errors

Ayaz Salikhov
2021-05-01 13:07:40 +03:00
parent bf5ec600d1
commit c2031e2a4a
18 changed files with 34 additions and 38 deletions


@@ -96,7 +96,7 @@ docker logs --tail 3 notebook
http://localhost:8888/?token=15914ca95f495075c0aa7d0e060f1a78b6d94f70ea373b00
```
Together, the URL to visit on the host machine to access the server in this case is http://localhost:32769?token=15914ca95f495075c0aa7d0e060f1a78b6d94f70ea373b00.
Together, the URL to visit on the host machine to access the server in this case is <http://localhost:32769?token=15914ca95f495075c0aa7d0e060f1a78b6d94f70ea373b00>.
The container runs in the background until stopped and/or removed by additional Docker commands.


@@ -52,7 +52,7 @@ The `jupyter/pyspark-notebook` and `jupyter/all-spark-notebook` images support t
Spark **local mode** is useful for experimentation on small data when you do not have a Spark cluster available.
##### In Python
##### Local Mode in Python
In a Python notebook.
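As a minimal local-mode sketch of the kind of cell the hunk below refers to (the session-builder options here are an assumption; the documented example ends with `rdd.sum()` returning 5050):

```python
from pyspark.sql import SparkSession

# Start a local Spark session and grab its context
spark = SparkSession.builder.master("local").getOrCreate()
sc = spark.sparkContext

# Sum of the first 100 whole numbers
rdd = sc.parallelize(range(100 + 1))
rdd.sum()
# 5050
```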
@@ -69,7 +69,7 @@ rdd.sum()
# 5050
```
##### In R
##### Local Mode in R
In an R notebook with [SparkR][sparkr].
@@ -107,7 +107,7 @@ sdf_len(sc, 100, repartition = 1) %>%
# 5050
```
##### In Scala
##### Local Mode in Scala
The Spylon kernel instantiates a `SparkContext` for you in the variable `sc` after you configure Spark
options in a `%%init_spark` magic cell.
@@ -136,11 +136,11 @@ Connection to Spark Cluster on **[Standalone Mode](https://spark.apache.org/docs
your Spark workers. (This is a [Spark networking
requirement](http://spark.apache.org/docs/latest/cluster-overview.html#components).)
* NOTE: When using `--net=host`, you must also use the flags `--pid=host -e
TINI_SUBREAPER=true`. See https://github.com/jupyter/docker-stacks/issues/64 for details.
TINI_SUBREAPER=true`. See <https://github.com/jupyter/docker-stacks/issues/64> for details.
**Note**: In the following examples, we use the Spark master URL `spark://master:7077`, which should be replaced by the URL of your Spark master.
##### In Python
##### Standalone Mode in Python
The **same Python version** needs to be used on the notebook (where the driver is located) and on the Spark workers.
The Python version used on the driver and worker sides can be adjusted by setting the environment variables `PYSPARK_PYTHON` and/or `PYSPARK_DRIVER_PYTHON`; see [Spark Configuration][spark-conf] for more information.
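A hedged sketch of what the driver side of such a connection can look like in a Python notebook (the master URL is the placeholder from the note above, and the builder options are an assumption):

```python
from pyspark.sql import SparkSession

# Connect this notebook (the driver) to the standalone cluster;
# replace spark://master:7077 with the URL of your Spark master.
spark = SparkSession.builder.master("spark://master:7077").getOrCreate()
sc = spark.sparkContext

# Same sanity check as in local mode: sum of the first 100 whole numbers
rdd = sc.parallelize(range(100 + 1))
rdd.sum()
# 5050
```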
@@ -158,7 +158,7 @@ rdd.sum()
# 5050
```
##### In R
##### Standalone Mode in R
In an R notebook with [SparkR][sparkr].
@@ -195,7 +195,7 @@ sdf_len(sc, 100, repartition = 1) %>%
# 5050
```
##### In Scala
##### Standalone Mode in Scala
The Spylon kernel instantiates a `SparkContext` for you in the variable `sc` after you configure Spark
options in a `%%init_spark` magic cell.
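As a sketch of such a cell (the `launcher` fields follow the spylon-kernel convention and are assumptions here; replace the master URL with your own):

```python
%%init_spark
# Configure the SparkContext that the Spylon kernel exposes as `sc`;
# spark://master:7077 is the placeholder master URL used throughout this page.
launcher.master = "spark://master:7077"
launcher.conf.spark.executor.cores = 1
```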