Merge pull request #1562 from romainx/fix-1423

Turn off ipython low-level output capture and forward
Authored by Ayaz Salikhov on 2022-01-09 11:55:34 +03:00, committed by GitHub
3 changed files with 38 additions and 0 deletions

File: `docs/using/specifics.md`

@@ -12,6 +12,27 @@ This page provides details about features specific to one or more images.
Note that every new Spark context that is created is put onto an incrementing port (i.e. 4040, 4041, 4042, etc.), and it might be necessary to open multiple ports.
For example: `docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 jupyter/pyspark-notebook`.
#### IPython low-level output capture and forward
Spark images (`pyspark-notebook` and `all-spark-notebook`) have been configured to disable IPython low-level output capture and forward system-wide.
The rationale behind this choice is that Spark logs can be verbose, especially at startup when Ivy is used to load additional jars.
Those logs are still available but only in the container's logs.
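To illustrate the effect, here is a minimal sketch (not part of this change); `os.write` stands in for a library writing to the raw file descriptor, as Spark's JVM does:

```python
import os

# Bypasses sys.stdout and writes to file descriptor 1 directly, the way
# Spark's JVM logs do. With capture_fd_output disabled, this line goes to
# the container logs instead of the notebook output.
os.write(1, b"this ends up in the container logs\n")

# Regular Python prints go through sys.stdout, which the kernel still
# captures, so this appears in the notebook as usual.
print("this appears in the notebook output")
```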
If you want them to appear in the notebook, you can override this configuration in a user-level IPython kernel profile.
To do that, uncomment the following line in your `~/.ipython/profile_default/ipython_kernel_config.py` and restart the kernel.
```python
c.IPKernelApp.capture_fd_output = True
```
If you don't have an IPython profile yet, you can create a fresh one by running the following command.
```bash
ipython profile create
# [ProfileCreate] Generating default config file: '/home/jovyan/.ipython/profile_default/ipython_config.py'
# [ProfileCreate] Generating default config file: '/home/jovyan/.ipython/profile_default/ipython_kernel_config.py'
```
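Alternatively, instead of editing the file by hand, you can append the override; this is a minimal sketch, and note that the freshly generated profile already contains a commented-out version of this line:

```bash
echo 'c.IPKernelApp.capture_fd_output = True' >> ~/.ipython/profile_default/ipython_kernel_config.py
```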
### Build an Image with a Different Version of Spark
You can build a `pyspark-notebook` image (and also the downstream `all-spark-notebook` image) with a different version of Spark by overriding the default value of the following arguments at build time.
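For example, a hedged sketch of such a build, assuming the Dockerfile exposes build arguments named `spark_version` and `hadoop_version` (check the Dockerfile for the actual names and the full list):

```bash
docker build --rm --force-rm \
    -t my-pyspark-notebook ./pyspark-notebook \
    --build-arg spark_version=3.1.2 \
    --build-arg hadoop_version=3.2
```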

File: `pyspark-notebook/Dockerfile`

@@ -53,6 +53,10 @@ RUN cp -p "${SPARK_HOME}/conf/spark-defaults.conf.template" "${SPARK_HOME}/conf/
echo 'spark.driver.extraJavaOptions -Dio.netty.tryReflectionSetAccessible=true' >> "${SPARK_HOME}/conf/spark-defaults.conf" && \
echo 'spark.executor.extraJavaOptions -Dio.netty.tryReflectionSetAccessible=true' >> "${SPARK_HOME}/conf/spark-defaults.conf"
# Configure IPython system-wide
COPY ipython_kernel_config.py "/etc/ipython/"
RUN fix-permissions "/etc/ipython/"
USER ${NB_UID}
# Install pyarrow

File: `pyspark-notebook/ipython_kernel_config.py`

@@ -0,0 +1,13 @@
# Configuration file for ipython-kernel.
# See <https://ipython.readthedocs.io/en/stable/config/options/kernel.html>
# With ipykernel >= 6.0.0, all output to stdout/stderr is captured.
# This includes output from subprocesses and from compiled libraries like Spark.
# Those logs then show up both in the server logs and in the notebook outputs.
# Spark logs are particularly verbose, which is why we turn this capture off via the flag below.
# <https://github.com/jupyter/docker-stacks/issues/1423>
# Attempt to capture and forward low-level output, e.g. produced by Extension
# libraries.
# Default: True
c.IPKernelApp.capture_fd_output = False # noqa: F821
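As a quick sanity check (a hedged sketch, not part of this change), you can inspect the effective value of the trait from inside a running kernel:

```python
# Hedged sketch: read the running kernel's configuration.
from ipykernel.kernelapp import IPKernelApp

app = IPKernelApp.instance()  # the kernel application singleton
print(app.capture_fd_output)  # expected: False in the Spark images
```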