From f2059c0d039f571480b2974c6d17e32b5800389a Mon Sep 17 00:00:00 2001
From: Goodness Chris-Ugari
Date: Tue, 11 Oct 2022 09:37:54 +0100
Subject: [PATCH] Update troubleshooting.md

This PR improves the troubleshooting doc and is part of [issue 41](https://github.com/jupyterhub/outreachy/issues/41)
---
 docs/source/troubleshooting.md | 171 +++++++++++++++++----------------
 1 file changed, 88 insertions(+), 83 deletions(-)

diff --git a/docs/source/troubleshooting.md b/docs/source/troubleshooting.md
index 0cecb6d9..b3ac424c 100644
--- a/docs/source/troubleshooting.md
+++ b/docs/source/troubleshooting.md
@@ -1,32 +1,37 @@
 # Troubleshooting

 When troubleshooting, you may see unexpected behaviors or receive an error
-message. This section provide links for identifying the cause of the
+message. This section provides links for identifying the cause of the
 problem and how to resolve it.

 [_Behavior_](#behavior)

-- JupyterHub proxy fails to start
-- sudospawner fails to run
-- What is the default behavior when none of the lists (admin, allowed,
-  allowed groups) are set?
-- JupyterHub Docker container not accessible at localhost
+- [JupyterHub proxy fails to start](#jupyterhub-proxy-fails-to-start)
+- [sudospawner fails to run](#sudospawner-fails-to-run)
+- [What is the default behavior when none of the lists (admin, allowed,
+  allowed groups) are set?](#what-is-the-default-behavior-when-none-of-the-lists-admin-allowed-allowed-groups-are-set)
+- [JupyterHub Docker container not accessible at localhost](#jupyterhub-docker-container-not-accessible-at-localhost)
+- [How can I kill ports from JupyterHub-managed services that have been orphaned?](#how-can-i-kill-ports-from-jupyterhub-managed-services-that-have-been-orphaned)
+- [Why am I getting a Spawn failed error message?](#why-am-i-getting-a-spawn-failed-error-message)
+- [How can I run JupyterHub with sudo but use my current env vars and virtualenv location?](#how-can-i-run-jupyterhub-with-sudo-but-use-my-current-env-vars-and-virtualenv-location)

 [_Errors_](#errors)

-- 500 error after spawning my single-user server
+- [Error 500 after spawning my single-user server](#error-500-after-spawning-my-single-user-server)
+- [Launching Jupyter Notebooks to run as an externally managed JupyterHub service with the `jupyterhub-singleuser` command returns a `JUPYTERHUB_API_TOKEN` error](#launching-jupyter-notebooks-to-run-as-an-externally-managed-jupyterhub-service-with-the-jupyterhub-singleuser-command-returns-a-jupyterhub-api-token-error)

 [_How do I...?_](#how-do-i)

-- Use a chained SSL certificate
-- Install JupyterHub without a network connection
-- I want access to the whole filesystem, but still default users to their home directory
-- How do I increase the number of pySpark executors on YARN?
-- How do I use JupyterLab's prerelease version with JupyterHub?
-- How do I set up JupyterHub for a workshop (when users are not known ahead of time)?
-- How do I set up rotating daily logs?
-- Toree integration with HDFS rack awareness script
-- Where do I find Docker images and Dockerfiles related to JupyterHub?
+- [Use a chained SSL certificate](#use-a-chained-ssl-certificate)
+- [Install JupyterHub without a network connection](#install-jupyterhub-without-a-network-connection)
+- [I want access to the whole filesystem and still default users to their home directory](#i-want-access-to-the-whole-filesystem-and-still-default-users-to-their-home-directory)
+- [How do I increase the number of pySpark executors on YARN?](#how-do-i-increase-the-number-of-pyspark-executors-on-yarn)
+- [How do I use JupyterLab's prerelease version with JupyterHub?](#how-do-i-use-jupyterlab-s-prerelease-version-with-jupyterhub)
+- [How do I set up JupyterHub for a workshop (when users are not known ahead of time)?](#how-do-i-set-up-jupyterhub-for-a-workshop-when-users-are-not-known-ahead-of-time)
+- [How do I set up rotating daily logs?](#how-do-i-set-up-rotating-daily-logs)
+- [Toree integration with HDFS rack awareness script](#toree-integration-with-hdfs-rack-awareness-script)
+- [Where do I find Docker images and Dockerfiles related to JupyterHub?](#where-do-i-find-docker-images-and-dockerfiles-related-to-jupyterhub)
+- [How can I view the logs for JupyterHub or the user's Notebook servers when using the DockerSpawner?](#how-can-i-view-the-logs-for-jupyterhub-or-the-user-s-notebook-servers-when-using-the-dockerspawner)

 [_Troubleshooting commands_](#troubleshooting-commands)

@@ -40,9 +45,9 @@ If you have tried to start the JupyterHub proxy and it fails to start:
   `c.JupyterHub.ip = '*'`; if it is, try `c.JupyterHub.ip = ''`
 - Try starting with `jupyterhub --ip=0.0.0.0`

-**Note**: If this occurs on Ubuntu/Debian, check that the you are using a
-recent version of node. Some versions of Ubuntu/Debian come with a version
-of node that is very old, and it is necessary to update node.
+**Note**: If this occurs on Ubuntu/Debian, check that you are using a
+recent version of [Node](https://nodejs.org). Some versions of Ubuntu/Debian come with a version
+of Node that is very old, and it is necessary to update Node.

 ### sudospawner fails to run

@@ -61,24 +66,24 @@ to the config file, `jupyterhub_config.py`.
 ### What is the default behavior when none of the lists (admin, allowed, allowed groups) are set?

 When nothing is given for these lists, there will be no admins, and all users
-who can authenticate on the system (i.e. all the unix users on the server with
+who can authenticate on the system (i.e. all the Unix users on the server with
 a password) will be allowed to start a server. The allowed username set lets you limit this to a particular set of users, and admin_users lets you specify who among them may use the admin interface (not necessary, unless you need to do
-things like inspect other users' servers, or modify the user list at runtime).
+things like inspect other users' servers or modify the user list at runtime).

 ### JupyterHub Docker container not accessible at localhost

 Even though the command to start your Docker container exposes port 8000 (`docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub`),
-it is possible that the IP address itself is not accessible/visible. As a result
+it is possible that the IP address itself is not accessible/visible. As a result,
 when you try http://localhost:8000 in your browser, you are unable to connect even though the container is running properly. One workaround is to explicitly tell Jupyterhub to start at `0.0.0.0` which is visible to everyone. Try this command: `docker run -p 8000:8000 -d --name jupyterhub jupyterhub/jupyterhub jupyterhub --ip 0.0.0.0 --port 8000`

-### How can I kill ports from JupyterHub managed services that have been orphaned?
+### How can I kill ports from JupyterHub-managed services that have been orphaned?

 I started JupyterHub + nbgrader on the same host without containers. When I try to restart JupyterHub + nbgrader with this configuration, errors appear that the service accounts cannot start because the ports are being used.

@@ -92,7 +97,7 @@ Where `` is the port used by the nbgrader course service. This con

 ### Why am I getting a Spawn failed error message?

-After successfully logging in to JupyterHub with a compatible authenticators, I get a 'Spawn failed' error message in the browser. The JupyterHub logs have `jupyterhub KeyError: "getpwnam(): name not found: `.
+After successfully logging in to JupyterHub with a compatible authenticator, I get a 'Spawn failed' error message in the browser. The JupyterHub logs have `jupyterhub KeyError: "getpwnam(): name not found: `.

 This issue occurs when the authenticator requires a local system user to exist. In these cases, you need to use a spawner that does not require an existing system user account, such as `DockerSpawner` or `KubeSpawner`.

@@ -109,23 +114,9 @@ sudo MY_ENV=abc123 \
   /srv/jupyterhub/jupyterhub
 ```

-### How can I view the logs for JupyterHub or the user's Notebook servers when using the DockerSpawner?
-
-Use `docker logs ` where `` is the container name defined within `docker-compose.yml`. For example, to view the logs of the JupyterHub container use:
-
-    docker logs hub
-
-By default, the user's notebook server is named `jupyter-` where `username` is the user's username within JupyterHub's db. So if you wanted to see the logs for user `foo` you would use:
-
-    docker logs jupyter-foo
-
-You can also tail logs to view them in real time using the `-f` option:
-
-    docker logs -f hub
-
 ## Errors

-### 500 error after spawning my single-user server
+### Error 500 after spawning my single-user server

 You receive a 500 error when accessing the URL `/user//...`.
 This is often seen when your single-user server cannot verify your user cookie
@@ -185,10 +176,10 @@ If you receive a 403 error, the API token for the single-user server is likely
 invalid. Commonly, the 403 error is caused by resetting the JupyterHub database (either removing jupyterhub.sqlite or some other action) while leaving single-user servers running. This happens most frequently when using
-DockerSpawner, because Docker's default behavior is to stop/start containers
-which resets the JupyterHub database, rather than destroying and recreating
+DockerSpawner because Docker's default behavior is to stop/start containers
+that reset the JupyterHub database, rather than destroying and recreating
 the container every time. This means that the same API token is used by the
-server for its whole life, until the container is rebuilt.
+server for its whole life until the container is rebuilt.

 The fix for this Docker case is to remove any Docker containers seeing this issue (typically all containers created before a certain point in time):
@@ -201,14 +192,14 @@ your server again.

 ##### Proxy settings (403 GET)

-When your whole JupyterHub sits behind a organization proxy (_not_ a reverse proxy like NGINX as part of your setup and _not_ the configurable-http-proxy) the environment variables `HTTP_PROXY`, `HTTPS_PROXY`, `http_proxy` and `https_proxy` might be set. This confuses the jupyterhub-singleuser servers: When connecting to the Hub for authorization they connect via the proxy instead of directly connecting to the Hub on localhost. The proxy might deny the request (403 GET). This results in the singleuser server thinking it has a wrong auth token. To circumvent this you should add `,,localhost,127.0.0.1` to the environment variables `NO_PROXY` and `no_proxy`.
+When your whole JupyterHub sits behind an organization proxy (_not_ a reverse proxy like NGINX as part of your setup and _not_ the configurable-http-proxy) the environment variables `HTTP_PROXY`, `HTTPS_PROXY`, `http_proxy`, and `https_proxy` might be set. This confuses the JupyterHub single-user servers: When connecting to the Hub for authorization they connect via the proxy instead of directly connecting to the Hub on localhost. The proxy might deny the request (403 GET). This results in the single-user server thinking it has the wrong auth token. To circumvent this you should add `,,localhost,127.0.0.1` to the environment variables `NO_PROXY` and `no_proxy`.

 ### Launching Jupyter Notebooks to run as an externally managed JupyterHub service with the `jupyterhub-singleuser` command returns a `JUPYTERHUB_API_TOKEN` error

 [JupyterHub services](https://jupyterhub.readthedocs.io/en/stable/reference/services.html) allow processes to interact with JupyterHub's REST API. Example use-cases include:

 - **Secure Testing**: provide a canonical Jupyter Notebook for testing production data to reduce the number of entry points into production systems.
-- **Grading Assignments**: provide access to shared Jupyter Notebooks that may be used for management tasks such grading assignments.
+- **Grading Assignments**: provide access to shared Jupyter Notebooks that may be used for management tasks such as grading assignments.
 - **Private Dashboards**: share dashboards with certain group members.

 If possible, try to run the Jupyter Notebook as an externally managed service with one of the provided [jupyter/docker-stacks](https://github.com/jupyter/docker-stacks).
@@ -222,7 +213,7 @@ If you launch a Jupyter Notebook with the `jupyterhub-singleuser` command direct
 Did you launch it manually?
 ```

-If you plan on testing `jupyterhub-singleuser` independently from JupyterHub, then you can set the api token environment variable.
For example, if were to run the single-user Jupyter Notebook on the host, then:
+If you plan on testing `jupyterhub-singleuser` independently from JupyterHub, then you can set the API token environment variable. For example, if you were to run the single-user Jupyter Notebook on the host, then:

     export JUPYTERHUB_API_TOKEN=my_secret_token
     jupyterhub-singleuser
@@ -256,7 +247,7 @@ You would then set in your `jupyterhub_config.py` file the `ssl_key` and
 #### Example

 Your certificate provider gives you the following files: `example_host.crt`,
-`Entrust_L1Kroot.txt` and `Entrust_Root.txt`.
+`Entrust_L1Kroot.txt`, and `Entrust_Root.txt`.

 Concatenate the files appending the chain cert and root cert to your host cert:

@@ -289,7 +280,7 @@ with npmbox:
     python3 -m pip wheel jupyterhub
     npmbox configurable-http-proxy

-### I want access to the whole filesystem, but still default users to their home directory
+### I want access to the whole filesystem and still default users to their home directory

 Setting the following in `jupyterhub_config.py` will configure access to the entire filesystem and set the default to the user's home directory.
@@ -321,7 +312,7 @@ For instance:
     python3 -m pip install jupyterlab
     jupyter serverextension enable --py jupyterlab --sys-prefix

-The important thing is that jupyterlab is installed and enabled in the
+The important thing is that JupyterLab is installed and enabled in the
 single-user notebook server environment. For system users, this means system-wide, as indicated above. For Docker containers, it means inside the single-user docker image, etc.
@@ -334,14 +325,14 @@ notebook servers to default to JupyterLab:
 ### How do I set up JupyterHub for a workshop (when users are not known ahead of time)?

 1. Set up JupyterHub using OAuthenticator for GitHub authentication
-2. Configure admin list to have workshop leaders be listed with administrator privileges.
+2. Configure the admin list to have workshop leaders be listed with administrator privileges.

-Users will need a GitHub account to login and be authenticated by the Hub.
+Users will need a GitHub account to log in and be authenticated by the Hub.

 ### How do I set up rotating daily logs?

 You can do this with [logrotate](https://linux.die.net/man/8/logrotate),
-or pipe to `logger` to use syslog instead of directly to a file.
+or pipe to `logger` to use Syslog instead of directly to a file.

 For example, with this logrotate config file:

@@ -361,6 +352,52 @@ logrotate /path/to/above-config
 Or use syslog:

     jupyterhub | logger -t jupyterhub
+
+### Toree integration with HDFS rack awareness script
+
+The Apache Toree kernel will have an issue when running with JupyterHub if the standard HDFS
+rack awareness script is used. This will materialize in the logs as a repeated WARN:
+
+```bash
+16/11/29 16:24:20 WARN ScriptBasedMapping: Exception running /etc/hadoop/conf/topology_script.py some.ip.address
+ExitCodeException exitCode=1: File "/etc/hadoop/conf/topology_script.py", line 63
+ print rack
+ ^
+SyntaxError: Missing parentheses in call to 'print'
+
+ at `org.apache.hadoop.util.Shell.runCommand(Shell.java:576)`
+```
+
+In order to resolve this issue, there are two potential options.
+
+1. Update HDFS core-site.xml, so the parameter "net.topology.script.file.name" points to a custom
+   script (e.g. /etc/hadoop/conf/custom_topology_script.py). Copy the original script and change the first line to point
+   to a Python 2 installation (e.g. /usr/bin/python).
+2. In spark-env.sh add a Python 2 installation to your path (e.g. export PATH=/opt/anaconda2/bin:$PATH).
+
+### Where do I find Docker images and Dockerfiles related to JupyterHub?
+
+Docker images can be found at the [JupyterHub organization on DockerHub](https://hub.docker.com/u/jupyterhub/).
+The Docker image [jupyterhub/singleuser](https://hub.docker.com/r/jupyterhub/singleuser/)
+provides an example single-user notebook server for use with DockerSpawner.
+
+Additional single-user notebook server images can be found at the [Jupyter
+organization on DockerHub](https://hub.docker.com/r/jupyter/) and information
+about each image at the [jupyter/docker-stacks repo](https://github.com/jupyter/docker-stacks).
+
+### How can I view the logs for JupyterHub or the user's Notebook servers when using the DockerSpawner?
+
+Use `docker logs ` where `` is the container name defined within `docker-compose.yml`. For example, to view the logs of the JupyterHub container use:
+
+    docker logs hub
+
+By default, the user's notebook server is named `jupyter-` where `username` is the user's username within JupyterHub's db. So if you wanted to see the logs for user `foo` you would use:
+
+    docker logs jupyter-foo
+
+You can also tail logs to view them in real time using the `-f` option:
+
+    docker logs -f hub

 ## Troubleshooting commands

@@ -385,35 +422,3 @@ jupyter kernelspec list
 ```bash
 jupyterhub --debug
 ```
-
-### Toree integration with HDFS rack awareness script
-
-The Apache Toree kernel will an issue, when running with JupyterHub, if the standard HDFS
-rack awareness script is used. This will materialize in the logs as a repeated WARN:
-
-```bash
-16/11/29 16:24:20 WARN ScriptBasedMapping: Exception running /etc/hadoop/conf/topology_script.py some.ip.address
-ExitCodeException exitCode=1: File "/etc/hadoop/conf/topology_script.py", line 63
- print rack
- ^
-SyntaxError: Missing parentheses in call to 'print'
-
- at `org.apache.hadoop.util.Shell.runCommand(Shell.java:576)`
-```
-
-In order to resolve this issue, there are two potential options.
-
-1. Update HDFS core-site.xml, so the parameter "net.topology.script.file.name" points to a custom
-   script (e.g. /etc/hadoop/conf/custom_topology_script.py). Copy the original script and change the first line point
-   to a python two installation (e.g. /usr/bin/python).
-2. In spark-env.sh add a Python 2 installation to your path (e.g. export PATH=/opt/anaconda2/bin:$PATH).
-
-### Where do I find Docker images and Dockerfiles related to JupyterHub?
-
-Docker images can be found at the [JupyterHub organization on DockerHub](https://hub.docker.com/u/jupyterhub/).
-The Docker image [jupyterhub/singleuser](https://hub.docker.com/r/jupyterhub/singleuser/)
-provides an example single user notebook server for use with DockerSpawner.
-
-Additional single user notebook server images can be found at the [Jupyter
-organization on DockerHub](https://hub.docker.com/r/jupyter/) and information
-about each image at the [jupyter/docker-stacks repo](https://github.com/jupyter/docker-stacks).