mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-15 14:03:02 +00:00
what-is-jupyterhub: Full revision
@@ -1,16 +1,22 @@
|
||||
# What is Jupyter and JupyterHub?
|
||||
|
||||
JupyterHub is not what you think it is. Most things you think are
|
||||
part of JupyterHub are actually handled by some other component, and
|
||||
it's not always obvious how the parts relate. This document was
|
||||
originally written to assist in debugging: very often, the actual
|
||||
problem is not where one thinks it is and thus people can't easily
|
||||
debug. In order to tell this story, we start at JupyterHub and go all
|
||||
the way down to the fundamental components of Jupyter.
|
||||
part of JupyterHub are actually handled by some other component, for
|
||||
example the spawner or notebook server itself, and it's not always
|
||||
obvious how the parts relate. The knowledge contained here hasn't
|
||||
been assembled in one place before, and is essential to understand
|
||||
when setting up a sufficiently complex Jupyter(Hub) setup.
|
||||
|
||||
We occasionally leave things out or bend the truth where it helps in
|
||||
explanation, and give our explanations in terms of Python even though
|
||||
many other languages can be used instead.
|
||||
This document was originally written to assist in debugging: very
|
||||
often, the actual problem is not where one thinks it is and thus
|
||||
people can't easily debug. In order to tell this story, we start at
|
||||
JupyterHub and go all the way down to the fundamental components of
|
||||
Jupyter.
|
||||
|
||||
In this document, we occasionally leave things out or bend the truth
|
||||
where it helps in explanation, and give our explanations in terms of
|
||||
Python even though Jupyter itself is language-neutral. The "(&)"
|
||||
symbol highlights important points where there is more.
|
||||
|
||||
This guide is long, but after reading it you will know of all major
|
||||
components in the Jupyter ecosystem and everything else you read
|
||||
@@ -20,15 +26,15 @@ should make sense.
|
||||
|
||||
## Just what is Jupyter?
|
||||
|
||||
Before we get too far, let's remember what our end goal is. A Jupyter
|
||||
Notebook is really nothing more than a Python process (or some
|
||||
language) which is getting commands from a web browser and displaying
|
||||
the output via a browser. What the process actually sees can roughly
|
||||
be considered getting data on standard input and writing to standard
|
||||
output (*). There is nothing intrinsically special about this process
|
||||
Before we get too far, let's remember what our end goal is. A
|
||||
**Jupyter Notebook** is really nothing more than a Python(&) process
|
||||
which is getting commands from a web browser and displaying the output
|
||||
via that browser. What the process actually sees is roughly like
|
||||
getting commands on standard input(&) and writing to standard
|
||||
output(&). There is nothing intrinsically special about this process
|
||||
- it can do anything a normal Python process can do, and nothing more.
|
||||
The kernel handles capturing output and converting things like
|
||||
graphics to a form usable by the browser.
|
||||
The **Jupyter kernel** handles capturing output and converting things
|
||||
such as graphics to a form usable by the browser.
|
||||
|
||||
Everything we explain below is building up to this, going through many
|
||||
different layers which give you many ways of customizing how this
|
||||
@@ -39,36 +45,42 @@ process runs. But this process is not *too* special.
|
||||
## JupyterHub
|
||||
|
||||
**JupyterHub** is the central piece that provides multi-user
|
||||
login. Despite this, the end user only briefly interacts with it and
|
||||
most of the actual Jupyter session does not relate to the hub at all.
|
||||
In short, anything which is related to *starting* the user's workspace
|
||||
is about JupyterHub, anything about *running* usually isn't.
|
||||
login. Despite this, the end user only briefly interacts with
|
||||
JupyterHub and most of the actual Jupyter session does not relate to
|
||||
the hub at all: the hub mainly handles authentication and spawning the
|
||||
single-user server. In short, anything which is related to *starting*
|
||||
the user's workspace/environment is about JupyterHub, anything about
|
||||
*running* usually isn't.
|
||||
|
||||
If you have problems with authentication, spawning, or the
|
||||
proxy (explained below), the issue is usually with JupyterHub. To
|
||||
debug, JupyterHub has extensive logs which get printed to its console
|
||||
and can be used to discover most problems.
|
||||
|
||||
JupyterHub consists of the main pieces below:
|
||||
The main pieces of JupyterHub are:
|
||||
|
||||
### Authenticators
|
||||
### Authenticator
|
||||
|
||||
JupyterHub itself doesn't actually (necessarily) manage your users.
|
||||
It has a database of users, but it is usually connected with some
|
||||
other system that manages the usernames and passwords. When someone
|
||||
tries to log in to JupyterHub, it just asks the **authenticator** if
|
||||
the username/password is valid. The authenticator can also return
|
||||
user groups and admin status of users, so that JupyterHub can roughly
|
||||
manage users to services.
|
||||
JupyterHub itself doesn't actually manage your users(&). It has a
|
||||
database of users, but it is usually connected with some other system
|
||||
that manages the usernames and passwords. When someone tries to log
|
||||
in to JupyterHub, it just asks the
|
||||
**authenticator**([basics](authenticators-users-basics.html),
|
||||
[reference](../reference/authenticators.html)) if the
|
||||
username/password is valid(&). The authenticator can also return user
|
||||
groups and admin status of users, so that JupyterHub can do some
|
||||
higher-level management. The authenticator returns a username(&),
|
||||
which is passed on to the spawner, which has to use it to start that
|
||||
user's environment.
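
As a rough illustration of that flow, the sketch below mimics the shape of an authenticator: it receives the submitted login data and returns a username on success or `None` on failure. The class `ToyAuthenticator`, the helper `check_login`, and the in-memory password table are hypothetical and do not subclass JupyterHub's real `Authenticator`; a real authenticator would usually ask an external service instead.

```python
import asyncio
import hmac


class ToyAuthenticator:
    """Hypothetical stand-in; NOT JupyterHub's Authenticator base class."""

    def __init__(self, passwords):
        # Assumption for illustration: credentials live in a plain dict.
        self._passwords = dict(passwords)

    async def authenticate(self, handler, data):
        """Return the username if the credentials are valid, else None."""
        username = data.get("username", "")
        password = data.get("password", "")
        expected = self._passwords.get(username)
        if expected is None:
            return None
        # Constant-time comparison avoids leaking how much of the password matched.
        if hmac.compare_digest(password.encode(), expected.encode()):
            return username
        return None


def check_login(authenticator, username, password):
    """Synchronous helper: run one login attempt to completion."""
    data = {"username": username, "password": password}
    return asyncio.run(authenticator.authenticate(None, data))
```

The returned username is the only thing the rest of the system needs: it is what gets handed to the spawner.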
|
||||
|
||||
The following authenticators are included with JupyterHub:
|
||||
|
||||
- **PAMAuthenticator** uses the standard Unix/Linux operating system
|
||||
functions to check users. Roughly, if someone already has access to
|
||||
the machine (they can log in by ssh or otherwise), they will be able
|
||||
to log in to JupyterHub automatically. Thus, JupyterHub fills the
|
||||
role of an ssh server, but provides a web-browser based way to
|
||||
access the machine.
|
||||
the machine (they can log in by ssh), they will be able to log in to
|
||||
JupyterHub without any other setup. Thus, JupyterHub fills the role
|
||||
of an ssh server, but provides a web-browser based way to access the
|
||||
machine.
|
||||
|
||||
|
||||
But those are fairly limited, and thus there are [plenty of others to
|
||||
@@ -77,17 +89,16 @@ from](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
|
||||
You can connect to almost any other existing service to manage your
|
||||
users. You either use all users from this other service (e.g. your
|
||||
company), or whitelist only the allowed users (e.g. your group's
|
||||
GitHub users). Some other popular authenticators include:
|
||||
GitHub usernames). Some other popular authenticators include:
|
||||
|
||||
- **OAuthenticator** uses the standard OAuth protocol to verify users.
|
||||
For example, you can easily use GitHub to authenticate your users -
|
||||
people have a "click to login with GitHub" button. This is often
|
||||
done with a whitelist to only allow certain users.
|
||||
|
||||
- **NativeAuthenticator** actually stores its own usernames and
|
||||
passwords, unlike most other authenticators. Thus, you can manage
|
||||
all your users within JupyterHub only. (include one more example
|
||||
here)
|
||||
- **NativeAuthenticator** actually stores and validates its own
|
||||
usernames and passwords, unlike most other authenticators. Thus,
|
||||
you can manage all your users within JupyterHub only.
|
||||
|
||||
- There are authenticators for LTI (learning management systems),
|
||||
Shibboleth, Kerberos - and so on.
|
||||
@@ -100,15 +111,17 @@ The authenticator runs internally to the Hub process but communicates
|
||||
with outside services.
|
||||
|
||||
If you have trouble logging in, this is usually a problem of the
|
||||
authenticator. The authenticator debug information goes to the
|
||||
JupyterHub logs, but there may also be hints in whatever external
|
||||
authenticator. The authenticator logs are part of the JupyterHub
|
||||
logs, but there may also be relevant information in whatever external
|
||||
services you are using.
|
||||
|
||||
### Spawners
|
||||
### Spawner
|
||||
|
||||
The **spawner** is the real core of JupyterHub: when someone wants a
|
||||
notebook server, it finds resources and starts the server. It could
|
||||
run on the current server, on another server, on some cloud service,
|
||||
The **spawner** ([basics](spawners-basics.html),
|
||||
[reference](../reference/spawners.html)) is the real core of
|
||||
JupyterHub: when someone wants a notebook server, it allocates
|
||||
resources and starts the server. The notebook server could run on the
|
||||
same server as JupyterHub, on another server, on some cloud service,
|
||||
or even more. They can limit resources (CPU, memory) or isolate users
|
||||
from each other - if the spawner supports it. They can also do no
|
||||
limiting and allow any user to access any other user's files if they
|
||||
@@ -116,35 +129,41 @@ are not configured properly.
|
||||
|
||||
Some basic spawners included in JupyterHub are:
|
||||
|
||||
**LocalProcessSpawner** is built in to JupyterHub and basically
|
||||
tries to switch user to the given username and start Jupyter. It
|
||||
requires that the hub be run as root (because only root has permission
|
||||
to start processes as other user IDs). LocalProcessSpawner is no
|
||||
different than a user logging in with something like `ssh` and running
|
||||
jobs. PAMAuthenticator and LocalProcessSpawner is the most basic way
|
||||
of using JupyterHub (and what it does out of the box) and makes the
|
||||
hub not too dissimilar to an advanced ssh server.
|
||||
- **LocalProcessSpawner** is built into JupyterHub and basically tries
|
||||
to switch user to the given username (`su` (&)) and start the
|
||||
notebook server. It requires that the hub be run as root (because
|
||||
only root has permission to start processes as other user IDs).
|
||||
LocalProcessSpawner is no different than a user logging in with
|
||||
something like `ssh` and running something. PAMAuthenticator and
|
||||
LocalProcessSpawner is the most basic way of using JupyterHub (and
|
||||
what it does out of the box) and makes the hub not too dissimilar to
|
||||
an advanced ssh server.
|
||||
|
||||
There are many more advanced fancy spawners:
|
||||
There are many more advanced spawners:
|
||||
|
||||
- **SudoSpawner** is like LocalProcessSpawner but lets you run
|
||||
JupyterHub without root. sudo has to be configured to allow the
|
||||
JupyterHub without root. `sudo` has to be configured to allow the
|
||||
hub's user to run processes under other user IDs.
|
||||
|
||||
- **SystemdSpawner** uses Systemd to start other processes. It can
|
||||
isolate users from each other and provide some limits.
|
||||
isolate users from each other and provide resource limiting.
|
||||
|
||||
- **DockerSpawner** runs stuff in Docker, a containerization system.
|
||||
This lets you fully isolate users, limit CPU, memory, and provide
|
||||
other operating system images to fully customize the environment.
|
||||
other container images to fully customize the environment.
|
||||
|
||||
- **KubeSpawner** runs on Kubernetes, a cloud orchestration
|
||||
system. The spawner can easily limit users and provide cloud
|
||||
scaling - but the spawner doesn't actually do that, Kubernetes does.
|
||||
scaling - but the spawner doesn't actually do that, Kubernetes
|
||||
does. The spawner just tells Kubernetes what to do. If you want to
|
||||
get KubeSpawner to do something, first you would figure out how to
|
||||
do it in Kubernetes, then figure out how to tell KubeSpawner to tell
|
||||
Kubernetes that. Actually... this is true for most spawners.
|
||||
|
||||
- **BatchSpawner** runs on computer clusters with batch queuing
|
||||
systems. The user processes are run as batch jobs, having access to
|
||||
all the data and software that the users normally will.
|
||||
- **BatchSpawner** runs on computer clusters with batch job scheduling
|
||||
systems (e.g. Slurm, HTCondor, PBS, etc.). The user processes are run
|
||||
as batch jobs, having access to all the data and software that the
|
||||
users normally will.
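
To make the common shape of these spawners concrete, here is a minimal sketch of the start/poll/stop life cycle using a local child process. `ToyLocalSpawner` and the `TOY_USER` variable are made up for illustration; the class does not subclass JupyterHub's real `Spawner`, does not switch user IDs, and starts a sleeping child as a stand-in for the single-user server.

```python
import os
import subprocess
import sys


class ToyLocalSpawner:
    """Hypothetical sketch; NOT JupyterHub's Spawner base class."""

    def __init__(self, username):
        self.username = username
        self.proc = None

    def start(self):
        # The spawner decides the environment the child process will see.
        env = dict(os.environ, TOY_USER=self.username)  # TOY_USER is a made-up variable
        # Stand-in for the single-user server: a child that just sleeps.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"],
            env=env,
        )

    def poll(self):
        """Return None while the process runs, otherwise an exit status."""
        if self.proc is None:
            return 0  # never started: report "not running"
        return self.proc.poll()

    def stop(self):
        # Ask the child to exit and wait until it is really gone.
        if self.proc is not None and self.proc.poll() is None:
            self.proc.terminate()
            self.proc.wait(timeout=10)
```

Fancier spawners keep this life cycle but delegate the actual work to systemd, Docker, Kubernetes, or a batch scheduler.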
|
||||
|
||||
In short, spawners are the interface to the rest of the operating
|
||||
system, and to configure them right you need to know a bit about how
|
||||
@@ -166,24 +185,25 @@ error is usually with the spawner or the notebook server (as described
|
||||
in the next section). Each spawner outputs some logs to the main
|
||||
JupyterHub logs, but may also have logs in other places depending on
|
||||
what services it interacts with (for example, the Docker spawner
|
||||
somehow puts logs in the Docker system services).
|
||||
somehow puts logs in the Docker system services, Kubernetes through
|
||||
the `kubectl` command-line tool).
|
||||
|
||||
|
||||
### Proxy
|
||||
|
||||
Previously, we said that the hub is between the user and the user's
|
||||
notebook servers. It actually isn't directly between, because the
|
||||
JupyterHub **proxy** relays connections between the users and their
|
||||
single-user notebook servers. What this basically means is that the
|
||||
hub itself can shut down, and the proxy can continue to allow users
|
||||
to communicate with their notebook servers. (This just further
|
||||
emphasizes that the hub is responsible for starting, not running, the
|
||||
notebooks). By default, the hub starts the proxy automatically (so
|
||||
that you don't realize there is a separate proxy) and stops the proxy
|
||||
when the hub stops (so that connections get interrupted). But when
|
||||
you [configure the proxy to run
|
||||
separately](https://jupyterhub.readthedocs.io/en/stable/reference/separate-proxy.html),
|
||||
your users' connections will stay working even without the hub.
|
||||
Previously, we said that the hub is between the user's web browser and
|
||||
the user's notebook servers. It actually isn't directly between,
|
||||
because the JupyterHub **proxy** relays connections between the users
|
||||
and their single-user notebook servers. What this basically means is
|
||||
that the hub itself can shut down, and the proxy can continue to
|
||||
allow users to communicate with their notebook servers. (This just
|
||||
further emphasizes that the hub is responsible for starting, not
|
||||
running, the notebooks). By default, the hub starts the proxy
|
||||
automatically (so that you don't realize there is a separate proxy)
|
||||
and stops the proxy when the hub stops (so that connections get
|
||||
interrupted). But when you [configure the proxy to run
|
||||
separately](../reference/separate-proxy.html),
|
||||
users' connections will stay working even without the hub.
|
||||
|
||||
The default proxy is **ConfigurableHttpProxy** which is simple but
|
||||
effective. A more advanced option is the **Traefik Proxy**, which
|
||||
@@ -192,11 +212,11 @@ gives you redundancy and high-availability.
|
||||
When users "connect to JupyterHub", they *always* first connect to the
|
||||
proxy and the proxy relays the connection to the hub. Thus, the proxy
|
||||
is responsible for SSL and accepting connections from the rest of the
|
||||
internet.
|
||||
|
||||
The hub has to connect to the proxy to adjust the routes (The web path
|
||||
`/user/someone` goes to the server of someone at a certain address).
|
||||
The proxy has to be able to connect to both the hub and all the
|
||||
internet. The user uses the hub to authenticate and start the server,
|
||||
and then the hub connects back to the proxy to adjust the proxy routes
|
||||
for the user's server (e.g. the web path `/user/someone` is routed to
|
||||
the server of someone at a certain internal address). The proxy has
|
||||
to be able to internally connect to both the hub and all the
|
||||
single-user servers.
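
The routing idea can be sketched as a small prefix table. This is only an illustration of the concept, not how ConfigurableHttpProxy is implemented; the function `resolve` and both addresses are made-up examples.

```python
def resolve(routes, path):
    """Return the target address for the longest route prefix matching path."""
    candidates = [prefix for prefix in routes if path.startswith(prefix)]
    if not candidates:
        raise LookupError(f"no route for {path!r}")
    return routes[max(candidates, key=len)]


# Default route: anything not claimed by a user server goes to the hub.
routes = {"/": "http://127.0.0.1:8081"}

# After spawning, the hub adds a route for that user's server.
routes["/user/someone/"] = "http://10.0.0.5:8888"
```

With this table, a request for `/user/someone/lab` resolves to the user's server, while `/hub/login` and paths of users without a running server fall through to the hub.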
|
||||
|
||||
The proxy always runs as a separate process to JupyterHub (even though
|
||||
@@ -210,26 +230,43 @@ notebook servers, or making the first connection to the hub, it is
|
||||
usually caused by the proxy. The ConfigurableHttpProxy's logs are
|
||||
mixed with JupyterHub's logs if it's started through the hub (the
|
||||
default case), otherwise from whatever system runs the proxy (if you
|
||||
do it, you'll know).
|
||||
do configure it, you'll know).
|
||||
|
||||
### Services
|
||||
|
||||
JupyterHub has the concept of **services**, which are other web
|
||||
services started by the hub, but otherwise are not really related to
|
||||
the hub itself. They are often used to do things related to Jupyter
|
||||
JupyterHub has the concept of **services**
|
||||
([basics](services-basics.html),
|
||||
[reference](../reference/services.html)), which are other web services
|
||||
started by the hub, but otherwise are not necessarily related to the
|
||||
hub itself. They are often used to do things related to Jupyter
|
||||
(things that the user interacts with, usually not the hub), but could
|
||||
always be run some other way. Running from the hub provides an easy
|
||||
way to get Hub API tokens and authenticate users against the hub.
|
||||
way to get Hub API tokens and authenticate users against the hub. It
|
||||
can also automatically add a proxy route to forward web requests to
|
||||
that service.
|
||||
|
||||
The configuration option `c.JupyterHub.services` (??) is used to start
|
||||
services from the hub.
|
||||
A common example of a service is the [cull idle
|
||||
servers](https://jupyterhub.readthedocs.io/en/stable/getting-started/services-basics.html#real-world-example-to-cull-idle-servers)
|
||||
script. When started by the hub, it automatically gets admin API
|
||||
tokens. It uses the API to list all running servers, compare against
|
||||
activity timeouts, and shut down servers exceeding the limits. Even
|
||||
though this is an intrinsic part of JupyterHub, it is only loosely
|
||||
coupled and running as a service provides convenience of
|
||||
authentication - it could be just as well run some other way, with a
|
||||
manually provided API token.
|
||||
|
||||
Let's use the often-requested question of *sharing files using
|
||||
Take another often-requested question, that of *sharing files using
|
||||
hubshare* as an example. Hubshare would work as an external service
|
||||
which user notebooks talk to and use Hub authentication, but otherwise
|
||||
it isn't directly a matter of the hub. You could equally well share
|
||||
files by other extensions to the single-user notebook servers or
|
||||
configuring the spawners to access shared storage spaces.
|
||||
configuring the spawners to access shared storage spaces. In order to
|
||||
use something such as hubshare, the difficulty is not modifying
|
||||
JupyterHub: it is modifying the notebook servers to speak to some
|
||||
service, and making that service.
|
||||
|
||||
The configuration option `c.JupyterHub.services` is used to start
|
||||
services from the hub.
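
A minimal sketch of such a configuration entry is shown below. The service name and script path are hypothetical placeholders; only the `name` and `command` keys are used here, and the fallback object exists purely so the fragment also runs outside of JupyterHub.

```python
import sys
from types import SimpleNamespace

try:
    c = get_config()  # provided when JupyterHub loads jupyterhub_config.py
except NameError:
    # Stand-in so this sketch also runs as a plain script.
    c = SimpleNamespace(JupyterHub=SimpleNamespace())

c.JupyterHub.services = [
    {
        "name": "my-service",  # hypothetical service name
        # Hypothetical script; the hub starts this command for you.
        "command": [sys.executable, "/srv/jupyterhub/my_service.py"],
    },
]
```

In a real `jupyterhub_config.py` only the final assignment is needed, since `c` is already defined there.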
|
||||
|
||||
When a service is started from JupyterHub automatically, its logs are
|
||||
included in the JupyterHub logs.
|
||||
@@ -243,15 +280,14 @@ running `jupyter notebook` or `jupyter lab` from the command line -
|
||||
the actual Jupyter user interface for a single person.
|
||||
|
||||
The role of the spawner is to start this server - basically, running
|
||||
the command `jupyter notebook`.
|
||||
Actually it doesn't run that, it runs `jupyterhub-singleuser` which
|
||||
first communicates with the hub to say "I'm alive" before running a
|
||||
completely normal Jupyter server. The single-user server can be
|
||||
JupyterLab or classic notebooks. By this point, the hub is almost
|
||||
completely out of the picture (the web traffic is going through the proxy
|
||||
unchanged). By this time, the spawner has already decided the
|
||||
environment which this single-user server will have and the
|
||||
single-user server has to deal with that.
|
||||
the command `jupyter notebook`. Actually it doesn't run that, it runs
|
||||
`jupyterhub-singleuser` which first communicates with the hub to say
|
||||
"I'm alive" before running a completely normal Jupyter server. The
|
||||
single-user server can be JupyterLab or classic notebooks. By this
|
||||
point, the hub is almost completely out of the picture (the web
|
||||
traffic is going through the proxy unchanged). Also by this time, the
|
||||
spawner has already decided the environment which this single-user
|
||||
server will have and the single-user server has to deal with that.
|
||||
|
||||
The spawner starts the server using `jupyterhub-singleuser` with some
|
||||
environment variables like `JUPYTERHUB_API_TOKEN` and
|
||||
@@ -264,16 +300,23 @@ them, they run through the same backend server process and the web
|
||||
frontend is an option when it is starting. The spawner can choose the
|
||||
command line when it starts the single-user server. Extensions are a
|
||||
property of the single-user server (in two parts: there can be a part
|
||||
that runs in server process, and parts that run in JavaScript in lab
|
||||
or notebook).
|
||||
that runs in the Python server process, and parts that run in
|
||||
JavaScript in lab or notebook).
|
||||
|
||||
If one wants to install software for users, it is not a matter of
|
||||
"installing it for JupyterHub" - it's a matter of installing it for the
|
||||
single-user server, which might be the same environment as the hub,
|
||||
but not necessarily. (Actually, see below - it's a matter of the
|
||||
kernels!)
|
||||
|
||||
After the single-user notebook server is started, any errors are only
|
||||
an issue of the single-user notebook server. Sometimes, it seems like
|
||||
the spawner is failing, but really the spawner is working but the
|
||||
single-user notebook server dies right away (in this case, you need to
|
||||
find the problem with the single-user server and adjust the spawner to
|
||||
start it correctly). This can happen, for example, if the spawner
|
||||
doesn't set an environment variable or doesn't provide storage.
|
||||
start it correctly or fix the environment). This can happen, for
|
||||
example, if the spawner doesn't set an environment variable or doesn't
|
||||
provide storage.
|
||||
|
||||
The single-user server's logs are handled by the spawner, so if you
|
||||
notice problems at this phase you need to check your spawner for
|
||||
@@ -289,21 +332,26 @@ configuration option of the spawner.
|
||||
### Notebook
|
||||
|
||||
**(Jupyter) Notebook** is the classic interface, where each notebook
|
||||
opens in a separate tab.
|
||||
opens in a separate tab. It is traditionally started by `jupyter
|
||||
notebook`.
|
||||
|
||||
Does anything need to be said here?
|
||||
|
||||
### Lab
|
||||
|
||||
**JupyterLab** is the new interface, where multiple notebooks are
|
||||
openable in the same tab in an IDE-like environment. JupyterLab is
|
||||
run through the same server file, but at a path `/lab` instead of
|
||||
`/tree`.
|
||||
openable in the same tab in an IDE-like environment. It is
|
||||
traditionally started with `jupyter lab`. Both Notebook and Lab use
|
||||
the same `.ipynb` file format.
|
||||
|
||||
Both Notebook and Lab use the same `.ipynb` file format.
|
||||
JupyterLab is run through the same server file, but at a path `/lab`
|
||||
instead of `/tree`. Thus, they can be active at the same time in the
|
||||
backend and you can switch between them at runtime by changing your
|
||||
URL path.
|
||||
|
||||
Does anything need to be said here?
|
||||
- how extensions work in lab compared to notebook
|
||||
Extensions need to be re-written for JupyterLab (if moving from
|
||||
classic notebooks). But, the server-side of the extensions can be
|
||||
shared by both.
|
||||
|
||||
|
||||
|
||||
@@ -313,30 +361,40 @@ Normally, our tour of the Jupyter ecosystem would stop here. But,
|
||||
since if you've read this far you probably need to know every last
|
||||
bit, let's go further and talk about the kernels. The commands you
|
||||
run in the notebook session are not executed in the same process as
|
||||
the notebook itself, but in a separate **kernel**. There are [many
|
||||
the notebook itself, but in a separate **Jupyter kernel**. There are [many
|
||||
kernels
|
||||
available](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels).
|
||||
|
||||
As a basic approximation, a **Jupyter kernel** is a process which
|
||||
accepts commands (cells that are run) and returns the output to
|
||||
Jupyter to display. One example is the **IPython Jupyter kernel**,
|
||||
which runs Python and adds the IPython magic functions (`%`, `%%`,
|
||||
`!`, etc. commands). There is nothing special about it, it can be
|
||||
considered a *normal Python process*. Like we said above, the kernel
|
||||
process can be approximated as a process that takes commands on stdin
|
||||
and returns stuff on stdout. Actually, a kernel is more fancy,
|
||||
because it can communicate over the network and add in magic commands.
|
||||
which runs Python. There is nothing special about it, it can be
|
||||
considered a *normal Python process*. The kernel process can be
|
||||
approximated in UNIX terms as a process that takes commands on stdin
|
||||
and returns stuff on stdout(&). Obviously, it is more than that, because it has
|
||||
to be able to disentangle all the possible outputs, such as figures,
|
||||
and present it to the user in a web browser.
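
The stdin/stdout approximation can be sketched in a few lines. This is not how a real kernel is implemented (real kernels exchange structured messages and handle rich output such as figures); `run_cell` is a hypothetical helper that only captures printed text.

```python
import contextlib
import io


def run_cell(source, namespace):
    """Execute one cell of Python source and return what it printed."""
    buffer = io.StringIO()
    # Capture anything the cell writes to standard output.
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


# State persists between cells because they share one namespace,
# just as variables persist between cells in a notebook.
session = {}
run_cell("x = 20 + 1", session)
output = run_cell("print(x * 2)", session)
```

Here `output` holds the text the second cell printed, which is what would be sent back for display.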
|
||||
|
||||
Kernel communication is via the ZeroMQ protocol on the local
|
||||
computer. Kernels are separate processes from the main single-user
|
||||
notebook server (and thus obviously, different from the JupyterHub
|
||||
process and everything else). By default (and unless you do something
|
||||
special), kernels share the same environment as the notebook server
|
||||
(data, resource limits, permissions, user id, etc.). But there are
|
||||
things like the Jupyter Kernel Gateway / Enterprise Gateway, which
|
||||
(data, resource limits, permissions, user id, etc.). But they *can*
|
||||
run in a separate Python environment from the single-user server
|
||||
(search `--prefix` in the [ipykernel installation
|
||||
instructions](https://ipython.readthedocs.io/en/stable/install/kernel_install.html)).
|
||||
There are also more fancy techniques such as the [Jupyter Kernel
|
||||
Gateway](https://jupyter-kernel-gateway.readthedocs.io/) and [Enterprise
|
||||
Gateway](https://jupyter-enterprise-gateway.readthedocs.io/), which
|
||||
allow you to run the kernels on a different machine and possibly with
|
||||
a different environment.
|
||||
|
||||
A kernel doesn't just execute its language - magics such as `%`,
|
||||
`%%`, and `!` are a property of the kernel - in particular, these are
|
||||
IPython kernel commands and don't necessarily work in any other
|
||||
kernel unless they specifically support them.
|
||||
|
||||
What does this mean? There is yet *another* layer of configurability.
|
||||
Each kernel can run a different programming language, with different
|
||||
software, and so on. By default, they would run in the same
|
||||
@@ -345,8 +403,8 @@ other way they are configured is by
|
||||
running in different Python virtual environments or conda
|
||||
environments. They can be started and killed independently (there is
|
||||
normally one per notebook you have open). The kernels are what use
|
||||
most of your memory and CPU if you have large amounts of data open or
|
||||
are using a lot of compute power.
|
||||
most of your memory and CPU when running Jupyter - the rest of the web
|
||||
interface has a small footprint.
|
||||
|
||||
You can list your installed kernels with `jupyter kernelspec list`.
|
||||
If you look at one of the `kernel.json` files in those directories, you
|
||||
@@ -355,43 +413,47 @@ automatically made by the kernels, but can be edited as needed. [The
|
||||
spec](https://jupyter-client.readthedocs.io/en/stable/kernels.html)
|
||||
tells you even more.
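
For orientation, the sketch below parses a `kernel.json` of the kind the IPython kernel typically installs. The exact contents on your system may differ (for example, `argv[0]` is often an absolute path to a specific Python), and `describe_kernelspec` is a hypothetical helper.

```python
import json

# Typical shape of an IPython kernel spec; treat the values as an example.
KERNEL_JSON = """
{
  "argv": ["python3", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3",
  "language": "python"
}
"""


def describe_kernelspec(text):
    """Return (display name, language, executable) from a kernel.json string."""
    spec = json.loads(text)
    return spec["display_name"], spec["language"], spec["argv"][0]


summary = describe_kernelspec(KERNEL_JSON)
```

The `argv` entry is the command used to start the kernel, so pointing it at the Python of another environment is one way to make a kernel run with different software.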
|
||||
|
||||
The kernel has to be reachable by the single-user notebook server.
|
||||
The kernel normally has to be reachable by the single-user notebook server
|
||||
but the gateways mentioned above can get around that limitation.
|
||||
|
||||
If you get problems with "Kernel died" or some other error in a single
|
||||
notebook but the single-user notebook server stays working, it is
|
||||
usually a problem with the kernel. It could be that you are trying to
|
||||
use more resources than you are allowed and the symptom is the kernel
|
||||
getting killed. It could be that it crashes for some other reason.
|
||||
In these cases, you need to find the kernel logs and investigate.
|
||||
|
||||
The debug logs for the kernel are normally mixed in with the
|
||||
single-user notebook server logs.
|
||||
|
||||
|
||||
|
||||
### JupyterHub distributions
|
||||
## JupyterHub distributions
|
||||
|
||||
There are several "distributions" which automatically install all of
|
||||
the things above and configure them for a certain purpose. They are
|
||||
good ways to get started, but if you are doing very custom things
|
||||
eventually it may become hard to adapt them to your needs.
|
||||
good ways to get started, but if you have custom needs, eventually it
|
||||
may become hard to adapt them to your requirements.
|
||||
|
||||
* **Zero to JupyterHub with Kubernetes** installs an entire scalable
|
||||
system using Kubernetes. Uses KubeSpawner, ....Authenticator, ....
|
||||
* [**Zero to JupyterHub with
|
||||
Kubernetes**](https://zero-to-jupyterhub.readthedocs.io/) installs
|
||||
an entire scalable system using Kubernetes. Uses KubeSpawner,
|
||||
....Authenticator, ....
|
||||
|
||||
* **The Littlest JupyterHub** installs JupyterHub on a single system
|
||||
* [**The Littlest JupyterHub**](https://tljh.jupyter.org/) installs JupyterHub on a single system
|
||||
using SystemdSpawner and NativeAuthenticator (which manages users
|
||||
itself).
|
||||
|
||||
* **JupyterHub the hard way** takes you through everything yourself.
|
||||
It is a natural companion to this guide, since you get to experience
|
||||
every little bit.
|
||||
* [**JupyterHub the hard
|
||||
way**](https://jupyterhub.readthedocs.io/en/stable/installation-guide-hard.html)
|
||||
takes you through everything yourself. It is a natural companion to
|
||||
this guide, since you get to experience every little bit.
|
||||
|
||||
|
||||
|
||||
## I want to...
|
||||
|
||||
**Share files between users**. Spawner to share data, or
|
||||
JupyterNotebook/Lab user interface + some service for distributing
|
||||
files.
|
||||
TODO: answers to common cross-layer questions.
|
||||
|
||||
|
||||
## What's next?
|
||||
@@ -399,5 +461,5 @@ files.
|
||||
Now you know everything. Well, you know how everything relates, but
|
||||
there are still plenty of details, implementations, and exceptions.
|
||||
When setting up JupyterHub, the first step is to consider the above
|
||||
layers and see what options are suitable for you. Then, put
|
||||
layers, decide the right option for each of them, then begin putting
|
||||
everything together.
|
||||
|