mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-17 15:03:02 +00:00
Restructured references section of the docs
This commit is contained in:
1939
docs/source/reference/technical-reference/changelog/changelog.md
Normal file
File diff suppressed because one or more lines are too long
@@ -0,0 +1,29 @@
|
||||
# Configuration Reference
|
||||
|
||||
:::{important}
|
||||
Make sure the version of JupyterHub for this documentation matches your
|
||||
installation version, as the output of this command may change between versions.
|
||||
:::
|
||||
|
||||
## JupyterHub configuration
|
||||
|
||||
As explained in the [Configuration Basics](generate-config-file)
|
||||
section, the `jupyterhub_config.py` can be automatically generated via
|
||||
|
||||
> ```bash
|
||||
> jupyterhub --generate-config
|
||||
> ```
|
||||
|
||||
The following contains the output of that command for reference.
|
||||
|
||||
```{eval-rst}
|
||||
.. jupyterhub-generate-config::
|
||||
```
|
||||
|
||||
## JupyterHub help command output
|
||||
|
||||
This section contains the output of the command `jupyterhub --help-all`.
|
||||
|
||||
```{eval-rst}
|
||||
.. jupyterhub-help-all::
|
||||
```
|
@@ -0,0 +1,413 @@
|
||||
# Services
|
||||
|
||||
## Definition of a Service
|
||||
|
||||
When working with JupyterHub, a **Service** is defined as a process that interacts
|
||||
with the Hub's REST API. A Service may perform a specific
|
||||
action or task. For example, the following tasks can each be a unique Service:
|
||||
|
||||
- shutting down individuals' single user notebook servers that have been idle
|
||||
for some time
|
||||
- registering additional web servers which should use the Hub's authentication
|
||||
and be served behind the Hub's proxy.
|
||||
|
||||
Two key features help define a Service:
|
||||
|
||||
- Is the Service **managed** by JupyterHub?
|
||||
- Does the Service have a web server that should be added to the proxy's
|
||||
table?
|
||||
|
||||
Currently, these characteristics distinguish two types of Services:
|
||||
|
||||
- A **Hub-Managed Service** which is managed by JupyterHub
|
||||
- An **Externally-Managed Service** which runs its own web server and
|
||||
communicates operation instructions via the Hub's API.
|
||||
|
||||
## Properties of a Service
|
||||
|
||||
A Service may have the following properties:
|
||||
|
||||
- `name: str` - the name of the service
|
||||
- `admin: bool (default - False)` - whether the service should have
|
||||
administrative privileges
|
||||
- `url: str (default - None)` - The URL where the service is/should be. If a
|
||||
url is specified for where the Service runs its own web server,
|
||||
the service will be added to the proxy at `/services/:name`
|
||||
- `api_token: str (default - None)` - For Externally-Managed Services you need to specify
|
||||
an API token to perform API requests to the Hub
|
||||
- `display: bool (default - True)` - When set to true, display a link to the
|
||||
service's URL under the 'Services' dropdown in user's hub home page.
|
||||
|
||||
- `oauth_no_confirm: bool (default - False)` - When set to true,
|
||||
skip the OAuth confirmation page when users access this service.
|
||||
|
||||
By default, when users authenticate with a service using JupyterHub,
|
||||
they are prompted to confirm that they want to grant that service
|
||||
access to their credentials.
|
||||
Skipping the confirmation page is useful for admin-managed services that are considered part of the Hub
|
||||
and shouldn't need extra prompts for login.
|
||||
|
||||
If a service is also to be managed by the Hub, it has a few extra options:
|
||||
|
||||
- `command: (str/Popen list)` - Command for JupyterHub to spawn the service.
  - Only use this if the service should be a subprocess.
  - If command is not specified, the Service is assumed to be managed
    externally.
  - If a command is specified for launching the Service, the Service will
    be started and managed by the Hub.
|
||||
- `environment: dict` - additional environment variables for the Service.
|
||||
- `user: str` - the name of a system user to manage the Service. If
|
||||
unspecified, run as the same user as the Hub.
|
||||
|
||||
## Hub-Managed Services
|
||||
|
||||
A **Hub-Managed Service** is started by the Hub, and the Hub is responsible
|
||||
for the Service's actions. A Hub-Managed Service can only be a local
|
||||
subprocess of the Hub. The Hub will take care of starting the process and
restarting the service if the service stops.
|
||||
|
||||
While Hub-Managed Services share some similarities with notebook Spawners,
|
||||
there are no plans for Hub-Managed Services to support the same spawning
|
||||
abstractions as a notebook Spawner.
|
||||
|
||||
If you wish to run a Service in a Docker container or other deployment
|
||||
environments, the Service can be registered as an
|
||||
**Externally-Managed Service**, as described below.
|
||||
|
||||
## Launching a Hub-Managed Service
|
||||
|
||||
A Hub-Managed Service is characterized by its specified `command` for launching
|
||||
the Service. For example, a 'cull idle' notebook server task configured as a
|
||||
Hub-Managed Service would include:
|
||||
|
||||
- the Service name,
|
||||
- admin permissions, and
|
||||
- the `command` to launch the Service which will cull idle servers after a
|
||||
timeout interval
|
||||
|
||||
This example would be configured as follows in `jupyterhub_config.py`:
|
||||
|
||||
```python
import sys

c.JupyterHub.load_roles = [
    {
        "name": "idle-culler",
        "scopes": [
            "read:users:activity",  # read user last_activity
            "servers",  # start and stop servers
            # 'admin:users'  # needed if culling idle users as well
        ],
    }
]

c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        # run the culler with the same Python interpreter as the Hub
        'command': [sys.executable, '-m', 'jupyterhub_idle_culler', '--timeout=3600'],
    }
]
```
|
||||
|
||||
A Hub-Managed Service may also be configured with additional optional
|
||||
parameters, which describe the environment needed to start the Service process:
|
||||
|
||||
- `environment: dict` - additional environment variables for the Service.
|
||||
- `user: str` - name of the user to run the server if different from the Hub.
|
||||
Requires Hub to be root.
|
||||
- `cwd: path` directory in which to run the Service, if different from the
|
||||
Hub directory.
|
||||
|
||||
The Hub will pass the following environment variables to launch the Service:
|
||||
|
||||
(service-env)=
|
||||
|
||||
```bash
|
||||
JUPYTERHUB_SERVICE_NAME: The name of the service
|
||||
JUPYTERHUB_API_TOKEN: API token assigned to the service
|
||||
JUPYTERHUB_API_URL: URL for the JupyterHub API (default, http://127.0.0.1:8081/hub/api)
|
||||
JUPYTERHUB_BASE_URL: Base URL of the Hub (https://mydomain[:port]/)
|
||||
JUPYTERHUB_SERVICE_PREFIX: URL path prefix of this service (/services/:service-name/)
|
||||
JUPYTERHUB_SERVICE_URL: Local URL where the service is expected to be listening.
|
||||
Only for proxied web services.
|
||||
JUPYTERHUB_OAUTH_SCOPES: JSON-serialized list of scopes to use for allowing access to the service
|
||||
(deprecated in 3.0, use JUPYTERHUB_OAUTH_ACCESS_SCOPES).
|
||||
JUPYTERHUB_OAUTH_ACCESS_SCOPES: JSON-serialized list of scopes to use for allowing access to the service (new in 3.0).
|
||||
JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES: JSON-serialized list of scopes that can be requested by the oauth client on behalf of users (new in 3.0).
|
||||
```
|
||||
|
||||
For the previous 'cull idle' Service example, these environment variables
|
||||
would be passed to the Service when the Hub starts the 'cull idle' Service:
|
||||
|
||||
```bash
|
||||
JUPYTERHUB_SERVICE_NAME: 'idle-culler'
|
||||
JUPYTERHUB_API_TOKEN: API token assigned to the service
|
||||
JUPYTERHUB_API_URL: http://127.0.0.1:8081/hub/api
|
||||
JUPYTERHUB_BASE_URL: https://mydomain[:port]
|
||||
JUPYTERHUB_SERVICE_PREFIX: /services/idle-culler/
|
||||
```
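As a rough illustration of how a service might consume these variables, the sketch below collects them into a dictionary. The function name `load_service_env` and the fallback values are assumptions made for this example, not part of JupyterHub; only the environment variable names come from the list above.

```python
import json
import os


def load_service_env(environ=None):
    """Collect JupyterHub service settings from environment variables."""
    environ = os.environ if environ is None else environ
    # JUPYTERHUB_OAUTH_ACCESS_SCOPES is a JSON-serialized list;
    # fall back to an empty list when the variable is not set.
    scopes = json.loads(environ.get("JUPYTERHUB_OAUTH_ACCESS_SCOPES", "[]"))
    return {
        "name": environ.get("JUPYTERHUB_SERVICE_NAME", ""),
        "api_token": environ.get("JUPYTERHUB_API_TOKEN", ""),
        "api_url": environ.get("JUPYTERHUB_API_URL", ""),
        "prefix": environ.get("JUPYTERHUB_SERVICE_PREFIX", "/"),
        "access_scopes": scopes,
    }


# A plain dict stands in for os.environ to keep the example deterministic.
settings = load_service_env(
    {
        "JUPYTERHUB_SERVICE_NAME": "idle-culler",
        "JUPYTERHUB_SERVICE_PREFIX": "/services/idle-culler/",
    }
)
print(settings["name"], settings["prefix"])
```

Calling `load_service_env()` with no argument reads the real process environment, which is what a Hub-Managed Service would normally do.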
|
||||
|
||||
See the GitHub repo for additional information about the [jupyterhub_idle_culler][].
|
||||
|
||||
## Externally-Managed Services
|
||||
|
||||
You may prefer to use your own service management tools, such as Docker or
|
||||
systemd, to manage a JupyterHub Service. These **Externally-Managed
|
||||
Services**, unlike Hub-Managed Services, are not subprocesses of the Hub. You
|
||||
must tell JupyterHub which API token the Externally-Managed Service is using
|
||||
to perform its API requests. Each Externally-Managed Service will need a
|
||||
unique API token, because the Hub authenticates each API request and the API
|
||||
token is used to identify the originating Service or user.
|
||||
|
||||
A configuration example of an Externally-Managed Service with admin access and
|
||||
running its own web server is:
|
||||
|
||||
```python
c.JupyterHub.services = [
    {
        'name': 'my-web-service',
        'url': 'https://10.0.1.1:1984',
        # any secret >8 characters, you'll use api_token to
        # authenticate api requests to the hub from your service
        'api_token': 'super-secret',
    }
]
```
|
||||
|
||||
In this case, the `url` field will be passed along to the Service as
|
||||
`JUPYTERHUB_SERVICE_URL`.
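Because the Hub and the Externally-Managed Service must agree on the same token, one possible approach is to generate a random token once and hand it to both sides through the environment. The sketch below assumes a hypothetical variable name `MY_WEB_SERVICE_TOKEN` and hypothetical helper names; it illustrates the idea and is not a JupyterHub API.

```python
import secrets


def new_api_token(nbytes=32):
    """Return a random hex token (64 hex characters for 32 bytes)."""
    return secrets.token_hex(nbytes)


def service_config(environ):
    """Build the service entry, reading the shared token from the environment."""
    # MY_WEB_SERVICE_TOKEN is a hypothetical variable name chosen for this sketch.
    return {
        "name": "my-web-service",
        "url": "https://10.0.1.1:1984",
        "api_token": environ["MY_WEB_SERVICE_TOKEN"],
    }


# Generate the token once, store it somewhere safe, and expose it to both
# the Hub and the service. A plain dict stands in for os.environ here.
token = new_api_token()
entry = service_config({"MY_WEB_SERVICE_TOKEN": token})
```

In `jupyterhub_config.py`, the entry could then be registered with `c.JupyterHub.services = [service_config(os.environ)]`, so the secret never appears in the configuration file itself.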
|
||||
|
||||
## Writing your own Services
|
||||
|
||||
When writing your own services, you have a few decisions to make (in addition
|
||||
to what your service does!):
|
||||
|
||||
1. Does my service need a public URL?
|
||||
2. Do I want JupyterHub to start/stop the service?
|
||||
3. Does my service need to authenticate users?
|
||||
|
||||
When a Service is managed by JupyterHub, the Hub will pass the necessary
|
||||
information to the Service via the environment variables described above. A
|
||||
flexible Service, whether managed by the Hub or not, can make use of these
|
||||
same environment variables.
|
||||
|
||||
When you run a service that has a URL, it will be accessible under a
|
||||
`/services/` prefix, such as `https://myhub.horse/services/my-service/`. For
|
||||
your service to route proxied requests properly, it must take
|
||||
`JUPYTERHUB_SERVICE_PREFIX` into account when routing requests. For example, a
|
||||
web service would normally service its root handler at `'/'`, but the proxied
|
||||
service would need to serve `JUPYTERHUB_SERVICE_PREFIX`.
|
||||
|
||||
Note that `JUPYTERHUB_SERVICE_PREFIX` will contain a trailing slash. This must
|
||||
be taken into consideration when creating the service routes. If you include an
|
||||
extra slash you might get unexpected behavior. For example if your service has a
|
||||
`/foo` endpoint, the route would be `JUPYTERHUB_SERVICE_PREFIX + foo`, and
|
||||
`/foo/bar` would be `JUPYTERHUB_SERVICE_PREFIX + foo/bar`.
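The trailing-slash rule above can be sketched with a small helper. The function name `service_route` is an assumption for this example; the point is only that the endpoint's leading slash is dropped before it is appended to the prefix.

```python
def service_route(prefix, endpoint):
    """Join JUPYTERHUB_SERVICE_PREFIX and an endpoint without doubling slashes."""
    # The prefix normally ends with "/"; add one defensively if it does not.
    if not prefix.endswith("/"):
        prefix += "/"
    # Drop any leading slash from the endpoint so "/foo" and "foo" behave alike.
    return prefix + endpoint.lstrip("/")


prefix = "/services/my-service/"
foo_route = service_route(prefix, "/foo")
foo_bar_route = service_route(prefix, "foo/bar")
print(foo_route)
print(foo_bar_route)
```

A web framework's route table would then register `foo_route` and `foo_bar_route` rather than the bare `/foo` and `/foo/bar` paths.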
|
||||
|
||||
## Hub Authentication and Services
|
||||
|
||||
JupyterHub provides some utilities for using the Hub's authentication
|
||||
mechanism to govern access to your service.
|
||||
|
||||
Requests to all JupyterHub services are made with OAuth tokens.
|
||||
These can either be requests with a token in the `Authorization` header,
|
||||
or url parameter `?token=...`,
|
||||
or browser requests which must complete the OAuth authorization code flow,
|
||||
which results in a token that should be persisted for future requests
|
||||
(persistence is up to the service,
|
||||
but an encrypted cookie confined to the service path is appropriate,
|
||||
and provided by default).
|
||||
|
||||
:::{versionchanged} 2.0
|
||||
The shared `jupyterhub-services` cookie is removed.
|
||||
OAuth must be used to authenticate browser requests with services.
|
||||
:::
|
||||
|
||||
JupyterHub includes a reference implementation of Hub authentication that
|
||||
can be used by services. You may go beyond this reference implementation and
|
||||
create custom hub-authenticating clients and services. We describe the process
|
||||
below.
|
||||
|
||||
The reference, or base, implementation is the {class}`.HubAuth` class,
|
||||
which implements the API requests to the Hub that resolve a token to a User model.
|
||||
|
||||
There are two levels of authentication with the Hub:
|
||||
|
||||
- {class}`.HubAuth` - the most basic authentication,
|
||||
for services that should only accept API requests authorized with a token.
|
||||
|
||||
- {class}`.HubOAuth` - For services that should use oauth to authenticate with the Hub.
|
||||
This should be used for any service that serves pages that should be visited with a browser.
|
||||
|
||||
To use HubAuth, you must set the `.api_token` instance variable. This can be
|
||||
done either programmatically when constructing the class, or via the
|
||||
`JUPYTERHUB_API_TOKEN` environment variable. A number of the examples in the
|
||||
root of the jupyterhub git repository set the `JUPYTERHUB_API_TOKEN` variable
|
||||
so consider having a look at those for further reading
|
||||
([cull-idle](https://github.com/jupyterhub/jupyterhub/tree/master/examples/cull-idle),
|
||||
[external-oauth](https://github.com/jupyterhub/jupyterhub/tree/master/examples/external-oauth),
|
||||
[service-notebook](https://github.com/jupyterhub/jupyterhub/tree/master/examples/service-notebook)
|
||||
and [service-whoami](https://github.com/jupyterhub/jupyterhub/tree/master/examples/service-whoami))
|
||||
|
||||
(TODO: Where is this API token set?)
|
||||
|
||||
Most of the logic for authentication implementation is found in the
|
||||
{meth}`.HubAuth.user_for_token` method,
|
||||
which makes a request of the Hub, and returns:
|
||||
|
||||
- None, if no user could be identified, or
|
||||
- a dict of the following form:
|
||||
|
||||
```python
{
    "name": "username",
    "groups": ["list", "of", "groups"],
    "scopes": [
        "access:servers!server=username/",
    ],
}
```
|
||||
|
||||
You are then free to use the returned user information to take appropriate
|
||||
action.
|
||||
|
||||
HubAuth also caches the Hub's response for a number of seconds,
|
||||
configurable by the `cookie_cache_max_age` setting (default: five minutes).
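The caching behavior described above can be pictured with a small time-based cache. This is a generic sketch, not HubAuth's actual implementation: the class name `TTLCache` and its methods are invented for illustration, and the 300-second default simply mirrors the five-minute default mentioned above.

```python
import time


class TTLCache:
    """Keep values for at most `max_age` seconds."""

    def __init__(self, max_age=300, clock=time.monotonic):
        self.max_age = max_age
        self._clock = clock  # injectable so the example can use a fake clock
        self._entries = {}

    def set(self, key, value):
        # Remember when the value was stored alongside the value itself.
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self.max_age:
            # Too old: forget the entry so the caller asks the Hub again.
            del self._entries[key]
            return None
        return value


now = [0.0]  # fake clock so the example is deterministic
cache = TTLCache(max_age=300, clock=lambda: now[0])
cache.set("token-abc", {"name": "inara"})
fresh = cache.get("token-abc")  # within five minutes: cached model
now[0] = 301.0
stale = cache.get("token-abc")  # after five minutes: entry has expired
```

A shorter `max_age` makes revoked permissions take effect sooner, at the cost of more requests to the Hub.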
|
||||
|
||||
If your service would like to make further requests _on behalf of users_,
|
||||
it should use the token issued by this OAuth process.
|
||||
If you are using tornado,
|
||||
you can access the token authenticating the current request with {meth}`.HubAuth.get_token`.
|
||||
|
||||
:::{versionchanged} 2.2
|
||||
|
||||
{meth}`.HubAuth.get_token` adds support for retrieving
|
||||
tokens stored in tornado cookies after the completion of OAuth.
|
||||
Previously, it only retrieved tokens from URL parameters or the Authorization header.
|
||||
Passing `get_token(handler, in_cookie=False)` preserves this behavior.
|
||||
:::
|
||||
|
||||
### Flask Example
|
||||
|
||||
For example, you have a Flask service that returns information about a user.
|
||||
JupyterHub's HubAuth class can be used to authenticate requests to the Flask
|
||||
service. See the `service-whoami-flask` example in the
|
||||
[JupyterHub GitHub repo](https://github.com/jupyterhub/jupyterhub/tree/HEAD/examples/service-whoami-flask)
|
||||
for more details.
|
||||
|
||||
```{literalinclude} ../../../../../../jupyterhub/examples/service-whoami-flask/whoami-flask.py
|
||||
:language: python
|
||||
```
|
||||
|
||||
### Authenticating tornado services with JupyterHub
|
||||
|
||||
Since most Jupyter services are written with tornado,
|
||||
we include a mixin class, {class}`.HubOAuthenticated`,
|
||||
for quickly authenticating your own tornado services with JupyterHub.
|
||||
|
||||
Tornado's {py:func}`~.tornado.web.authenticated` decorator calls a Handler's {py:meth}`~.tornado.web.RequestHandler.get_current_user`
|
||||
method to identify the user. Mixing in {class}`.HubAuthenticated` defines
|
||||
{meth}`~.HubAuthenticated.get_current_user` to use HubAuth. If you want to configure the HubAuth
|
||||
instance beyond the default, you'll want to define an {py:meth}`~.tornado.web.RequestHandler.initialize` method,
|
||||
such as:
|
||||
|
||||
```python
from tornado import web

from jupyterhub.services.auth import HubOAuthenticated


class MyHandler(HubOAuthenticated, web.RequestHandler):

    def initialize(self, hub_auth):
        # use the HubAuth instance passed in from the application settings
        self.hub_auth = hub_auth

    @web.authenticated
    def get(self):
        ...
```
|
||||
|
||||
The HubAuth class will automatically load the desired configuration from the Service
|
||||
[environment variables](service-env).
|
||||
|
||||
:::{versionchanged} 2.0
|
||||
|
||||
Access scopes are used to govern access to services.
|
||||
Prior to 2.0,
|
||||
sets of users and groups could be used to grant access
|
||||
by defining `.hub_groups` or `.hub_users` on the authenticated handler.
|
||||
These are ignored if the 2.0 `.hub_scopes` is defined.
|
||||
:::
|
||||
|
||||
:::{seealso}
|
||||
{meth}`.HubAuth.check_scopes`
|
||||
:::
|
||||
|
||||
### Implementing your own Authentication with JupyterHub
|
||||
|
||||
If you don't want to use the reference implementation
|
||||
(e.g. you find the implementation a poor fit for your Flask app),
|
||||
you can implement authentication via the Hub yourself.
|
||||
JupyterHub is a standard OAuth2 provider,
|
||||
so you can use any OAuth 2 client implementation appropriate for your toolkit.
|
||||
See the [FastAPI example][] for an example of using JupyterHub as an OAuth provider with [FastAPI][],
|
||||
without using any code imported from JupyterHub.
|
||||
|
||||
On completion of OAuth, you will have an access token for JupyterHub,
|
||||
which can be used to identify the user and the permissions (scopes)
|
||||
the user has authorized for your service.
|
||||
|
||||
You will only get to this stage if the user has the required `access:services!service=$service-name` scope.
|
||||
|
||||
To retrieve the user model for the token, make a request to `GET /hub/api/user` with the token in the Authorization header.
|
||||
For example, using flask:
|
||||
|
||||
```{literalinclude} ../../../../../../jupyterhub/examples/service-whoami-flask/whoami-flask.py
|
||||
:language: python
|
||||
```
|
||||
|
||||
We recommend looking at the {class}`.HubOAuth` class implementation for reference,
|
||||
and taking note of the following process:
|
||||
|
||||
1. Retrieve the token from the request.
|
||||
2. Make an API request `GET /hub/api/user`,
|
||||
with the token in the `Authorization` header.
|
||||
|
||||
For example, with [requests][]:
|
||||
|
||||
```python
import requests

# api_token is the token retrieved from the request in step 1
r = requests.get(
    "http://127.0.0.1:8081/hub/api/user",
    headers={
        'Authorization': f'token {api_token}',
    },
)
r.raise_for_status()
user = r.json()
```
|
||||
|
||||
3. On success, the reply will be a JSON model describing the user:
|
||||
|
||||
```python
{
    "name": "inara",
    # groups may be omitted, depending on permissions
    "groups": ["serenity", "guild"],
    # scopes is new in JupyterHub 2.0
    "scopes": [
        "access:services",
        "read:users:name",
        "read:users!user=inara",
        "..."
    ]
}
```
|
||||
|
||||
The `scopes` field can be used to manage access.
|
||||
Note: a user will necessarily have access to a service at the moment they first complete OAuth with it.
Individual permissions may be revoked at any later point without revoking the token,
in which case the `scopes` field in this model should be checked on each access.
|
||||
The default required scopes for access are available from `hub_auth.oauth_scopes` or `$JUPYTERHUB_OAUTH_ACCESS_SCOPES`.
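A per-request check of the `scopes` field might look like the sketch below. It is deliberately naive: it compares scope strings verbatim and ignores JupyterHub's scope hierarchy and filter expansion, which the reference implementation handles. The function names are assumptions made for this example, and how many matching scopes a service requires is left to the caller.

```python
import json


def required_scopes(environ):
    """Read the access scopes from JUPYTERHUB_OAUTH_ACCESS_SCOPES (a JSON list)."""
    return set(json.loads(environ.get("JUPYTERHUB_OAUTH_ACCESS_SCOPES", "[]")))


def matching_scopes(user_model, required):
    """Return the required scopes that appear verbatim in the user model.

    Simplification: no scope expansion or filter matching is performed.
    """
    held = set(user_model.get("scopes", []))
    return held & set(required)


# A plain dict stands in for os.environ; the service name is made up.
env = {
    "JUPYTERHUB_OAUTH_ACCESS_SCOPES": json.dumps(
        ["access:services", "access:services!service=my-service"]
    )
}
model = {"name": "inara", "scopes": ["access:services", "read:users:name"]}

matched = matching_scopes(model, required_scopes(env))
allowed = bool(matched)  # this sketch treats any single match as sufficient
```

Running the check on every request, rather than once at login, is what lets a later revocation of permissions take effect without revoking the token.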
|
||||
|
||||
An example of using an Externally-Managed Service and authentication is
|
||||
in the [nbviewer README][nbviewer example] section on securing the notebook viewer,
|
||||
and an example of its configuration is found [here](https://github.com/jupyter/nbviewer/blob/ed942b10a52b6259099e2dd687930871dc8aac22/nbviewer/providers/base.py#L95).
|
||||
nbviewer can also be run as a Hub-Managed Service, as described in the [nbviewer README][nbviewer example]
section on securing the notebook viewer.
|
||||
|
||||
[requests]: https://docs.python-requests.org/en/master/
|
||||
[services_auth]: ../api/services.auth.html
|
||||
[nbviewer example]: https://github.com/jupyter/nbviewer#securing-the-notebook-viewer
|
||||
[fastapi example]: https://github.com/jupyterhub/jupyterhub/tree/HEAD/examples/service-fastapi
|
||||
[fastapi]: https://fastapi.tiangolo.com
|
||||
[jupyterhub_idle_culler]: https://github.com/jupyterhub/jupyterhub-idle-culler
|
254
docs/source/reference/technical-reference/configuration/urls.md
Normal file
@@ -0,0 +1,254 @@
|
||||
(jupyterhub-url)=
|
||||
|
||||
# JupyterHub URL scheme
|
||||
|
||||
This document describes how JupyterHub routes requests.
|
||||
|
||||
This does not include the [REST API](using-jupyterhub-rest-api) URLs.
|
||||
|
||||
In general, all URLs can be prefixed with `c.JupyterHub.base_url` to
|
||||
run the whole JupyterHub application on a prefix.
|
||||
|
||||
All authenticated handlers redirect to `/hub/login` to log-in users
|
||||
before being redirected back to the originating page.
|
||||
The returned request should preserve all query parameters.
|
||||
|
||||
## `/`
|
||||
|
||||
The top-level request is always a simple redirect to `/hub/`,
|
||||
to be handled by the default JupyterHub handler.
|
||||
|
||||
In general, all requests to `/anything` that do not start with `/hub/`
|
||||
but are routed to the Hub, will be redirected to `/hub/anything` before being handled by the Hub.
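The redirect rule can be sketched as a small function. This is an illustration of the rule as stated, not the Hub's handler code; it assumes `base_url` is `/` and that the request has already been routed to the Hub. The function name is made up for this example.

```python
def hub_redirect_target(path):
    """Return the /hub/-prefixed path to redirect to, or None if no redirect is needed."""
    # Paths already under /hub/ are handled by the Hub directly.
    if path == "/hub" or path.startswith("/hub/"):
        return None
    # Everything else is re-rooted under /hub/.
    return "/hub/" + path.lstrip("/")


root_target = hub_redirect_target("/")
other_target = hub_redirect_target("/anything")
hub_target = hub_redirect_target("/hub/home")
```

With a non-default `c.JupyterHub.base_url`, the same logic would apply to the part of the path after that prefix.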
|
||||
|
||||
## `/hub/`
|
||||
|
||||
This is an authenticated URL.
|
||||
|
||||
This handler redirects users to the default URL of the application,
|
||||
which defaults to the user's default server.
|
||||
That is, the handler redirects to `/hub/spawn` if the user's server is not running,
|
||||
or to the server itself (`/user/:name`) if the server is running.
|
||||
|
||||
This default URL behavior can be customized in two ways:
|
||||
|
||||
First, to redirect users to the JupyterHub home page (`/hub/home`)
|
||||
instead of spawning their server,
|
||||
set `redirect_to_server` to False:
|
||||
|
||||
```python
|
||||
c.JupyterHub.redirect_to_server = False
|
||||
```
|
||||
|
||||
This might be useful if you have a Hub where you expect
|
||||
users to be managing multiple server configurations
|
||||
but automatic spawning is not desirable.
|
||||
|
||||
Second, you can customise the landing page to any page you like,
|
||||
such as a custom service you have deployed e.g. with course information:
|
||||
|
||||
```python
|
||||
c.JupyterHub.default_url = '/services/my-landing-service'
|
||||
```
|
||||
|
||||
## `/hub/home`
|
||||
|
||||

|
||||
|
||||
By default, the Hub home page has just one or two buttons
|
||||
for starting and stopping the user's server.
|
||||
|
||||
If named servers are enabled, there will be some additional
|
||||
tools for management of the named servers.
|
||||
|
||||
_Version added: 1.0_ named server UI is new in 1.0.
|
||||
|
||||
## `/hub/login`
|
||||
|
||||
This is the JupyterHub login page.
|
||||
If you have a form-based username+password login,
|
||||
such as the default [PAMAuthenticator](https://en.wikipedia.org/wiki/Pluggable_authentication_module),
|
||||
this page will render the login form.
|
||||
|
||||

|
||||
|
||||
If login is handled by an external service,
|
||||
e.g. with OAuth, this page will have a button,
|
||||
declaring "Log in with ..." which users can click
|
||||
to log in with the chosen service.
|
||||
|
||||

|
||||
|
||||
If you want to skip this user interaction and initiate login automatically,
without requiring a click on the button, you can set:
|
||||
|
||||
```python
|
||||
c.Authenticator.auto_login = True
|
||||
```
|
||||
|
||||
This can be useful when the user is "already logged in" via some mechanism.
|
||||
However, a handshake via `redirects` is necessary to complete the authentication with JupyterHub.
|
||||
|
||||
## `/hub/logout`
|
||||
|
||||
Visiting `/hub/logout` clears [cookies](https://en.wikipedia.org/wiki/HTTP_cookie) from the current browser.
|
||||
Note that **logging out does not stop a user's server(s)** by default.
|
||||
|
||||
If you would like to shut down user servers on logout,
|
||||
you can enable this behavior with:
|
||||
|
||||
```python
|
||||
c.JupyterHub.shutdown_on_logout = True
|
||||
```
|
||||
|
||||
Be careful with this setting because logging out one browser
|
||||
does not mean the user is no longer actively using their server from another machine.
|
||||
|
||||
## `/user/:username[/:servername]`
|
||||
|
||||
If a user's server is running, this URL is handled by the user's given server,
|
||||
not by the Hub.
|
||||
The username is the first part, and if using named servers,
|
||||
the server name is the second part.
|
||||
|
||||
If the user's server is _not_ running, this will be redirected to `/hub/user/:username/...`
|
||||
|
||||
## `/hub/user/:username[/:servername]`
|
||||
|
||||
This URL indicates a request for a user server that is not running
|
||||
(because `/user/...` would have been handled by the notebook server
|
||||
if the specified server were running).
|
||||
|
||||
Handling this URL depends on two conditions: whether a requested user is found
|
||||
as a match and the state of the requested user's notebook server,
|
||||
for example:
|
||||
|
||||
1. the server is not active
   a. user matches
   b. user doesn't match
2. the server is ready
3. the server is pending, but not ready
|
||||
|
||||
If the server is pending spawn,
|
||||
the browser will be redirected to `/hub/spawn-pending/:username/:servername`
|
||||
to see a progress page while waiting for the server to be ready.
|
||||
|
||||
If the server is not active at all,
|
||||
a page will be served with a link to `/hub/spawn/:username/:servername`.
|
||||
Following that link will launch the requested server.
|
||||
The HTTP status will be 503 in this case because a request has been made for a server that is not running.
|
||||
|
||||
If the server is ready, it is assumed that the proxy has not yet registered the route.
|
||||
Some checks are performed and a delay is added before redirecting back to `/user/:username/:servername/...`.
|
||||
If something is really wrong, this can result in a redirect loop.
|
||||
|
||||
Visiting this page will never result in triggering the spawn of servers
|
||||
without additional user action (i.e. clicking the link on the page).
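The three cases described above can be summarized as a small decision function. This is a simplified model for illustration, not JupyterHub's handler: the state labels, the function names, and the returned tuple shape are all assumptions, and the checks and delay performed in the "ready" case are omitted.

```python
def server_path(username, servername=""):
    """Build the ':username[/:servername]' part of the URL."""
    return f"{username}/{servername}" if servername else username


def hub_user_action(username, servername, state):
    """Map a server state to the response for /hub/user/:username[/:servername]."""
    path = server_path(username, servername)
    if state == "pending":
        # Spawn in progress: show the progress page.
        return ("redirect", f"/hub/spawn-pending/{path}")
    if state == "ready":
        # Server is up: send the browser back to it.
        return ("redirect", f"/user/{path}/")
    if state == "inactive":
        # Not running: serve a 503 page that links to the spawn URL.
        return ("page-503", f"/hub/spawn/{path}")
    raise ValueError(f"unknown state: {state!r}")


inactive = hub_user_action("hortense", "", "inactive")
pending = hub_user_action("hortense", "gpu", "pending")
ready = hub_user_action("hortense", "gpu", "ready")
```

Note that no branch starts a server: the inactive case only returns a link, which matches the rule that spawning always requires an explicit user action.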
|
||||
|
||||

|
||||
|
||||
_Version changed: 1.0_
|
||||
|
||||
Prior to 1.0, this URL itself was responsible for spawning servers.
|
||||
It served the progress page while a spawn was pending and redirected to the server once it was running.
|
||||
This was useful because it made sure that the requested servers were restarted after they stopped.
|
||||
However, it could also be harmful because unused servers would continuously be restarted if, e.g.,
an idle JupyterLab frontend that constantly makes polling requests was left open and pointed at them.
|
||||
|
||||
### Special handling of API requests
|
||||
|
||||
Requests to `/user/:username[/:servername]/api/...` are assumed to be
|
||||
from applications connected to stopped servers.
|
||||
These requests fail with a `503` status code and an informative JSON error message
|
||||
that indicates how to spawn the server.
|
||||
This is meant to help applications such as JupyterLab,
|
||||
that are connected to a server that has stopped.
|
||||
|
||||
_Version changed: 1.0_
|
||||
|
||||
JupyterHub version 0.9 failed these API requests with status `404`,
|
||||
but version 1.0 uses 503.
|
||||
|
||||
## `/user-redirect/...`
|
||||
|
||||
The `/user-redirect/...` URL is for sharing a URL that will redirect a user
|
||||
to a path on their own default server.
|
||||
This is useful when different users have the same file at the same URL on their servers,
|
||||
and you want a single link to give to any user that will open that file on their server.
|
||||
|
||||
e.g. a link to `/user-redirect/notebooks/Index.ipynb`
|
||||
will send user `hortense` to `/user/hortense/notebooks/Index.ipynb`
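The rewrite performed by `/user-redirect/` can be sketched as follows. The function name is invented for this example, and the sketch skips details such as URL-escaping of usernames and a non-default `base_url`.

```python
def resolve_user_redirect(path, username):
    """Rewrite a /user-redirect/... path to the given user's own server."""
    prefix = "/user-redirect/"
    if not path.startswith(prefix):
        raise ValueError(f"not a user-redirect path: {path!r}")
    # Keep everything after the prefix and re-root it under /user/:username/.
    return f"/user/{username}/" + path[len(prefix):]


target = resolve_user_redirect("/user-redirect/notebooks/Index.ipynb", "hortense")
print(target)
```

The same shared link resolves to a different `/user/:username/...` URL for each logged-in user, which is why it is safe to hand out to a whole class.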
|
||||
|
||||
**DO NOT** share links to your own server with other users.
|
||||
This will not work in general,
|
||||
unless you grant those users access to your server.
|
||||
|
||||
**Contributions welcome:** The JupyterLab "shareable link" should share this link
|
||||
when run with JupyterHub, but it does not.
|
||||
See [jupyterlab-hub](https://github.com/jupyterhub/jupyterlab-hub)
|
||||
where this should probably be done and
|
||||
[this issue in JupyterLab](https://github.com/jupyterlab/jupyterlab/issues/5388)
|
||||
that is intended to make it possible.
|
||||
|
||||
## Spawning
|
||||
|
||||
### `/hub/spawn[/:username[/:servername]]`
|
||||
|
||||
Requesting `/hub/spawn` will spawn the default server for the current user.
|
||||
If the `username` and optionally `servername` are specified,
|
||||
then the specified server for the specified user will be spawned.
|
||||
Once spawn has been requested,
|
||||
the browser is redirected to `/hub/spawn-pending/...`.
|
||||
|
||||
If `Spawner.options_form` is used,
|
||||
this will render a form,
|
||||
and a POST request will trigger the actual spawn and redirect.
|
||||
|
||||

|
||||
|
||||
_Version added: 1.0_
|
||||
|
||||
1.0 adds the ability to specify `username` and `servername`.
|
||||
Prior to 1.0, only `/hub/spawn` was recognized for the default server.
|
||||
|
||||
_Version changed: 1.0_
|
||||
|
||||
Prior to 1.0, this page redirected back to `/hub/user/:username`,
|
||||
which was responsible for triggering spawn and rendering progress, etc.
|
||||
|
||||
### `/hub/spawn-pending[/:username[/:servername]]`
|
||||
|
||||

|
||||
|
||||
_Version added: 1.0_ this URL is new in JupyterHub 1.0.
|
||||
|
||||
This page renders the progress view for the given spawn request.
|
||||
Once the server is ready,
|
||||
the browser is redirected to the running server at `/user/:username/:servername/...`.
|
||||
|
||||
If this page is requested at any time after the specified server is ready,
|
||||
the browser will be redirected to the running server.
|
||||
|
||||
Requesting this page will never trigger any side effects.
|
||||
If the server is not running (e.g. because the spawn has failed),
|
||||
the spawn failure message (if applicable) will be displayed,
|
||||
and the page will show a link back to `/hub/spawn/...`.
|
||||
|
||||
## `/hub/token`
|
||||
|
||||

|
||||
|
||||
On this page, users can manage their JupyterHub API tokens.
|
||||
They can revoke access and request new tokens for writing scripts
|
||||
against the [JupyterHub REST API](using-jupyterhub-rest-api).
|
||||
|
||||
## `/hub/admin`
|
||||
|
||||

|
||||
|
||||
Administrators can take various administrative actions from this page:
|
||||
|
||||
- add/remove users
|
||||
- grant admin privileges
|
||||
- start/stop user servers
|
||||
- shut down JupyterHub itself
|
@@ -0,0 +1,195 @@
|
||||
(gallery-of-deployments)=
|
||||
|
||||
# A Gallery of JupyterHub Deployments
|
||||
|
||||
**A JupyterHub Community Resource**
|
||||
|
||||
We've compiled this list of JupyterHub deployments to help the community
|
||||
see the breadth and growth of JupyterHub's use in education, research, and
|
||||
high performance computing.
|
||||
|
||||
Please submit pull requests to update information or to add new institutions or uses.
|
||||
|
||||
## Academic Institutions, Research Labs, and Supercomputer Centers
|
||||
|
||||
### University of California Berkeley
|
||||
|
||||
- [BIDS - Berkeley Institute for Data Science](https://bids.berkeley.edu/)

  - [Teaching with Jupyter notebooks and JupyterHub](https://bids.berkeley.edu/resources/videos/teaching-ipythonjupyter-notebooks-and-jupyterhub)

- [Data 8](http://data8.org/)

  - [GitHub organization](https://github.com/data-8)

- [NERSC](https://www.nersc.gov/)

  - [Press release on Jupyter and Cori](https://www.nersc.gov/news-publications/nersc-news/nersc-center-news/2016/jupyter-notebooks-will-open-up-new-possibilities-on-nerscs-cori-supercomputer/)
  - [Moving and sharing data](https://www.nersc.gov/assets/Uploads/03-MovingAndSharingData-Cholia.pdf)

- [Research IT](https://research-it.berkeley.edu)
  - [JupyterHub server supports campus research computation](https://research-it.berkeley.edu/blog/17/01/24/free-fully-loaded-jupyterhub-server-supports-campus-research-computation)
|
||||
|
||||
### University of California Davis
|
||||
|
||||
- [Spinning up multiple Jupyter Notebooks on AWS for a tutorial](https://github.com/mblmicdiv/course2017/blob/HEAD/exercises/sourmash-setup.md)
|
||||
|
||||
Although not technically a JupyterHub deployment, this tutorial setup
|
||||
may be helpful to others in the Jupyter community.
|
||||
|
||||
Thank you C. Titus Brown for sharing this with the Software Carpentry
|
||||
mailing list.
|
||||
|
||||
```
* I started a big Amazon machine;
* I installed Docker and built a custom image containing my software of
  interest;
* I ran multiple containers, one connected to port 8000, one on 8001,
  etc. and gave each student a different port;
* students could connect in and use the Terminal program in Jupyter to
  execute commands, and could upload/download files via the Jupyter
  console interface;
* in theory I could have used notebooks too, but for this I didn’t have
  need.

I am aware that JupyterHub can probably do all of this including manage
the containers, but I’m still a bit shy of diving into that; this was
fairly straightforward, gave me disposable containers that were isolated
for each individual student, and worked almost flawlessly. Should be
easy to do with RStudio too.
```
|
||||
|
||||
### Cal Poly San Luis Obispo
|
||||
|
||||
- [jupyterhub-deploy-teaching](https://github.com/jupyterhub/jupyterhub-deploy-teaching) based on work by Brian Granger for Cal Poly's Data Science 301 Course
|
||||
|
||||
### Chameleon
|
||||
|
||||
[Chameleon](https://www.chameleoncloud.org) is an NSF-funded configurable experimental environment for large-scale computer science systems research with [bare metal reconfigurability](https://chameleoncloud.readthedocs.io/en/latest/technical/baremetal.html). Chameleon users utilize JupyterHub to document and reproduce their complex CISE and networking experiments.
|
||||
|
||||
- [Shared JupyterHub](https://jupyter.chameleoncloud.org): provides a common "workbench" environment for any Chameleon user.
|
||||
- [Trovi](https://www.chameleoncloud.org/experiment/share): a sharing portal of experiments, tutorials, and examples, which users can launch as dedicated isolated environments on Chameleon's JupyterHub.
|
||||
|
||||
### Clemson University
|
||||
|
||||
- Advanced Computing
  - [Palmetto cluster and JupyterHub](https://citi.sites.clemson.edu/2016/08/18/JupyterHub-for-Palmetto-Cluster.html)
|
||||
|
||||
### University of Colorado Boulder
|
||||
|
||||
- (CU Research Computing) CURC

  - [JupyterHub User Guide](https://curc.readthedocs.io/en/latest/gateways/jupyterhub.html)
    - Slurm job dispatched on Crestone compute cluster
    - log troubleshooting
    - Profiles in IPython Clusters tab
  - [Parallel Processing with JupyterHub tutorial](https://curc.readthedocs.io/en/latest/gateways/parallel-programming-jupyter.html)
|
||||
|
||||
### George Washington University
|
||||
|
||||
- [JupyterHub](https://go.gwu.edu/jupyter) with university single-sign-on. Deployed early 2017.
|
||||
|
||||
### HTCondor
|
||||
|
||||
- [HTCondor Python Bindings Tutorial from HTCondor Week 2017 includes information on their JupyterHub tutorials](https://research.cs.wisc.edu/htcondor/HTCondorWeek2017/presentations/TueBockelman_Python.pdf)
|
||||
|
||||
### University of Illinois
|
||||
|
||||
- https://datascience.business.illinois.edu (currently down; checked 10/26/22)
|
||||
|
||||
### IllustrisTNG Simulation Project
|
||||
|
||||
- [JupyterHub/Lab-based analysis platform, part of the TNG public data release](https://www.tng-project.org/data/)
|
||||
|
||||
### MIT and Lincoln Labs
|
||||
|
||||
- https://supercloud.mit.edu/
|
||||
|
||||
### Michigan State University
|
||||
|
||||
- [Setting up JupyterHub](https://mediaspace.msu.edu/media/Setting+Up+Your+JupyterHub+Password/1_hgv13aag/11980471)
|
||||
|
||||
### University of Minnesota
|
||||
|
||||
- [JupyterHub Inside HPC](https://insidehpc.com/tag/jupyterhub/)
|
||||
|
||||
### University of Missouri
|
||||
|
||||
- https://dsa.missouri.edu/faq/
|
||||
|
||||
### Paderborn University
|
||||
|
||||
- [Data Science (DICE) group](https://dice-research.org)
  - [nbgraderutils](https://github.com/dice-group/nbgraderutils): Use JupyterHub + nbgrader + iJava kernel for online Java exercises. Used in lecture Statistical Natural Language Processing.
|
||||
|
||||
### Penn State University
|
||||
|
||||
- [Press release](https://news.psu.edu/story/523093/2018/05/24/new-open-source-web-apps-available-students-and-faculty): "New open-source web apps available for students and faculty"
|
||||
|
||||
### University of California San Diego
|
||||
|
||||
- San Diego Supercomputer Center - Andrea Zonca

  - [Deploy JupyterHub on a Supercomputer with SSH](https://zonca.github.io/2017/05/jupyterhub-hpc-batchspawner-ssh.html)
  - [Run Jupyterhub on a Supercomputer](https://zonca.github.io/2015/04/jupyterhub-hpc.html)
  - [Deploy JupyterHub on a VM for a Workshop](https://zonca.github.io/2016/04/jupyterhub-sdsc-cloud.html)
  - [Customize your Python environment in Jupyterhub](https://zonca.github.io/2017/02/customize-python-environment-jupyterhub.html)
  - [Jupyterhub deployment on multiple nodes with Docker Swarm](https://zonca.github.io/2016/05/jupyterhub-docker-swarm.html)
  - [Sample deployment of Jupyterhub in HPC on SDSC Comet](https://zonca.github.io/2017/02/sample-deployment-jupyterhub-hpc.html)

- Educational Technology Services - Paul Jamason
  - [datahub.ucsd.edu](https://datahub.ucsd.edu)
|
||||
|
||||
### TACC University of Texas
|
||||
|
||||
### Texas A&M
|
||||
|
||||
- Kristen Thyng - Oceanography
  - [Teaching with JupyterHub and nbgrader](http://kristenthyng.com/blog/2016/09/07/jupyterhub+nbgrader/)
|
||||
|
||||
### Elucidata
|
||||
|
||||
- What's new in Jupyter Notebooks @[Elucidata](https://elucidata.io/):
  - [Using Jupyter Notebooks with Jupyterhub on GCP, managed by GKE](https://medium.com/elucidata/why-you-should-be-using-a-jupyter-notebook-8385a4ccd93d)
|
||||
|
||||
## Service Providers
|
||||
|
||||
### AWS
|
||||
|
||||
- [Run Jupyter Notebook and JupyterHub on Amazon EMR](https://aws.amazon.com/blogs/big-data/running-jupyter-notebook-and-jupyterhub-on-amazon-emr/)
|
||||
|
||||
### Google Cloud Platform
|
||||
|
||||
- [Using Tensorflow and JupyterHub in Classrooms](https://cloud.google.com/solutions/using-tensorflow-jupyterhub-classrooms)
|
||||
- [using-tensorflow-and-jupyterhub blog post](https://opensource.googleblog.com/2016/10/using-tensorflow-and-jupyterhub.html)
|
||||
|
||||
### Everware
|
||||
|
||||
[Everware](https://github.com/everware) Reproducible and reusable science powered by jupyterhub and docker. Like nbviewer, but executable. CERN, Geneva [website](http://everware.xyz/)
|
||||
|
||||
### Microsoft Azure
|
||||
|
||||
- [Azure Data Science Virtual Machine release notes](https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-data-science-linux-dsvm-intro)
|
||||
|
||||
### Rackspace Carina
|
||||
|
||||
- https://getcarina.com/blog/learning-how-to-whale/
|
||||
- https://carolynvanslyck.com/talk/carina/jupyterhub/#/ (but carolynvanslyck is currently down; checked 10/26/22)
|
||||
|
||||
### Hadoop
|
||||
|
||||
- [Deploying JupyterHub on Hadoop](https://jupyterhub-on-hadoop.readthedocs.io)
|
||||
|
||||
## Miscellaneous
|
||||
|
||||
- https://medium.com/@ybarraud/setting-up-jupyterhub-with-sudospawner-and-anaconda-844628c0dbee#.rm3yt87e1
|
||||
- [Mailing list UT deployment](https://groups.google.com/g/jupyter/c/nkPSEeMr8c0)
|
||||
- [JupyterHub setup on Centos](https://gist.github.com/johnrc/604971f7d41ebf12370bf5729bf3e0a4)
|
||||
- [Deploy JupyterHub to Docker Swarm](https://jupyterhub.surge.sh)
|
||||
- https://www.laketide.com/building-your-lab-part-3/
|
||||
- https://estrellita.hatenablog.com/entry/2015/07/31/083202
|
||||
- https://www.walkingrandomly.com/?p=5734
|
||||
- https://wrdrd.com/docs/consulting/education-technology
|
||||
- https://bitbucket.org/jackhale/fenics-jupyter
|
||||
- [LinuxCluster blog](https://linuxcluster.wordpress.com/category/application/jupyterhub/)
|
||||
- [Spark Cluster on OpenStack with Multi-User Jupyter Notebook](https://arnesund.com/2015/09/21/spark-cluster-on-openstack-with-multi-user-jupyter-notebook/)
|
@@ -0,0 +1,20 @@
|
||||
# Monitoring
|
||||
|
||||
This section covers details on monitoring the state of your JupyterHub installation.
|
||||
|
||||
JupyterHub exposes the `/metrics` endpoint that returns text describing its current
|
||||
operational state formatted in a way [Prometheus](https://prometheus.io) understands.
|
||||
|
||||
Prometheus is a separate open source tool that can be configured to repeatedly poll
|
||||
JupyterHub's `/metrics` endpoint to parse and save its current state.
|
||||
|
||||
By doing so, Prometheus can describe JupyterHub's evolving state over time.
|
||||
This evolving state can then be accessed through Prometheus, which exposes its underlying
|
||||
storage to those allowed to access it, and be presented with dashboards by a
|
||||
tool like [Grafana](https://grafana.com).
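
To give a feel for what a scraper receives, the following is a minimal sketch that parses a small sample of Prometheus-style text into a dictionary using only the standard library. The sample text and the metric names in it are invented for illustration and are not JupyterHub's actual output; the parser also ignores optional timestamps. In a real deployment Prometheus itself does the scraping and parsing.

```python
def parse_metrics(text):
    """Parse simple Prometheus text-format samples into {name: value}.

    Handles plain `name value` and `name{labels} value` lines only.
    Comment lines (# HELP / # TYPE) and blank lines are skipped, and
    optional trailing timestamps are not supported.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Hypothetical sample; these metric names are illustrative only
SAMPLE = """\
# HELP example_running_servers Number of running servers
# TYPE example_running_servers gauge
example_running_servers 3
example_requests_total{code="200"} 42
"""

metrics = parse_metrics(SAMPLE)
print(metrics["example_running_servers"])
```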
|
||||
|
||||
```{toctree}
|
||||
:maxdepth: 2
|
||||
|
||||
/reference/metrics
|
||||
```
|
@@ -0,0 +1,315 @@
|
||||
(authenticators-reference)=
|
||||
|
||||
# Authenticators
|
||||
|
||||
The {class}`.Authenticator` is the mechanism for authorizing users to use the
|
||||
Hub and single user notebook servers.
|
||||
|
||||
## The default PAM Authenticator
|
||||
|
||||
JupyterHub ships with the default [PAM][]-based Authenticator, for
|
||||
logging in with local user accounts via a username and password.
|
||||
|
||||
## The OAuthenticator
|
||||
|
||||
Some login mechanisms, such as [OAuth][], don't map onto username and
|
||||
password authentication, and instead use tokens. When using these
|
||||
mechanisms, you can override the login handlers.
|
||||
|
||||
You can see an example implementation of an Authenticator that uses
|
||||
[GitHub OAuth][] at [OAuthenticator][].
|
||||
|
||||
JupyterHub's [OAuthenticator][] currently supports the following
|
||||
popular services:
|
||||
|
||||
- Auth0
|
||||
- Bitbucket
|
||||
- CILogon
|
||||
- GitHub
|
||||
- GitLab
|
||||
- Globus
|
||||
- Google
|
||||
- MediaWiki
|
||||
- Okpy
|
||||
- OpenShift
|
||||
|
||||
A [generic implementation](https://github.com/jupyterhub/oauthenticator/blob/master/oauthenticator/generic.py), which you can use for OAuth authentication with any provider, is also available.
|
||||
|
||||
## The Dummy Authenticator
|
||||
|
||||
When testing, it may be helpful to use the
|
||||
{class}`jupyterhub.auth.DummyAuthenticator`. This allows for any username and
|
||||
password unless a global password has been set. Once set, any username will
|
||||
still be accepted but the correct password will need to be provided.
|
||||
|
||||
## Additional Authenticators
|
||||
|
||||
A partial list of other authenticators is available on the
|
||||
[JupyterHub wiki](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
|
||||
|
||||
## Technical Overview of Authentication
|
||||
|
||||
### How the Base Authenticator works
|
||||
|
||||
The base authenticator uses simple username and password authentication.
|
||||
|
||||
The base Authenticator has one central method:
|
||||
|
||||
#### Authenticator.authenticate method
|
||||
|
||||
    Authenticator.authenticate(handler, data)
|
||||
|
||||
This method is passed the Tornado `RequestHandler` and the `POST data`
|
||||
from JupyterHub's login form. Unless the login form has been customized,
|
||||
`data` will have two keys:
|
||||
|
||||
- `username`
|
||||
- `password`
|
||||
|
||||
The `authenticate` method's job is simple:
|
||||
|
||||
- return the username (non-empty str) of the authenticated user if
|
||||
authentication is successful
|
||||
- return `None` otherwise
|
||||
|
||||
Writing an Authenticator that looks up passwords in a dictionary
|
||||
requires only overriding this one method:
|
||||
|
||||
```python
from traitlets import Dict
from jupyterhub.auth import Authenticator

class DictionaryAuthenticator(Authenticator):

    passwords = Dict(config=True,
        help="""dict of username:password for authentication"""
    )

    async def authenticate(self, handler, data):
        # returns None (authentication failure) when the password does not match
        if self.passwords.get(data['username']) == data['password']:
            return data['username']
```
|
||||
|
||||
#### Normalize usernames
|
||||
|
||||
Since the Authenticator and Spawner both use the same username,
|
||||
sometimes you want to transform the name coming from the authentication service
|
||||
(e.g. turning email addresses into local system usernames) before adding them to the Hub service.
|
||||
Authenticators can define `normalize_username`, which takes a username and returns the normalized form.
The default normalization is to cast names to lowercase.
|
||||
|
||||
For simple mappings, a configurable dict `Authenticator.username_map` is used to turn one name into another:
|
||||
|
||||
```python
|
||||
c.Authenticator.username_map = {
|
||||
'service-name': 'localname'
|
||||
}
|
||||
```
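
The two steps described above (lowercasing, then a one-to-one mapping) can be sketched as plain functions. This is an illustration of the described behavior, not the Hub's source code; `email_to_local` is a hypothetical custom rule of the kind a subclass might implement by overriding `normalize_username`.

```python
def normalize_username(username, username_map=None):
    """Lowercase a name, then apply an optional one-to-one mapping."""
    username_map = username_map or {}
    username = username.lower()
    # Names without an entry in the map are returned unchanged
    return username_map.get(username, username)


def email_to_local(username):
    """Hypothetical custom rule: keep only the part before '@'."""
    return normalize_username(username.split("@", 1)[0])
```

In an actual Authenticator subclass the same logic would live in a `normalize_username(self, username)` method.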
|
||||
|
||||
When using `PAMAuthenticator`, you can set
|
||||
`c.PAMAuthenticator.pam_normalize_username = True`, which will
|
||||
normalize usernames using PAM (basically round-tripping them: username
|
||||
to uid to username), which is useful in case you use some external
|
||||
service that allows multiple usernames mapping to the same user (such
|
||||
as ActiveDirectory, yes, this really happens). When
|
||||
`pam_normalize_username` is on, usernames are _not_ normalized to
|
||||
lowercase.
|
||||
|
||||
#### Validate usernames
|
||||
|
||||
In most cases, there is a very limited set of acceptable usernames.
|
||||
Authenticators can define `validate_username(username)`,
|
||||
which should return True for a valid username and False for an invalid one.
|
||||
The primary effect this has is improving error messages during user creation.
|
||||
|
||||
The default behavior is to use configurable `Authenticator.username_pattern`,
|
||||
which is a regular expression string for validation.
|
||||
|
||||
To only allow usernames that start with 'w':
|
||||
|
||||
```python
|
||||
c.Authenticator.username_pattern = r'w.*'
|
||||
```
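
As a rough sketch of how such a pattern behaves, the function below applies a regular expression to a candidate username. It assumes the pattern is matched from the first character of the name, which is an assumption about the validation semantics rather than something stated here; the function is a standalone stand-in for `validate_username`.

```python
import re


def validate_username(username, pattern=r"w.*"):
    """Return True if `username` matches `pattern` starting at its first character."""
    if not username:
        # Empty names are never valid
        return False
    return re.match(pattern, username) is not None
```

With the pattern above, `willow` is accepted while `alice` is rejected.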
|
||||
|
||||
### How to write a custom authenticator
|
||||
|
||||
You can use custom Authenticator subclasses to enable authentication
|
||||
via other mechanisms. One such example is using [GitHub OAuth][].
|
||||
|
||||
Because the username is passed from the Authenticator to the Spawner,
|
||||
a custom Authenticator and Spawner are often used together.
|
||||
For example, the Authenticator methods, {meth}`.Authenticator.pre_spawn_start`
|
||||
and {meth}`.Authenticator.post_spawn_stop`, are hooks that can be used to do
|
||||
auth-related startup (e.g. opening PAM sessions) and cleanup
|
||||
(e.g. closing PAM sessions).
|
||||
|
||||
See a list of custom Authenticators [on the wiki](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
|
||||
|
||||
If you are interested in writing a custom authenticator, you can read
|
||||
[this tutorial](http://jupyterhub-tutorial.readthedocs.io/en/latest/authenticators.html).
|
||||
|
||||
### Registering custom Authenticators via entry points
|
||||
|
||||
As of JupyterHub 1.0, custom authenticators can register themselves via
|
||||
the `jupyterhub.authenticators` entry point metadata.
|
||||
To do this, in your `setup.py` add:
|
||||
|
||||
```python
setup(
  ...
  entry_points={
    'jupyterhub.authenticators': [
        'myservice = mypackage:MyAuthenticator',
    ],
  },
)
```
|
||||
|
||||
If you have added this metadata to your package,
|
||||
admins can select your authenticator with the configuration:
|
||||
|
||||
```python
|
||||
c.JupyterHub.authenticator_class = 'myservice'
|
||||
```
|
||||
|
||||
instead of the full
|
||||
|
||||
```python
|
||||
c.JupyterHub.authenticator_class = 'mypackage:MyAuthenticator'
|
||||
```
|
||||
|
||||
previously required.
|
||||
Additionally, configurable attributes for your authenticator will
|
||||
appear in jupyterhub help output and auto-generated configuration files
|
||||
via `jupyterhub --generate-config`.
|
||||
|
||||
### Authentication state
|
||||
|
||||
JupyterHub 0.8 adds the ability to persist state related to authentication,
|
||||
such as auth-related tokens.
|
||||
If such state should be persisted, `.authenticate()` should return a dictionary of the form:
|
||||
|
||||
```python
{
  'name': username,
  'auth_state': {
    'key': 'value',
  }
}
```
|
||||
|
||||
where `username` is the username that has been authenticated,
|
||||
and `auth_state` is any JSON-serializable dictionary.
|
||||
|
||||
Because `auth_state` may contain sensitive information,
|
||||
it is encrypted before being stored in the database.
|
||||
To store auth_state, two conditions must be met:
|
||||
|
||||
1. persisting auth state must be enabled explicitly via configuration
   ```python
   c.Authenticator.enable_auth_state = True
   ```
2. encryption must be enabled by the presence of the `JUPYTERHUB_CRYPT_KEY` environment variable,
   which should be a hex-encoded 32-byte key.
   For example:
   ```bash
   export JUPYTERHUB_CRYPT_KEY=$(openssl rand -hex 32)
   ```
|
||||
|
||||
JupyterHub uses [Fernet](https://cryptography.io/en/latest/fernet/) to encrypt auth_state.
|
||||
To facilitate key-rotation, `JUPYTERHUB_CRYPT_KEY` may be a semicolon-separated list of encryption keys.
|
||||
If there are multiple keys present, the **first** key is always used to persist any new auth_state.
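
The key format described above can be illustrated with a small standard-library sketch that splits a semicolon-separated value and checks that each entry is a hex-encoded 32-byte key. The function name `parse_crypt_keys` is hypothetical, and this is not the Hub's own parsing code; it only demonstrates the format and the "first key is primary" rule.

```python
import binascii
import os

KEY_SIZE = 32  # bytes, matching the hex-encoded 32-byte key described above


def parse_crypt_keys(value):
    """Split a semicolon-separated key list and hex-decode each entry.

    Returns a list of 32-byte keys; keys[0] is the one that would be
    used to encrypt new auth_state.
    """
    keys = []
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        key = binascii.unhexlify(part)
        if len(key) != KEY_SIZE:
            raise ValueError(f"expected {KEY_SIZE} bytes, got {len(key)}")
        keys.append(key)
    return keys


# Simulate rotating in a new key ahead of an old one
old_key = os.urandom(KEY_SIZE).hex()
new_key = os.urandom(KEY_SIZE).hex()
keys = parse_crypt_keys(f"{new_key};{old_key}")
```

Keeping the old key later in the list lets previously stored state remain decryptable while new state is written with the new key.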
|
||||
|
||||
#### Using auth_state
|
||||
|
||||
Typically, if `auth_state` is persisted it is desirable to affect the Spawner environment in some way.
|
||||
This may mean defining environment variables, placing a certificate in the user's home directory, etc.
|
||||
The {meth}`Authenticator.pre_spawn_start` method can be used to pass information from authenticator state
|
||||
to Spawner environment:
|
||||
|
||||
```python
class MyAuthenticator(Authenticator):
    async def authenticate(self, handler, data=None):
        username = await identify_user(handler, data)
        upstream_token = await token_for_user(username)
        return {
            'name': username,
            'auth_state': {
                'upstream_token': upstream_token,
            },
        }

    async def pre_spawn_start(self, user, spawner):
        """Pass upstream_token to spawner via environment variable"""
        auth_state = await user.get_auth_state()
        if not auth_state:
            # auth_state not enabled
            return
        spawner.environment['UPSTREAM_TOKEN'] = auth_state['upstream_token']
```
|
||||
|
||||
Note that environment variable names and values are always strings, so passing multiple values means setting multiple environment variables or serializing more complex data into a single variable, e.g. as a JSON string.
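
A minimal sketch of the JSON approach: several values are packed into one string-valued variable and decoded again on the other side. The variable name `UPSTREAM_INFO`, the helper `pack_env`, and the payload are all hypothetical and chosen only for illustration.

```python
import json


def pack_env(name, data):
    """Return a {name: json_string} mapping that can be merged into an environment."""
    return {name: json.dumps(data)}


env = {}
env.update(pack_env("UPSTREAM_INFO", {"token": "abc123", "scopes": ["read", "write"]}))

# Inside the single-user server, the value can be decoded again
decoded = json.loads(env["UPSTREAM_INFO"])
```

In `pre_spawn_start`, the same idea would apply to `spawner.environment` instead of the local `env` dictionary used here.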
|
||||
|
||||
Auth state can also be used to configure the spawner via _config_ without subclassing
|
||||
by setting `c.Spawner.auth_state_hook`. This function will be called with `(spawner, auth_state)`,
|
||||
only when auth_state is defined.
|
||||
|
||||
For example:
|
||||
(for KubeSpawner)
|
||||
|
||||
```python
def auth_state_hook(spawner, auth_state):
    spawner.volumes = auth_state['user_volumes']
    # KubeSpawner's option for mounts is named `volume_mounts`
    spawner.volume_mounts = auth_state['user_mounts']

c.Spawner.auth_state_hook = auth_state_hook
```
|
||||
|
||||
(authenticator-groups)=
|
||||
|
||||
## Authenticator-managed group membership
|
||||
|
||||
:::{versionadded} 2.2
|
||||
:::
|
||||
|
||||
Some identity providers may have their own concept of group membership that you would like to preserve in JupyterHub.
|
||||
This is now possible with `Authenticator.manage_groups`.
|
||||
|
||||
You can set the config:
|
||||
|
||||
```python
|
||||
c.Authenticator.manage_groups = True
|
||||
```
|
||||
|
||||
to enable this behavior.
|
||||
The default is False for Authenticators that ship with JupyterHub,
|
||||
but may be True for custom Authenticators.
|
||||
Check your Authenticator's documentation for manage_groups support.
|
||||
|
||||
If True, {meth}`.Authenticator.authenticate` and {meth}`.Authenticator.refresh_user` may include a field `groups`
|
||||
which is a list of group names the user should be a member of:
|
||||
|
||||
- Membership will be added for any group in the list
|
||||
- Membership in any groups not in the list will be revoked
|
||||
- Any groups not already present in the database will be created
|
||||
- If `None` is returned, no changes are made to the user's group membership
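
The rules in the list above amount to a set reconciliation, which can be sketched as follows. `reconcile_groups` is a hypothetical helper written for illustration; it is not part of JupyterHub and does not touch the database.

```python
def reconcile_groups(current, returned):
    """Return (final, added, removed) group sets for one user.

    `current` is the user's existing membership; `returned` is the
    `groups` value from the authenticator, or None for "no change".
    """
    current = set(current)
    if returned is None:
        # None means membership is left untouched
        return current, set(), set()
    returned = set(returned)
    return returned, returned - current, current - returned
```

Note the difference between `None` (leave membership alone) and an empty list (revoke every membership).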
|
||||
|
||||
If authenticator-managed groups are enabled,
|
||||
all group-management via the API is disabled.
|
||||
|
||||
## pre_spawn_start and post_spawn_stop hooks
|
||||
|
||||
Authenticators use two hooks, {meth}`.Authenticator.pre_spawn_start` and
{meth}`.Authenticator.post_spawn_stop(user, spawner)`, to pass additional state information
between the authenticator and a spawner. These hooks are typically used for auth-related
startup, e.g. opening a PAM session, and auth-related cleanup, e.g. closing a
PAM session.
|
||||
|
||||
## JupyterHub as an OAuth provider
|
||||
|
||||
Beginning with version 0.8, JupyterHub is an OAuth provider.
|
||||
|
||||
[pam]: https://en.wikipedia.org/wiki/Pluggable_authentication_module
|
||||
[oauth]: https://en.wikipedia.org/wiki/OAuth
|
||||
[github oauth]: https://developer.github.com/v3/oauth/
|
||||
[oauthenticator]: https://github.com/jupyterhub/oauthenticator
|
412
docs/source/reference/technical-reference/subsystems/spawners.md
Normal file
@@ -0,0 +1,412 @@
|
||||
(spawners-reference)=
|
||||
|
||||
# Spawners
|
||||
|
||||
A [Spawner][] starts each single-user notebook server.
|
||||
The Spawner represents an abstract interface to a process,
|
||||
and a custom Spawner needs to be able to take three actions:
|
||||
|
||||
- start a process
|
||||
- poll whether a process is still running
|
||||
- stop a process
|
||||
|
||||
## Examples
|
||||
|
||||
Custom Spawners for JupyterHub can be found on the [JupyterHub wiki](https://github.com/jupyterhub/jupyterhub/wiki/Spawners).
|
||||
Some examples include:
|
||||
|
||||
- [DockerSpawner](https://github.com/jupyterhub/dockerspawner) for spawning user servers in Docker containers
  - `dockerspawner.DockerSpawner` for spawning identical Docker containers for
    each user
  - `dockerspawner.SystemUserSpawner` for spawning Docker containers with an
    environment and home directory for each user
  - both `DockerSpawner` and `SystemUserSpawner` also work with Docker Swarm for
    launching containers on remote machines
- [SudoSpawner](https://github.com/jupyterhub/sudospawner) enables JupyterHub to
  run without being root, by spawning an intermediate process via `sudo`
- [BatchSpawner](https://github.com/jupyterhub/batchspawner) for spawning remote
  servers using batch systems
- [YarnSpawner](https://github.com/jupyterhub/yarnspawner) for spawning notebook
  servers in YARN containers on a Hadoop cluster
- [SSHSpawner](https://github.com/NERSC/sshspawner) to spawn notebooks
  on a remote server using SSH
- [KubeSpawner](https://github.com/jupyterhub/kubespawner) to spawn notebook servers on a Kubernetes cluster
|
||||
|
||||
## Spawner control methods
|
||||
|
||||
### Spawner.start
|
||||
|
||||
`Spawner.start` should start a single-user server for a single user.
|
||||
Information about the user can be retrieved from `self.user`,
|
||||
an object encapsulating the user's name, authentication, and server info.
|
||||
|
||||
The return value of `Spawner.start` should be the `(ip, port)` of the running server,
|
||||
or a full URL as a string.
|
||||
|
||||
Most `Spawner.start` functions will look similar to this example:
|
||||
|
||||
```python
async def start(self):
    self.ip = '127.0.0.1'
    self.port = random_port()
    # get environment variables,
    # several of which are required for configuring the single-user server
    env = self.get_env()
    cmd = []
    # get jupyterhub command to run,
    # typically ['jupyterhub-singleuser']
    cmd.extend(self.cmd)
    cmd.extend(self.get_args())

    await self._actually_start_server_somehow(cmd, env)
    # url may not match self.ip:self.port, but it could!
    url = self._get_connectable_url()
    return url
```
|
||||
|
||||
When `Spawner.start` returns, the single-user server process should actually be running,
|
||||
not just requested. JupyterHub can handle `Spawner.start` being very slow
|
||||
(such as PBS-style batch queues, or instantiating whole AWS instances)
|
||||
by increasing the `Spawner.start_timeout` config value.
|
||||
|
||||
#### Note on IPs and ports
|
||||
|
||||
`Spawner.ip` and `Spawner.port` attributes set the _bind_ URL,
|
||||
which the single-user server should listen on
|
||||
(passed to the single-user process via the `JUPYTERHUB_SERVICE_URL` environment variable).
|
||||
The _return_ value is the IP and port (or full URL) the Hub should _connect to_.
|
||||
These are not necessarily the same, and usually won't be in any Spawner that works with remote resources or containers.
|
||||
|
||||
The default for `Spawner.ip` and `Spawner.port` is `127.0.0.1:{random}`,
|
||||
which is appropriate for Spawners that launch local processes,
|
||||
where everything is on localhost and each server needs its own port.
|
||||
For remote or container Spawners, it will often make sense to use a different value,
|
||||
such as `ip = '0.0.0.0'` and a fixed port, e.g. `8888`.
|
||||
The defaults can be changed in the class,
|
||||
preserving configuration with traitlets:
|
||||
|
||||
```python
from traitlets import default
from jupyterhub.spawner import Spawner

class MySpawner(Spawner):
    @default("ip")
    def _default_ip(self):
        return '0.0.0.0'

    @default("port")
    def _default_port(self):
        return 8888

    async def start(self):
        env = self.get_env()
        cmd = []
        # get jupyterhub command to run,
        # typically ['jupyterhub-singleuser']
        cmd.extend(self.cmd)
        cmd.extend(self.get_args())

        remote_server_info = await self._actually_start_server_somehow(cmd, env)
        url = self.get_public_url_from(remote_server_info)
        return url
```
|
||||
|
||||
#### Exception handling
|
||||
|
||||
When `Spawner.start` raises an Exception, a message can be passed on to the user via the exception using a `.jupyterhub_html_message` or `.jupyterhub_message` attribute.
|
||||
|
||||
When the Exception has a `.jupyterhub_html_message` attribute, it will be rendered as HTML to the user.
|
||||
|
||||
Alternatively `.jupyterhub_message` is rendered as unformatted text.
|
||||
|
||||
If neither attribute is present, the Exception will be shown to the user as unformatted text.
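
The attribute convention can be sketched as follows. `QuotaExceeded` is a hypothetical exception a custom Spawner might raise, and `message_for` is an illustrative helper showing one reading of the precedence described above (HTML message first, then plain message, then the exception text); it is not the Hub's rendering code.

```python
class QuotaExceeded(Exception):
    """Hypothetical error raised from a custom Spawner.start."""

    # Plain-text message to be shown to the user
    jupyterhub_message = "You have used all of your compute quota."


def message_for(exc):
    """Pick (message, is_html) for an exception raised during spawn."""
    html_message = getattr(exc, "jupyterhub_html_message", None)
    if html_message:
        return html_message, True
    plain = getattr(exc, "jupyterhub_message", None)
    if plain:
        return plain, False
    # Neither attribute is set: fall back to the exception text
    return str(exc), False
```

Setting the attribute on the class keeps the user-facing text separate from the log message passed to the exception constructor.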
|
||||
|
||||
### Spawner.poll
|
||||
|
||||
`Spawner.poll` checks if the spawner is still running.
|
||||
It should return `None` if it is still running,
|
||||
and an integer exit status otherwise.
|
||||
|
||||
In the case of local processes, `Spawner.poll` uses `os.kill(PID, 0)`
|
||||
to check if the local process is still running. On Windows, it uses `psutil.pid_exists`.
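
The `os.kill(PID, 0)` technique can be sketched for POSIX systems as below. The helper names are hypothetical, and because signal 0 cannot reveal a real exit code, the sketch returns a placeholder status of `0` once the process is gone.

```python
import os


def pid_is_running(pid):
    """Check for a local process without actually signalling it (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def poll(pid):
    """Return None while the process is running, else an integer status."""
    # The true exit code is unknown with this technique; 0 is a placeholder
    return None if pid_is_running(pid) else 0
```

A Spawner that manages remote jobs or containers would replace the PID check with a query to its own backend.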
|
||||
|
||||
### Spawner.stop
|
||||
|
||||
`Spawner.stop` should stop the process. It must be a tornado coroutine, which should return when the process has finished exiting.
|
||||
|
||||
## Spawner state
|
||||
|
||||
JupyterHub should be able to stop and restart without tearing down
|
||||
single-user notebook servers. To do this task, a Spawner may need to persist
|
||||
some information that can be restored later.
|
||||
A JSON-able dictionary of state can be used to store persisted information.
|
||||
|
||||
Unlike start, stop, and poll methods, the state methods must not be coroutines.
|
||||
|
||||
In the case of single processes, the Spawner state is only the process ID of the server:
|
||||
|
||||
```python
def get_state(self):
    """get the current state"""
    state = super().get_state()
    if self.pid:
        state['pid'] = self.pid
    return state

def load_state(self, state):
    """load state from the database"""
    super().load_state(state)
    if 'pid' in state:
        self.pid = state['pid']

def clear_state(self):
    """clear any state (called after shutdown)"""
    super().clear_state()
    self.pid = 0
```
|
||||
|
||||
## Spawner options form
|
||||
|
||||
(new in 0.4)
|
||||
|
||||
Some deployments may want to offer options to users to influence how their servers are started.
|
||||
This may include cluster-based deployments, where users specify what resources should be available,
|
||||
or docker-based deployments where users can select from a list of base images.
|
||||
|
||||
This feature is enabled by setting `Spawner.options_form`, which is an HTML form snippet
|
||||
inserted unmodified into the spawn form.
|
||||
If the `Spawner.options_form` is defined, when a user tries to start their server, they will be directed to a form page, like this:
|
||||
|
||||

|
||||
|
||||
If `Spawner.options_form` is undefined, the user's server is spawned directly, and no spawn page is rendered.
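
As an illustration, a form snippet could look like the sketch below. The field names (`integer`, `text`, `select`) are chosen to line up with the `options_from_form` example on this page; the labels and option values are hypothetical.

```python
# Hypothetical HTML snippet; only the inputs are provided, since the
# snippet is inserted into the existing spawn form
OPTIONS_FORM = """
<label>Number of cores</label>
<input name="integer" type="number" value="1" />
<label>Free-form note</label>
<input name="text" type="text" />
<label>Extras</label>
<select name="select" multiple="true">
  <option value="a">Option A</option>
  <option value="b">Option B</option>
</select>
"""

# In jupyterhub_config.py this would be assigned as:
#   c.Spawner.options_form = OPTIONS_FORM
```

Each `name` attribute becomes a key in the submitted form data.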
|
||||
|
||||
See [this example](https://github.com/jupyterhub/jupyterhub/blob/HEAD/examples/spawn-form/jupyterhub_config.py) for a form that allows custom CLI args for the local spawner.
|
||||
|
||||
### `Spawner.options_from_form`
|
||||
|
||||
Options from this form will always be a dictionary of lists of strings, e.g.:
|
||||
|
||||
```python
{
    'integer': ['5'],
    'text': ['some text'],
    'select': ['a', 'b'],
}
```
|
||||
|
||||
When `formdata` arrives, it is passed through `Spawner.options_from_form(formdata)`,
|
||||
which is a method to turn the form data into the correct structure.
|
||||
This method must return a dictionary, and is meant to interpret the lists-of-strings into the correct types. For example, the `options_from_form` for the above form would look like:
|
||||
|
||||
```python
def options_from_form(self, formdata):
    options = {}
    options['integer'] = int(formdata['integer'][0]) # single integer value
    options['text'] = formdata['text'][0] # single string value
    options['select'] = formdata['select'] # list already correct
    options['notinform'] = 'extra info' # not in the form at all
    return options
```
|
||||
|
||||
which would return:
|
||||
|
||||
```python
{
    'integer': 5,
    'text': 'some text',
    'select': ['a', 'b'],
    'notinform': 'extra info',
}
```
|
||||
|
||||
When `Spawner.start` is called, this dictionary is accessible as `self.user_options`.
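Because every value arrives as a list of strings, it can help to validate the input while converting it. The sketch below is one possible shape for such a conversion, written as a standalone function so it can be tried without a running Hub. The field names `image` and `cpu`, the allowed values, and the bounds are assumptions for illustration, not part of JupyterHub.

```python
ALLOWED_IMAGES = {"minimal", "datascience"}  # hypothetical choices


def parse_spawn_options(formdata):
    """Convert a dict of lists of strings into typed, validated options."""
    image = formdata.get("image", ["minimal"])[0]
    if image not in ALLOWED_IMAGES:
        raise ValueError(f"unknown image: {image!r}")

    # int() raises ValueError on non-numeric input
    cpu = int(formdata.get("cpu", ["1"])[0])
    if not 1 <= cpu <= 4:
        raise ValueError(f"cpu must be between 1 and 4, got {cpu}")

    return {"image": image, "cpu": cpu}
```

A spawner subclass could call a helper like this from its own `options_from_form` method and later read the result from `self.user_options`.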
|
||||
|
||||
[spawner]: https://github.com/jupyterhub/jupyterhub/blob/HEAD/jupyterhub/spawner.py
|
||||
|
||||
## Writing a custom spawner
|
||||
|
||||
If you are interested in building a custom spawner, you can read [this tutorial](https://jupyterhub-tutorial.readthedocs.io/en/latest/spawners.html).
|
||||
|
||||
### Registering custom Spawners via entry points
|
||||
|
||||
As of JupyterHub 1.0, custom Spawners can register themselves via
|
||||
the `jupyterhub.spawners` entry point metadata.
|
||||
To do this, in your `setup.py` add:
|
||||
|
||||
```python
|
||||
setup(
|
||||
...
|
||||
entry_points={
|
||||
'jupyterhub.spawners': [
|
||||
'myservice = mypackage:MySpawner',
|
||||
],
|
||||
},
|
||||
)
|
||||
```
|
||||
|
||||
If you have added this metadata to your package,
|
||||
users can select your spawner with the configuration:
|
||||
|
||||
```python
|
||||
c.JupyterHub.spawner_class = 'myservice'
|
||||
```
|
||||
|
||||
instead of the full
|
||||
|
||||
```python
|
||||
c.JupyterHub.spawner_class = 'mypackage:MySpawner'
|
||||
```
|
||||
|
||||
previously required.
|
||||
Additionally, configurable attributes for your spawner will
|
||||
appear in jupyterhub help output and auto-generated configuration files
|
||||
via `jupyterhub --generate-config`.
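To check which names are registered, the entry point group can be inspected with the standard library. This is a small sketch, not something JupyterHub requires; the helper name `registered_spawners` is made up for this example, and the result depends on which packages are installed in the current environment.

```python
from importlib.metadata import entry_points


def registered_spawners():
    """Return {name: 'module:Class'} for the jupyterhub.spawners group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points
        group = eps.select(group="jupyterhub.spawners")
    else:
        # Python 3.8/3.9: plain dict keyed by group name
        group = eps.get("jupyterhub.spawners", [])
    return {ep.name: ep.value for ep in group}


if __name__ == "__main__":
    for name, target in sorted(registered_spawners().items()):
        print(f"{name} -> {target}")
```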
|
||||
|
||||
## Environment variables and command-line arguments
|
||||
|
||||
Spawners mainly do one thing: launch a command in an environment.
|
||||
|
||||
The command-line is constructed from user configuration:
|
||||
|
||||
- `Spawner.cmd` (default: `['jupyterhub-singleuser']`)
|
||||
- `Spawner.args` (CLI args to pass to the cmd, default: empty)
|
||||
|
||||
where the configuration:
|
||||
|
||||
```python
|
||||
c.Spawner.cmd = ["my-singleuser-wrapper"]
|
||||
c.Spawner.args = ["--debug", "--flag"]
|
||||
```
|
||||
|
||||
would result in spawning the command:
|
||||
|
||||
```bash
|
||||
my-singleuser-wrapper --debug --flag
|
||||
```
|
||||
|
||||
The `Spawner.get_args()` method is how `Spawner.args` is accessed,
|
||||
and can be used by Spawners to customize/extend user-provided arguments.
|
||||
|
||||
Prior to 2.0, JupyterHub unconditionally added certain options _if specified_ to the command-line,
|
||||
such as `--ip={Spawner.ip}` and `--port={Spawner.port}`.
|
||||
These have now all been moved to environment variables,
|
||||
and from JupyterHub 2.0,
|
||||
the command-line launched by JupyterHub is fully specified by overridable configuration `Spawner.cmd + Spawner.args`.
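The composition `Spawner.cmd + Spawner.args` is plain list concatenation. The following sketch reproduces it outside JupyterHub with the example values from above; `build_command` is a hypothetical helper name, not a JupyterHub function.

```python
import shlex


def build_command(cmd, args):
    """Join the command and its arguments, mirroring Spawner.cmd + Spawner.args."""
    return list(cmd) + list(args)


argv = build_command(["my-singleuser-wrapper"], ["--debug", "--flag"])
print(shlex.join(argv))  # shell-quoted form of the argument list
```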
|
||||
|
||||
Most process configuration is passed via environment variables.
|
||||
Additional variables can be specified via the `Spawner.environment` configuration.
|
||||
|
||||
The process environment is returned by `Spawner.get_env`, which specifies the following environment variables:
|
||||
|
||||
- `JUPYTERHUB_SERVICE_URL` - the _bind_ URL where the server should launch its HTTP server (`http://127.0.0.1:12345`).
|
||||
This includes `Spawner.ip` and `Spawner.port`; _new in 2.0. Prior to 2.0, the IP and port were passed on the command-line, and only if specified_
|
||||
- `JUPYTERHUB_SERVICE_PREFIX` - the URL prefix the service will run on (e.g. `/user/name/`)
|
||||
- `JUPYTERHUB_USER` - the JupyterHub user's username
|
||||
- `JUPYTERHUB_SERVER_NAME` - the server's name, if using named servers (default server has an empty name)
|
||||
- `JUPYTERHUB_API_URL` - the full URL for the JupyterHub API (e.g. `http://127.0.0.1:8081/hub/api`)
|
||||
- `JUPYTERHUB_BASE_URL` - the base URL of the whole jupyterhub deployment, i.e. the bit before `hub/` or `user/`,
|
||||
as set by `c.JupyterHub.base_url` (default: `/`)
|
||||
- `JUPYTERHUB_API_TOKEN` - the API token the server can use to make requests to the Hub.
|
||||
This is also the OAuth client secret.
|
||||
- `JUPYTERHUB_CLIENT_ID` - the OAuth client ID for authenticating visitors.
|
||||
- `JUPYTERHUB_OAUTH_CALLBACK_URL` - the callback URL to use in OAuth, typically `/user/:name/oauth_callback`
|
||||
- `JUPYTERHUB_OAUTH_ACCESS_SCOPES` - the scopes required to access the server (called `JUPYTERHUB_OAUTH_SCOPES` prior to 3.0)
|
||||
- `JUPYTERHUB_OAUTH_CLIENT_ALLOWED_SCOPES` - the scopes the service is allowed to request.
|
||||
If no scopes are requested explicitly, these scopes will be requested.
|
||||
|
||||
Optional environment variables, depending on configuration:
|
||||
|
||||
- `JUPYTERHUB_SSL_[KEYFILE|CERTFILE|CLIENT_CA]` - SSL configuration, when `internal_ssl` is enabled
|
||||
- `JUPYTERHUB_ROOT_DIR` - the root directory of the server (notebook directory), when `Spawner.notebook_dir` is defined (new in 2.0)
|
||||
- `JUPYTERHUB_DEFAULT_URL` - the default URL for the server (for redirects from `/user/:name/`),
|
||||
if `Spawner.default_url` is defined
|
||||
(new in 2.0, previously passed via CLI)
|
||||
- `JUPYTERHUB_DEBUG=1` - generic debug flag, sets maximum log level when `Spawner.debug` is True
|
||||
(new in 2.0, previously passed via CLI)
|
||||
- `JUPYTERHUB_DISABLE_USER_CONFIG=1` - disable loading user config,
|
||||
when `Spawner.disable_user_config` is True (new in 2.0,
|
||||
previously passed via CLI)
|
||||
|
||||
- `[MEM|CPU]_[LIMIT|GUARANTEE]` (i.e. `MEM_LIMIT`, `MEM_GUARANTEE`, `CPU_LIMIT`, `CPU_GUARANTEE`) - the values of CPU and memory limits and guarantees.
|
||||
These are not expected to be enforced by the process,
|
||||
but are made available as a hint,
|
||||
e.g. for resource monitoring extensions.
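From the point of view of the launched process, the variables listed above are ordinary environment entries. The sketch below shows one way a process might read a few of them; the function name `read_hub_environment` and the shape of the returned dictionary are assumptions for illustration, and only the variable names come from the list above.

```python
import os
from urllib.parse import urlparse


def read_hub_environment(env=None):
    """Collect a few JUPYTERHUB_* settings from an environment mapping."""
    env = os.environ if env is None else env
    # The bind URL carries both the IP and the port to listen on.
    bind = urlparse(env["JUPYTERHUB_SERVICE_URL"])
    return {
        "ip": bind.hostname,
        "port": bind.port,
        "prefix": env.get("JUPYTERHUB_SERVICE_PREFIX", "/"),
        "user": env.get("JUPYTERHUB_USER", ""),
        # Optional variables are simply absent when not configured.
        "debug": env.get("JUPYTERHUB_DEBUG") == "1",
        "default_url": env.get("JUPYTERHUB_DEFAULT_URL"),
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the helper easy to exercise in isolation.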
|
||||
|
||||
## Spawners, resource limits, and guarantees (Optional)
|
||||
|
||||
Some spawners of the single-user notebook servers allow setting limits or
|
||||
guarantees on resources, such as CPU and memory. To provide a consistent
|
||||
experience for sysadmins and users, we provide a standard way to set and
|
||||
discover these resource limits and guarantees.
|
||||
For the limits and guarantees to be useful, **the spawner must implement
|
||||
support for them**. For example, `LocalProcessSpawner`, the default
|
||||
spawner, does not support limits and guarantees. One of the spawners
|
||||
that supports limits and guarantees is the
|
||||
[`systemdspawner`](https://github.com/jupyterhub/systemdspawner).
|
||||
|
||||
### Memory Limits & Guarantees
|
||||
|
||||
`c.Spawner.mem_limit`: A **limit** specifies the _maximum amount of memory_
|
||||
that may be allocated, though there is no promise that the maximum amount will
|
||||
be available. In supported spawners, you can set `c.Spawner.mem_limit` to
|
||||
limit the total amount of memory that a single-user notebook server can
|
||||
allocate. Attempting to use more memory than this limit will cause errors. The
|
||||
single-user notebook server can discover its own memory limit by looking at
|
||||
the environment variable `MEM_LIMIT`, which is specified in absolute bytes.
|
||||
|
||||
`c.Spawner.mem_guarantee`: Sometimes, a **guarantee** of a _minimum amount of
|
||||
memory_ is desirable. In this case, you can set `c.Spawner.mem_guarantee` to
|
||||
provide a guarantee that at minimum this much memory will always be
|
||||
available for the single-user notebook server to use. The environment variable
|
||||
`MEM_GUARANTEE` will also be set in the single-user notebook server.
|
||||
|
||||
**The spawner's underlying system or cluster is responsible for enforcing these
|
||||
limits and providing these guarantees.** If these values are set to `None`, no
|
||||
limits or guarantees are provided, and no environment values are set.
|
||||
|
||||
### CPU Limits & Guarantees
|
||||
|
||||
`c.Spawner.cpu_limit`: In supported spawners, you can set
|
||||
`c.Spawner.cpu_limit` to limit the total number of cpu-cores that a
|
||||
single-user notebook server can use. These can be fractional - `0.5` means 50%
|
||||
of one CPU core, `4.0` is 4 CPU-cores, etc. This value is also set in the
|
||||
single-user notebook server's environment variable `CPU_LIMIT`. The limit does
|
||||
not claim that you will be able to use all the CPU up to your limit as other
|
||||
higher priority applications might be taking up CPU.
|
||||
|
||||
`c.Spawner.cpu_guarantee`: You can set `c.Spawner.cpu_guarantee` to provide a
|
||||
guarantee for CPU usage. The environment variable `CPU_GUARANTEE` will be set
|
||||
in the single-user notebook server when a guarantee is being provided.
|
||||
|
||||
**The spawner's underlying system or cluster is responsible for enforcing these
|
||||
limits and providing these guarantees.** If these values are set to `None`, no
|
||||
limits or guarantees are provided, and no environment values are set.
|
||||
|
||||
### Encryption
|
||||
|
||||
Communication between the `Proxy`, `Hub`, and `Notebook` can be secured by
|
||||
turning on `internal_ssl` in `jupyterhub_config.py`. For a custom spawner to
|
||||
utilize these certs, there are two methods of interest on the base `Spawner`
|
||||
class: `.create_certs` and `.move_certs`.
|
||||
|
||||
The first method, `.create_certs`, will sign a key-cert pair using an internally
|
||||
trusted authority for notebooks. During this process, `.create_certs` can
|
||||
apply `ip` and `dns` name information to the cert via an `alt_names` `kwarg`.
|
||||
This is used for certificate authentication (verification). Without proper
|
||||
verification, the `Notebook` will be unable to communicate with the `Hub` and
|
||||
vice versa when `internal_ssl` is enabled. For example, given a deployment
|
||||
using the `DockerSpawner` which will start containers with `ips` from the
|
||||
`docker` subnet pool, the `DockerSpawner` would need to instead choose a
|
||||
container `ip` prior to starting and pass that to `.create_certs`.
|
||||
|
||||
In general though, this method will not need to be changed and the default
|
||||
`ip`/`dns` (localhost) info will suffice.
|
||||
|
||||
When `.create_certs` is run, it will create the certificates in a default,
|
||||
central location specified by `c.JupyterHub.internal_certs_location`. For
|
||||
`Spawners` that need access to these certs elsewhere (e.g. on another host
|
||||
altogether), the `.move_certs` method can be overridden to move the certs
|
||||
appropriately. Again, using `DockerSpawner` as an example, this would entail
|
||||
moving certs to a directory that will get mounted into the container this
|
||||
spawner starts.
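The relocation step can be sketched as a plain function. The example assumes that the certificate paths are handed over as a dictionary with the keys `keyfile`, `certfile`, and `cafile`, and that a dictionary of the new paths is expected back; treat that shape, the helper name `relocate_certs`, and the destination directory as assumptions to verify against the `Spawner` source before relying on them.

```python
import os
import shutil


def relocate_certs(paths, dest_dir):
    """Copy key, cert, and CA files into dest_dir and return the new paths.

    `paths` is assumed to map 'keyfile', 'certfile', and 'cafile' to files
    readable by the Hub.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = {}
    for kind, src in paths.items():
        dest = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy(src, dest)
        os.chmod(dest, 0o600)  # keep private key material unreadable by others
        moved[kind] = dest
    return moved
```

An overridden `.move_certs` could delegate to a helper like this, with `dest_dir` pointing at a directory that is later mounted into the container.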
|
132
docs/source/reference/technical-reference/technical-overview.md
Normal file
@@ -0,0 +1,132 @@
|
||||
# Technical Overview
|
||||
|
||||
The **Technical Overview** section gives you a high-level view of:
|
||||
|
||||
- JupyterHub's major Subsystems: Hub, Proxy, Single-User Notebook Server
|
||||
- how the subsystems interact
|
||||
- the process from JupyterHub access to user login
|
||||
- JupyterHub's default behavior
|
||||
- customizing JupyterHub
|
||||
|
||||
The goal of this section is to share a deeper technical understanding of
|
||||
JupyterHub and how it works.
|
||||
|
||||
## The Major Subsystems: Hub, Proxy, Single-User Notebook Server
|
||||
|
||||
JupyterHub is a set of processes that together provide a single-user Jupyter
|
||||
Notebook server for each person in a group. Three subsystems are started
|
||||
by the `jupyterhub` command line program:
|
||||
|
||||
- **Hub** (Python/Tornado): manages user accounts, authentication, and
|
||||
coordinates Single User Notebook Servers using a [Spawner](spawners-reference).
|
||||
|
||||
- **Proxy**: the public-facing part of JupyterHub that uses a dynamic proxy
|
||||
to route HTTP requests to the Hub and Single User Notebook Servers.
|
||||
[configurable http proxy](https://github.com/jupyterhub/configurable-http-proxy)
|
||||
(node-http-proxy) is the default proxy.
|
||||
|
||||
- **Single-User Notebook Server** (Python/Tornado): a dedicated,
|
||||
single-user, Jupyter Notebook server is started for each user on the system
|
||||
when the user logs in. The object that starts the single-user notebook
|
||||
servers is called a **[Spawner](spawners-reference)**.
|
||||
|
||||

|
||||
|
||||
## How the Subsystems Interact
|
||||
|
||||
Users access JupyterHub through a web browser, by going to the IP address or
|
||||
the domain name of the server.
|
||||
|
||||
The basic principles of operation are:
|
||||
|
||||
- The Hub spawns the proxy (in the default JupyterHub configuration)
|
||||
- The proxy forwards all requests to the Hub by default
|
||||
- The Hub handles login and spawns single-user notebook servers on demand
|
||||
- The Hub configures the proxy to forward URL prefixes to single-user notebook
|
||||
servers
|
||||
|
||||
The proxy is the only process that listens on a public interface. The Hub sits
|
||||
behind the proxy at `/hub`. Single-user servers sit behind the proxy at
|
||||
`/user/[username]`.
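The routing rules above amount to a prefix lookup in which the most specific prefix wins and everything else falls through to the Hub. The sketch below illustrates that idea only; it is not the code of configurable-http-proxy, and the addresses and the username `alice` are made up.

```python
def route(path, routes):
    """Return the target for the longest route prefix matching `path`."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        raise LookupError(f"no route for {path}")
    return routes[max(matches, key=len)]


# Hypothetical routing table; the addresses and the username are made up.
ROUTES = {
    "/": "http://127.0.0.1:8081",               # default route: the Hub
    "/user/alice/": "http://127.0.0.1:50001",   # alice's single-user server
}
```

With this table, `/hub/login` and `/user/bob/` both resolve to the Hub, while `/user/alice/lab` resolves to the single-user server once its route has been added.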
|
||||
|
||||
Different **[authenticators](authenticators-reference)** control access
|
||||
to JupyterHub. The default one, [PAM](https://en.wikipedia.org/wiki/Pluggable_authentication_module), uses the user accounts on the server where
|
||||
JupyterHub is running. If you use this, you will need to create a user account
|
||||
on the system for each user on your team. However, using other authenticators you can
|
||||
allow users to sign in with e.g. a GitHub account, or with any single-sign-on
|
||||
system your organization has.
|
||||
|
||||
Next, **[spawners](spawners-reference)** control how JupyterHub starts
|
||||
the individual notebook server for each user. The default spawner will
|
||||
start a notebook server on the same machine running under their system username.
|
||||
The other main option is to start each server in a separate container, often using [Docker](https://jupyterhub-dockerspawner.readthedocs.io/en/latest/).
|
||||
|
||||
## The Process from JupyterHub Access to User Login
|
||||
|
||||
When a user accesses JupyterHub, the following events take place:
|
||||
|
||||
- Login data is handed to the [Authenticator](authenticators-reference) instance for
|
||||
validation
|
||||
- The Authenticator returns the username if the login information is valid
|
||||
- A single-user notebook server instance is [spawned](spawners-reference) for the
|
||||
logged-in user
|
||||
- When the single-user notebook server starts, the proxy is notified to forward
|
||||
requests made to `/user/[username]/*` to the single-user notebook server.
|
||||
- A [cookie](https://en.wikipedia.org/wiki/HTTP_cookie) is set on `/hub/`, containing an encrypted token. (Prior to version
|
||||
0.8, a cookie for `/user/[username]` was used too.)
|
||||
- The browser is redirected to `/user/[username]`, and the request is handled by
|
||||
the single-user notebook server.
|
||||
|
||||
How does the single-user server identify the user with the Hub via OAuth?
|
||||
|
||||
- On request, the single-user server checks a cookie
|
||||
- If no cookie is set, the single-user server redirects to the Hub for verification via OAuth
|
||||
- After verification at the Hub, the browser is redirected back to the
|
||||
single-user server
|
||||
- The token is verified and stored in a cookie
|
||||
- If no user is identified, the browser is redirected back to `/hub/login`
|
||||
|
||||
## Default Behavior
|
||||
|
||||
By default, the **Proxy** listens on all public interfaces on port 8000.
|
||||
Thus you can reach JupyterHub through either:
|
||||
|
||||
- `http://localhost:8000`
|
||||
- or any other public IP or domain pointing to your system.
|
||||
|
||||
In their default configuration, the other services, the **Hub** and
|
||||
**Single-User Notebook Servers**, all communicate with each other on localhost
|
||||
only.
|
||||
|
||||
By default, starting JupyterHub will write two files to disk in the current
|
||||
working directory:
|
||||
|
||||
- `jupyterhub.sqlite` is the SQLite database containing all of the state of the
|
||||
**Hub**. This file allows the **Hub** to remember which users are running and
|
||||
where, as well as storing other information enabling you to restart parts of
|
||||
JupyterHub separately. It is important to note that this database contains
|
||||
**no** sensitive information other than **Hub** usernames.
|
||||
- `jupyterhub_cookie_secret` is the encryption key used for securing cookies.
|
||||
This file needs to persist so that a **Hub** server restart will avoid
|
||||
invalidating cookies. Conversely, deleting this file and restarting the server
|
||||
effectively invalidates all login cookies. The cookie secret file is discussed
|
||||
in the [Cookie Secret section of the Security Settings document](security-basics).
|
||||
|
||||
The location of these files can be specified via configuration settings. It is
|
||||
recommended that these files be stored in standard UNIX filesystem locations,
|
||||
such as `/etc/jupyterhub` for all configuration files and `/srv/jupyterhub` for
|
||||
all security and runtime files.
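A minimal sketch of such a layout, assuming the runtime files live under `/srv/jupyterhub`: the helper `apply_file_locations` and the constant `RUNTIME_DIR` are names invented for this example, while `cookie_secret_file` and `db_url` are the JupyterHub settings being assigned.

```python
from pathlib import PurePosixPath

RUNTIME_DIR = PurePosixPath("/srv/jupyterhub")  # assumed location for runtime files


def apply_file_locations(c):
    """Point the cookie secret and the SQLite database at RUNTIME_DIR."""
    c.JupyterHub.cookie_secret_file = str(RUNTIME_DIR / "jupyterhub_cookie_secret")
    # An absolute SQLite path needs a fourth slash after the scheme.
    c.JupyterHub.db_url = f"sqlite:///{RUNTIME_DIR / 'jupyterhub.sqlite'}"
```

In `jupyterhub_config.py`, the same two assignments can be written directly against the `c` object.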
|
||||
|
||||
## Customizing JupyterHub
|
||||
|
||||
There are two basic extension points for JupyterHub:
|
||||
|
||||
- How users are authenticated by [Authenticators](authenticators-reference)
|
||||
- How users' single-user notebook server processes are started by
|
||||
[Spawners](spawners-reference)
|
||||
|
||||
Each is governed by a customizable class, and JupyterHub ships with basic
|
||||
defaults for each.
|
||||
|
||||
To enable custom authentication and/or spawning, subclass `Authenticator` or
|
||||
`Spawner`, and override the relevant methods.
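As a rough illustration of the subclassing pattern, the sketch below overrides `authenticate` so that it returns the username for a valid login and `None` otherwise. The class name, the fixed user set, and the shared password are hypothetical and for demonstration only; a real deployment should never hard-code credentials. The import fallback exists solely so the sketch can run without JupyterHub installed.

```python
import asyncio

try:
    from jupyterhub.auth import Authenticator
except ImportError:
    # Fallback so the sketch can be read and run without JupyterHub installed.
    Authenticator = object


class AllowListDemoAuthenticator(Authenticator):
    """Demo only: accepts a fixed set of usernames with one shared password."""

    demo_users = {"alice", "bob"}          # hypothetical accounts
    demo_password = "not-a-real-secret"    # never hard-code secrets in practice

    async def authenticate(self, handler, data):
        # `data` carries the submitted login form fields.
        username = data.get("username", "")
        if username in self.demo_users and data.get("password") == self.demo_password:
            return username  # a valid login returns the username
        return None  # anything else is rejected


if __name__ == "__main__":
    login = {"username": "alice", "password": "not-a-real-secret"}
    print(asyncio.run(AllowListDemoAuthenticator().authenticate(None, login)))
```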
|