mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-17 23:13:00 +00:00
Merge branch 'jupyterhub:main' into server
This commit is contained in:
@@ -6,10 +6,10 @@ Only do this if you are very sure you must.
|
||||
|
||||
## Overview
|
||||
|
||||
There are many Authenticators and Spawners available for JupyterHub. Some, such
|
||||
as DockerSpawner or OAuthenticator, do not need any elevated permissions. This
|
||||
There are many [Authenticators](./authenticators-users-basics) and [Spawners](./spawners-basics) available for JupyterHub. Some, such
|
||||
as [DockerSpawner](https://github.com/jupyterhub/dockerspawner) or [OAuthenticator](https://github.com/jupyterhub/oauthenticator), do not need any elevated permissions. This
|
||||
document describes how to get the full default behavior of JupyterHub while
|
||||
running notebook servers as real system users on a shared system without
|
||||
running notebook servers as real system users on a shared system, without
|
||||
running the Hub itself as root.
|
||||
|
||||
Since JupyterHub needs to spawn processes as other users, the simplest way
|
||||
@@ -90,7 +90,7 @@ $ adduser -G jupyterhub newuser
|
||||
Test that the new user doesn't need to enter a password to run the sudospawner
|
||||
command.
|
||||
|
||||
This should prompt for your password to switch to rhea, but _not_ prompt for
|
||||
This should prompt for your password to switch to `rhea`, but _not_ prompt for
|
||||
any password for the second switch. It should show some help output about
|
||||
logging options:
|
||||
|
||||
@@ -119,7 +119,7 @@ the shadow password database.
|
||||
|
||||
### Shadow group (Linux)
|
||||
|
||||
**Note:** On Fedora based distributions there is no clear way to configure
|
||||
**Note:** On [Fedora based distributions](https://fedoraproject.org/wiki/List_of_Fedora_remixes) there is no clear way to configure
|
||||
the PAM database to allow sufficient access for authenticating with the target user's password
|
||||
from JupyterHub. As a workaround we recommend use an
|
||||
[alternative authentication method](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
|
||||
@@ -150,7 +150,7 @@ We want our new user to be able to read the shadow passwords, so add it to the s
|
||||
$ sudo usermod -a -G shadow rhea
|
||||
```
|
||||
|
||||
If you want jupyterhub to serve pages on a restricted port (such as port 80 for http),
|
||||
If you want jupyterhub to serve pages on a restricted port (such as port 80 for HTTP),
|
||||
then you will need to give `node` permission to do so:
|
||||
|
||||
```bash
|
||||
@@ -226,7 +226,7 @@ And try logging in.
|
||||
## Troubleshooting: SELinux
|
||||
|
||||
If you still get a generic `Permission denied` `PermissionError`, it's possible SELinux is blocking you.
|
||||
Here's how you can make a module to allow this.
|
||||
Here's how you can make a module to resolve this.
|
||||
First, put this in a file named `sudo_exec_selinux.te`:
|
||||
|
||||
```bash
|
||||
@@ -253,6 +253,6 @@ $ semodule -i sudo_exec_selinux.pp
|
||||
## Troubleshooting: PAM session errors
|
||||
|
||||
If the PAM authentication doesn't work and you see errors for
|
||||
`login:session-auth`, or similar, considering updating to a more recent version
|
||||
`login:session-auth`, or similar, consider updating to a more recent version
|
||||
of jupyterhub and disabling the opening of PAM sessions with
|
||||
`c.PAMAuthenticator.open_sessions=False`.
|
||||
|
@@ -1,26 +1,26 @@
|
||||
# JupyterHub and OAuth
|
||||
|
||||
JupyterHub uses OAuth 2 internally as a mechanism for authenticating users.
|
||||
JupyterHub uses [OAuth 2](https://oauth.net/2/) as an internal mechanism for authenticating users.
|
||||
As such, JupyterHub itself always functions as an OAuth **provider**.
|
||||
More on what that means [below](oauth-terms).
|
||||
You can find out more about what that means [below](oauth-terms).
|
||||
|
||||
Additionally, JupyterHub is _often_ deployed with [oauthenticator](https://oauthenticator.readthedocs.io),
|
||||
Additionally, JupyterHub is _often_ deployed with [OAuthenticator](https://oauthenticator.readthedocs.io),
|
||||
where an external identity provider, such as GitHub or KeyCloak, is used to authenticate users.
|
||||
When this is the case, there are _two_ nested oauth flows:
|
||||
an _internal_ oauth flow where JupyterHub is the **provider**,
|
||||
and and _external_ oauth flow, where JupyterHub is a **client**.
|
||||
When this is the case, there are _two_ nested OAuth flows:
|
||||
an _internal_ OAuth flow where JupyterHub is the **provider**,
|
||||
and an _external_ OAuth flow, where JupyterHub is the **client**.
|
||||
|
||||
This means that when you are using JupyterHub, there is always _at least one_ and often two layers of OAuth involved in a user logging in and accessing their server.
|
||||
|
||||
Some relevant points:
|
||||
The following points are noteworthy:
|
||||
|
||||
- Single-user servers _never_ need to communicate with or be aware of the upstream provider configured in your Authenticator.
|
||||
As far as they are concerned, only JupyterHub is an OAuth provider,
|
||||
As far as the servers are concerned, only JupyterHub is an OAuth provider,
|
||||
and how users authenticate with the Hub itself is irrelevant.
|
||||
- When talking to a single-user server,
|
||||
- When interacting with a single-user server,
|
||||
there are ~always two tokens:
|
||||
a token issued to the server itself to communicate with the Hub API,
|
||||
and a second per-user token in the browser to represent the completed login process and authorized permissions.
|
||||
first, a token issued to the server itself to communicate with the Hub API,
|
||||
and second, a per-user token in the browser to represent the completed login process and authorized permissions.
|
||||
More on this [later](two-tokens).
|
||||
|
||||
(oauth-terms)=
|
||||
@@ -28,64 +28,64 @@ Some relevant points:
|
||||
## Key OAuth terms
|
||||
|
||||
Here are some key definitions to keep in mind when we are talking about OAuth.
|
||||
You can also read more detail [here](https://www.oauth.com/oauth2-servers/definitions/).
|
||||
You can also read more in detail [here](https://www.oauth.com/oauth2-servers/definitions/).
|
||||
|
||||
- **provider**: The entity responsible for managing identity and authorization,
|
||||
- **provider**: The entity responsible for managing identity and authorization;
|
||||
always a web server.
|
||||
JupyterHub is _always_ an oauth provider for JupyterHub's components.
|
||||
When OAuthenticator is used, an external service, such as GitHub or KeyCloak, is also an oauth provider.
|
||||
- **client**: An entity that requests OAuth **tokens** on a user's behalf,
|
||||
JupyterHub is _always_ an OAuth provider for JupyterHub's components.
|
||||
When OAuthenticator is used, an external service, such as GitHub or KeyCloak, is also an OAuth provider.
|
||||
- **client**: An entity that requests OAuth **tokens** on a user's behalf;
|
||||
generally a web server of some kind.
|
||||
OAuth **clients** are services that _delegate_ authentication and/or authorization
|
||||
to an OAuth **provider**.
|
||||
JupyterHub _services_ or single-user _servers_ are OAuth **clients** of the JupyterHub **provider**.
|
||||
When OAuthenticator is used, JupyterHub is itself _also_ an OAuth **client** for the external oauth **provider**, e.g. GitHub.
|
||||
When OAuthenticator is used, JupyterHub is itself _also_ an OAuth **client** for the external OAuth **provider**, e.g. GitHub.
|
||||
- **browser**: A user's web browser, which makes requests and stores things like cookies.
|
||||
- **token**: The secret value used to represent a user's authorization. This is the final product of the OAuth process.
|
||||
- **code**: A short-lived temporary secret that the **client** exchanges
|
||||
for a **token** at the conclusion of oauth,
|
||||
in what's generally called the "oauth callback handler."
|
||||
for a **token** at the conclusion of OAuth,
|
||||
in what's generally called the "OAuth callback handler."
|
||||
|
||||
## One oauth flow
|
||||
|
||||
OAuth **flow** is what we call the sequence of HTTP requests involved in authenticating a user and issuing a token, ultimately used for authorized access to a service or single-user server.
|
||||
OAuth **flow** is what we call the sequence of HTTP requests involved in authenticating a user and issuing a token, ultimately used for authorizing access to a service or single-user server.
|
||||
|
||||
A single oauth flow generally goes like this:
|
||||
A single OAuth flow typically goes like this:
|
||||
|
||||
### OAuth request and redirect
|
||||
|
||||
1. A **browser** makes an HTTP request to an oauth **client**.
|
||||
2. There are no credentials, so the client _redirects_ the browser to an "authorize" page on the oauth **provider** with some extra information:
|
||||
- the oauth **client id** of the client itself.
|
||||
- the **redirect uri** to be redirected back to after completion.
|
||||
1. A **browser** makes an HTTP request to an OAuth **client**.
|
||||
2. There are no credentials, so the client _redirects_ the browser to an "authorize" page on the OAuth **provider** with some extra information:
|
||||
- the OAuth **client ID** of the client itself.
|
||||
- the **redirect URI** to be redirected back to after completion.
|
||||
- the **scopes** requested, which the user should be presented with to confirm.
|
||||
This is the "X would like to be able to Y on your behalf. Allow this?" page you see on all the "Login with ..." pages around the Internet.
|
||||
3. During this authorize step,
|
||||
the browser must be _authenticated_ with the provider.
|
||||
This is often already stored in a cookie,
|
||||
but if not the provider webapp must begin its _own_ authentication process before serving the authorization page.
|
||||
This _may_ even begin another oauth flow!
|
||||
This _may_ even begin another OAuth flow!
|
||||
4. After the user tells the provider that they want to proceed with the authorization,
|
||||
the provider records this authorization in a short-lived record called an **oauth code**.
|
||||
5. Finally, the oauth provider redirects the browser _back_ to the oauth client's "redirect uri"
|
||||
(or "oauth callback uri"),
|
||||
with the oauth code in a url parameter.
|
||||
the provider records this authorization in a short-lived record called an **OAuth code**.
|
||||
5. Finally, the oauth provider redirects the browser _back_ to the oauth client's "redirect URI"
|
||||
(or "OAuth callback URI"),
|
||||
with the OAuth code in a URL parameter.
|
||||
|
||||
That's the end of the requests made between the **browser** and the **provider**.
|
||||
That marks the end of the requests made between the **browser** and the **provider**.
|
||||
|
||||
### State after redirect
|
||||
|
||||
At this point:
|
||||
|
||||
- The browser is authenticated with the _provider_.
|
||||
- The user's authorized permissions are recorded in an _oauth code_.
|
||||
- The _provider_ knows that the given oauth client's requested permissions have been granted, but the client doesn't know this yet.
|
||||
- All requests so far have been made directly by the browser.
|
||||
No requests have originated at the client or provider.
|
||||
- The user's authorized permissions are recorded in an _OAuth code_.
|
||||
- The _provider_ knows that the permissions requested by the OAuth client have been granted, but the client doesn't know this yet.
|
||||
- All the requests so far have been made directly by the browser.
|
||||
No requests have originated from the client or provider.
|
||||
|
||||
### OAuth Client Handles Callback Request
|
||||
|
||||
Now we get to finish the OAuth process.
|
||||
At this stage, we get to finish the OAuth process.
|
||||
Let's dig into what the OAuth client does when it handles
|
||||
the OAuth callback request.
|
||||
|
||||
@@ -95,12 +95,12 @@ the OAuth callback request.
|
||||
makes a second API request to the _provider_
|
||||
to retrieve information about the owner of the token (the user).
|
||||
This is the step where behavior diverges for different OAuth providers.
|
||||
Up to this point, all oauth providers are the same, following the oauth specification.
|
||||
However, oauth does not define a standard for exchanging tokens for information about their owner or permissions ([OpenID Connect](https://openid.net/connect/) does that),
|
||||
Up to this point, all OAuth providers are the same, following the OAuth specification.
|
||||
However, OAuth does not define a standard for issuing tokens in exchange for information about their owner or permissions ([OpenID Connect](https://openid.net/connect/) does that),
|
||||
so this step may be different for each OAuth provider.
|
||||
- Finally, the oauth client stores its own record that the user is authorized in a cookie.
|
||||
- Finally, the OAuth client stores its own record that the user is authorized in a cookie.
|
||||
This could be the token itself, or any other appropriate representation of successful authentication.
|
||||
- Last of all, now that credentials have been established,
|
||||
- Now that credentials have been established,
|
||||
the browser can be redirected to the _original_ URL where it started,
|
||||
to try the request again.
|
||||
If the client wasn't able to keep track of the original URL all this time
|
||||
@@ -114,7 +114,7 @@ So that's _one_ OAuth process.
|
||||
## Full sequence of OAuth in JupyterHub
|
||||
|
||||
Let's go through the above OAuth process in JupyterHub,
|
||||
with specific examples of each HTTP request and what information is contained.
|
||||
with specific examples of each HTTP request and what information it contains.
|
||||
For bonus points, we are using the double-OAuth example of JupyterHub configured with GitHubOAuthenticator.
|
||||
|
||||
To disambiguate, we will call the OAuth process where JupyterHub is the **provider** "internal OAuth,"
|
||||
@@ -184,7 +184,7 @@ The first:
|
||||
|
||||
- JupyterHub->GitHub
|
||||
- `POST https://github.com/login/oauth/access_token`
|
||||
- request made with oauth **code** from url parameter
|
||||
- request made with OAuth **code** from URL parameter
|
||||
- response includes an access **token**
|
||||
|
||||
The second:
|
||||
@@ -271,15 +271,15 @@ To handle this, OAuth tokens and the various places they are stored can _expire_
|
||||
which should have the same effect as no credentials,
|
||||
and trigger the authorization process again.
|
||||
|
||||
In JupyterHub's internal oauth, we have these layers of information that can go stale:
|
||||
In JupyterHub's internal OAuth, we have these layers of information that can go stale:
|
||||
|
||||
- The oauth client has a **cache** of Hub responses for tokens,
|
||||
- The OAuth client has a **cache** of Hub responses for tokens,
|
||||
so it doesn't need to make API requests to the Hub for every request it receives.
|
||||
This cache has an expiry of five minutes by default,
|
||||
and is governed by the configuration `HubAuth.cache_max_age` in the single-user server.
|
||||
- The internal oauth token is stored in a cookie, which has its own expiry (default: 14 days),
|
||||
- The internal OAuth token is stored in a cookie, which has its own expiry (default: 14 days),
|
||||
governed by `JupyterHub.cookie_max_age_days`.
|
||||
- The internal oauth token can also itself expire,
|
||||
- The internal OAuth token itself can also expire,
|
||||
which is by default the same as the cookie expiry,
|
||||
since it makes sense for the token itself and the place it is stored to expire at the same time.
|
||||
This is governed by `JupyterHub.cookie_max_age_days` first,
|
||||
@@ -317,9 +317,9 @@ triggering the external login process anew before letting a user proceed.
|
||||
- If the token has expired, but is still in the cookie:
|
||||
when the token response cache expires,
|
||||
the next time the server asks the hub about the token,
|
||||
no user will be identified and the internal oauth process begins again.
|
||||
no user will be identified and the internal OAuth process begins again.
|
||||
- If the token _cookie_ expires, the next browser request will be made with no credentials,
|
||||
and the internal oauth process will begin again.
|
||||
and the internal OAuth process will begin again.
|
||||
This will usually have the form of a transparent redirect browsers won't notice.
|
||||
However, if this occurs on an API request in a long-lived page visit
|
||||
such as a JupyterLab session, the API request may fail and require
|
||||
@@ -352,7 +352,7 @@ Logging out of JupyterHub means clearing and revoking many of these credentials:
|
||||
### A tale of two tokens
|
||||
|
||||
**TODO**: discuss API token issued to server at startup ($JUPYTERHUB_API_TOKEN)
|
||||
and oauth-issued token in the cookie,
|
||||
and OAuth-issued token in the cookie,
|
||||
and some details of how JupyterLab currently deals with that.
|
||||
They are different, and JupyterLab should be making requests using the token from the cookie,
|
||||
not the token from the server,
|
||||
|
@@ -2,7 +2,7 @@
|
||||
|
||||
## Background
|
||||
|
||||
The thing which users directly connect to is the proxy, by default
|
||||
The thing which users directly connect to is the proxy, which by default is
|
||||
`configurable-http-proxy`. The proxy either redirects users to the
|
||||
hub (for login and managing servers), or to their own single-user
|
||||
servers. Thus, as long as the proxy stays running, access to existing
|
||||
@@ -10,16 +10,15 @@ servers continues, even if the hub itself restarts or goes down.
|
||||
|
||||
When you first configure the hub, you may not even realize this
|
||||
because the proxy is automatically managed by the hub. This is great
|
||||
for getting started and even most use, but everytime you restart the
|
||||
hub, all user connections also get restarted. But it's also simple to
|
||||
for getting started and even most use-cases, although, everytime you restart the
|
||||
hub, all user connections are also restarted. However, it is also simple to
|
||||
run the proxy as a service separate from the hub, so that you are free
|
||||
to reconfigure the hub while only interrupting users who are currently
|
||||
actively starting the hub.
|
||||
|
||||
The default JupyterHub proxy is
|
||||
[configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy),
|
||||
and that page has some docs. If you are using a different proxy, such
|
||||
as Traefik, these instructions are probably not relevant to you.
|
||||
[configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy). If you are using a different proxy, such
|
||||
as [Traefik](https://github.com/traefik/traefik), these instructions are probably not relevant to you.
|
||||
|
||||
## Configuration options
|
||||
|
||||
@@ -40,9 +39,14 @@ set to the URL which the hub uses to connect _to the proxy's API_.
|
||||
## Proxy configuration
|
||||
|
||||
You need to configure a service to start the proxy. An example
|
||||
command line for this is `configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error`. (Details for how to
|
||||
do this is out of scope for this tutorial - for example it might be a
|
||||
systemd service on within another docker cotainer). The proxy has no
|
||||
command line argument for this is:
|
||||
|
||||
```bash
|
||||
$ configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error
|
||||
```
|
||||
|
||||
(Details on how to do this is out of the scope of this tutorial. For example, it might be a
|
||||
systemd service configured within another docker container). The proxy has no
|
||||
configuration files, all configuration is via the command line and
|
||||
environment variables.
|
||||
|
||||
@@ -57,9 +61,9 @@ match the token given to `c.ConfigurableHTTPProxy.auth_token`.
|
||||
|
||||
You should check the [configurable-http-proxy
|
||||
options](https://github.com/jupyterhub/configurable-http-proxy) to see
|
||||
what other options are needed, for example SSL options. Note that
|
||||
these are configured in the hub if the hub is starting the proxy - you
|
||||
need to move the options to here.
|
||||
what other options are needed, for example, SSL options. Note that
|
||||
these options are configured in the hub if the hub is starting the proxy, so you
|
||||
need to configure the options there.
|
||||
|
||||
## Docker image
|
||||
|
||||
|
Reference in New Issue
Block a user