mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-17 23:13:00 +00:00
Merge branch 'jupyterhub:main' into server
@@ -6,7 +6,7 @@ We use different channels of communication for different purposes. Whichever one
|
||||
|
||||
We use [Discourse](https://discourse.jupyter.org) for online discussions and support questions. Everyone in the Jupyter community is welcome to bring ideas and questions there.
|
||||
|
||||
All our past and current discussions on Discourse are archived and searchable. This is why we recommend you first go to Discourse, so that discussions remain useful and accessible to the whole community.
|
||||
We recommend that you first use our Discourse as all past and current discussions on it are archived and searchable. Thus, all discussions remain useful and accessible to the whole community.
|
||||
|
||||
## Gitter
|
||||
|
||||
@@ -14,10 +14,12 @@ We use [our Gitter channel](https://gitter.im/jupyterhub/jupyterhub) for online,
|
||||
|
||||
## GitHub Issues
|
||||
|
||||
Github issues are used for most long-form project discussions, bug reports and feature requests.
|
||||
[GitHub issues](https://docs.github.com/en/issues/tracking-your-work-with-issues/about-issues) are used for most long-form project discussions, bug reports and feature requests.
|
||||
|
||||
Issues related to a specific authenticator or spawner should be opened in the appropriate repository for the authenticator or spawner. If you are using a specific JupyterHub distribution (such as [Zero to JupyterHub on Kubernetes](http://github.com/jupyterhub/zero-to-jupyterhub-k8s) or [The Littlest JupyterHub](http://github.com/jupyterhub/the-littlest-jupyterhub/)), you should open issues directly in their repository.
|
||||
- Issues related to a specific authenticator or spawner should be opened in the appropriate repository for the authenticator or spawner.
|
||||
- If you are using a specific JupyterHub distribution (such as [Zero to JupyterHub on Kubernetes](http://github.com/jupyterhub/zero-to-jupyterhub-k8s) or [The Littlest JupyterHub](http://github.com/jupyterhub/the-littlest-jupyterhub/)), you should open issues directly in their repository.
|
||||
- If you cannot find a repository to open your issue in, do not worry! Open the issue in the [main JupyterHub repository](https://github.com/jupyterhub/jupyterhub/) and our community will help you figure it out.
|
||||
|
||||
If you cannot find a repository to open your issue in, do not worry! Open the issue in the [main JupyterHub repository](https://github.com/jupyterhub/jupyterhub/) and our community will help you figure it out.
|
||||
|
||||
**NOTE**: Our community is distributed across the world in various timezones, so please be patient if you do not get a response immediately!
|
||||
```{note}
|
||||
Our community is distributed across the world in various timezones, so please be patient if you do not get a response immediately!
|
||||
```
|
||||
|
@@ -4,7 +4,7 @@ This roadmap collects "next steps" for JupyterHub. It is about creating a
|
||||
shared understanding of the project's vision and direction amongst
|
||||
the community of users, contributors, and maintainers.
|
||||
The goal is to communicate priorities and upcoming release plans.
|
||||
It is not a aimed at limiting contributions to what is listed here.
|
||||
It is not aimed at limiting contributions to what is listed here.
|
||||
|
||||
## Using the roadmap
|
||||
|
||||
|
@@ -1,6 +1,6 @@
|
||||
# Authentication and User Basics
|
||||
|
||||
The default Authenticator uses [PAM][] to authenticate system users with
|
||||
The default Authenticator uses [PAM][] (Pluggable Authentication Modules) to authenticate system users with
|
||||
their username and password. With the default Authenticator, any user
|
||||
with an account and password on the system will be allowed to login.
|
||||
|
||||
@@ -25,7 +25,7 @@ If this configuration value is not set, then **all authenticated users will be a
|
||||
```{note}
|
||||
As of JupyterHub 2.0, the full permissions of `admin_users`
|
||||
should not be required.
|
||||
Instead, you can assign [roles](https://jupyterhub.readthedocs.io/en/stable/rbac/roles.html#define-role-target) to users or groups
|
||||
Instead, you can assign [roles](define-role-target) to users or groups
|
||||
with only the scopes they require.
|
||||
```
|
||||
|
||||
@@ -42,7 +42,7 @@ c.Authenticator.admin_users = {'mal', 'zoe'}
|
||||
Users in the admin set are automatically added to the user `allowed_users` set,
|
||||
if they are not already present.
|
||||
|
||||
Each authenticator may have different ways of determining whether a user is an
|
||||
Each Authenticator may have different ways of determining whether a user is an
|
||||
administrator. By default, JupyterHub uses the PAMAuthenticator which provides the
|
||||
`admin_groups` option and can set administrator status based on a user
|
||||
group. For example, we can let any user in the `wheel` group be an admin:
|
||||
@@ -76,7 +76,7 @@ fresh.
|
||||
|
||||
## Use LocalAuthenticator to create system users
|
||||
|
||||
The `LocalAuthenticator` is a special kind of authenticator that has
|
||||
The `LocalAuthenticator` is a special kind of Authenticator that has
|
||||
the ability to manage users on the local system. When you try to add a
|
||||
new user to the Hub, a `LocalAuthenticator` will check if the user
|
||||
already exists. If you set the configuration value, `create_system_users`,
|
||||
@@ -118,8 +118,8 @@ with any provider, is also available.
|
||||
|
||||
## Use DummyAuthenticator for testing
|
||||
|
||||
The `DummyAuthenticator` is a simple authenticator that
|
||||
allows for any username/password unless a global password has been set. If
|
||||
The `DummyAuthenticator` is a simple Authenticator that
|
||||
allows for any username or password unless a global password has been set. If
|
||||
set, it will allow for any username as long as the correct password is provided.
|
||||
To set a global password, add this to the config file:
|
||||
|
||||
|
@@ -78,7 +78,7 @@ gives administrators more control over their setup and hardware.
|
||||
|
||||
Because JupyterHub is an open-source, community-driven tool, it can be extended and
|
||||
modified to fit an institution's needs. It plays nicely with the open source data science
|
||||
stack, and can serve a variety of computing enviroments, user interfaces, and
|
||||
stack, and can serve a variety of computing environments, user interfaces, and
|
||||
computational hardware. It can also be deployed anywhere - on enterprise cloud infrastructure, on
|
||||
High-Performance-Computing machines, on local hardware, or even on a single laptop, which
|
||||
is not possible with most other tools for shared interactive computing.
|
||||
|
@@ -5,8 +5,8 @@ Security settings
|
||||
|
||||
You should not run JupyterHub without SSL encryption on a public network.
|
||||
|
||||
Security is the most important aspect of configuring Jupyter. Three
|
||||
configuration settings are the main aspects of security configuration:
|
||||
Security is the most important aspect of configuring Jupyter.
|
||||
Three (3) configuration settings are the main aspects of security configuration:
|
||||
|
||||
1. :ref:`SSL encryption <ssl-encryption>` (to enable HTTPS)
|
||||
2. :ref:`Cookie secret <cookie-secret>` (a key for encrypting browser cookies)
|
||||
@@ -15,7 +15,7 @@ configuration settings are the main aspects of security configuration:
|
||||
|
||||
The Hub hashes all secrets (e.g., auth tokens) before storing them in its
|
||||
database. A loss of control over read-access to the database should have
|
||||
minimal impact on your deployment; if your database has been compromised, it
|
||||
minimal impact on your deployment. If your database has been compromised, it
|
||||
is still a good idea to revoke existing tokens.
|
||||
|
||||
.. _ssl-encryption:
|
||||
@@ -31,7 +31,7 @@ Using an SSL certificate
|
||||
|
||||
This will require you to obtain an official, trusted SSL certificate or create a
|
||||
self-signed certificate. Once you have obtained and installed a key and
|
||||
certificate you need to specify their locations in the ``jupyterhub_config.py``
|
||||
certificate, you need to specify their locations in the ``jupyterhub_config.py``
|
||||
configuration file as follows:
|
||||
|
||||
.. code-block:: python
|
||||
@@ -72,13 +72,13 @@ would be the needed configuration:
|
||||
If SSL termination happens outside of the Hub
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In certain cases, for example if the hub is running behind a reverse proxy, and
|
||||
In certain cases, for example, if the hub is running behind a reverse proxy, and
|
||||
`SSL termination is being provided by NGINX <https://www.nginx.com/resources/admin-guide/nginx-ssl-termination/>`_,
|
||||
it is reasonable to run the hub without SSL.
|
||||
|
||||
To achieve this, simply omit the configuration settings
|
||||
``c.JupyterHub.ssl_key`` and ``c.JupyterHub.ssl_cert``
|
||||
(setting them to ``None`` does not have the same effect, and is an error).
|
||||
(setting them to ``None`` does not have the same effect, and instead results in an error).
|
||||
|
||||
.. _authentication-token:
|
||||
|
||||
@@ -92,7 +92,7 @@ use an auth token.
|
||||
|
||||
The value of this token should be a random string (for example, generated by
|
||||
``openssl rand -hex 32``). You can store it in the configuration file or an
|
||||
environment variable
|
||||
environment variable.
|
||||
|
||||
Generating and storing token in the configuration file
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
@@ -119,7 +119,7 @@ Default if token is not set
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
If you don't set the Proxy authentication token, the Hub will generate a random
|
||||
key itself, which means that any time you restart the Hub you **must also
|
||||
key itself. This means that any time you restart the Hub, you **must also
|
||||
restart the Proxy**. If the proxy is a subprocess of the Hub, this should happen
|
||||
automatically (this is the default configuration).
|
||||
|
||||
@@ -128,7 +128,7 @@ automatically (this is the default configuration).
|
||||
Cookie secret
|
||||
-------------
|
||||
|
||||
The cookie secret is an encryption key, used to encrypt the browser cookies
|
||||
The cookie secret is an encryption key, used to encrypt the browser cookies,
|
||||
which are used for authentication. Three common methods are described for
|
||||
generating and configuring the cookie secret.
|
||||
|
||||
@@ -136,8 +136,8 @@ Generating and storing as a cookie secret file
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The cookie secret should be 32 random bytes, encoded as hex, and is typically
|
||||
stored in a ``jupyterhub_cookie_secret`` file. An example command to generate the
|
||||
``jupyterhub_cookie_secret`` file is:
|
||||
stored in a ``jupyterhub_cookie_secret`` file. Below is an example command to generate the
|
||||
``jupyterhub_cookie_secret`` file:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
@@ -155,7 +155,7 @@ The location of the ``jupyterhub_cookie_secret`` file can be specified in the
|
||||
|
||||
If the cookie secret file doesn't exist when the Hub starts, a new cookie
|
||||
secret is generated and stored in the file. The file must not be readable by
|
||||
``group`` or ``other`` or the server won't start. The recommended permissions
|
||||
``group`` or ``other``, otherwise the server won't start. The recommended permissions
|
||||
for the cookie secret file are ``600`` (owner-only rw).
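
As a rough illustration of the two requirements above (32 random bytes encoded as hex, and a file readable only by its owner), here is a minimal Python sketch. The function name ``write_cookie_secret`` is hypothetical and not part of JupyterHub; the Hub can also generate this file itself, as described above.

```python
import os
import secrets


def write_cookie_secret(path):
    """Create `path` holding 32 random bytes as hex, with owner-only permissions.

    Hypothetical helper for illustration; not a JupyterHub API.
    """
    secret_hex = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    # O_EXCL refuses to overwrite an existing secret; 0o600 is owner-only rw
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secret_hex)
    return secret_hex
```

Creating the file with mode ``0o600`` at open time, rather than changing permissions afterwards, avoids a window in which ``group`` or ``other`` could read the secret.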
|
||||
|
||||
Generating and storing as an environment variable
|
||||
@@ -176,8 +176,8 @@ the Hub starts.
|
||||
Generating and storing as a binary string
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You can also set the cookie secret in the configuration file
|
||||
itself, ``jupyterhub_config.py``, as a binary string:
|
||||
You can also set the cookie secret, as a binary string,
|
||||
in the configuration file (``jupyterhub_config.py``) itself:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
@@ -198,7 +198,7 @@ jupyterhub-hub-login
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This is the login token used when visiting Hub-served pages that are
|
||||
protected by authentication such as the main home, the spawn form, etc.
|
||||
protected by authentication, such as the main home, the spawn form, etc.
|
||||
If this cookie is set, then the user is logged in.
|
||||
|
||||
Resetting the Hub cookie secret effectively revokes this cookie.
|
||||
@@ -209,7 +209,7 @@ jupyterhub-user-<username>
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This is the cookie used for authenticating with a single-user server.
|
||||
It is set by the single-user server after OAuth with the Hub.
|
||||
It is set by the single-user server, after OAuth with the Hub.
|
||||
|
||||
Effectively the same as ``jupyterhub-hub-login``, but for the
|
||||
single-user server instead of the Hub. It contains an OAuth access token,
|
||||
@@ -218,14 +218,13 @@ which is checked with the Hub to authenticate the browser.
|
||||
Each OAuth access token is associated with a session id (see ``jupyterhub-session-id`` section
|
||||
below).
|
||||
|
||||
To avoid hitting the Hub on every request, the authentication response
|
||||
is cached. And to avoid a stale cache the cache key is comprised of both
|
||||
the token and session id.
|
||||
To avoid hitting the Hub on every request, the authentication response is cached.
|
||||
The cache key is comprised of both the token and session id, to avoid a stale cache.
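
The idea of keying the cache on both values can be sketched as follows. This is an illustrative model only, not JupyterHub's actual implementation: the class name ``AuthResponseCache``, its methods, and the 300-second default are assumptions made for the example.

```python
import time


class AuthResponseCache:
    """Illustrative cache of authentication responses keyed by (token, session id)."""

    def __init__(self, max_age=300, clock=time.monotonic):
        self.max_age = max_age  # seconds; illustrative value
        self._clock = clock
        self._entries = {}

    def get(self, token, session_id):
        """Return the cached response, or None if it is missing or too old."""
        entry = self._entries.get((token, session_id))
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at > self.max_age:
            # Stale entry: drop it so the caller asks the Hub again
            del self._entries[(token, session_id)]
            return None
        return response

    def set(self, token, session_id, response):
        """Store a response together with the time it was cached."""
        self._entries[(token, session_id)] = (self._clock(), response)
```

In this sketch, a request that carries the same token but a different session id misses the cache, so it is checked with the Hub again instead of reusing an older answer.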
|
||||
|
||||
Resetting the Hub cookie secret effectively revokes this cookie.
|
||||
|
||||
This cookie is restricted to the path ``/user/<username>``, so that
|
||||
only the user’s server receives it.
|
||||
This cookie is restricted to the path ``/user/<username>``,
|
||||
to ensure that only the user’s server receives it.
|
||||
|
||||
jupyterhub-session-id
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
@@ -235,7 +234,7 @@ shared by the Hub and single-user servers.
|
||||
|
||||
Its sole purpose is to coordinate logout of the multiple OAuth cookies.
|
||||
|
||||
This cookie is set to ``/`` so all endpoints can receive it, or clear it, etc.
|
||||
This cookie is set to ``/`` so all endpoints can receive it, clear it, etc.
|
||||
|
||||
jupyterhub-user-<username>-oauth-state
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
@@ -245,7 +244,7 @@ It is only set while OAuth between the single-user server and the Hub
|
||||
is processing.
|
||||
|
||||
If you use your browser development tools, you should see this cookie
|
||||
for a very brief moment before your are logged in,
|
||||
for a very brief moment before you are logged in,
|
||||
with an expiration date shorter than ``jupyterhub-hub-login`` or
|
||||
``jupyterhub-user-<username>``.
|
||||
|
||||
|
@@ -1,5 +1,3 @@
|
||||
(roles)=
|
||||
|
||||
# Roles
|
||||
|
||||
JupyterHub provides four (4) roles that are available by default:
|
||||
|
@@ -9,12 +9,12 @@ To determine which scopes a role should have, one can follow these steps:
|
||||
5. Customize the scopes with filters if needed
|
||||
6. Define the role with required scopes and assign to users/services/groups/tokens
|
||||
|
||||
Below, different use cases are presented on how to use the RBAC framework.
|
||||
Below, different use cases are presented on how to use the [RBAC framework](./index.md).
|
||||
|
||||
## Service to cull idle servers
|
||||
|
||||
Finding and shutting down idle servers can save a lot of computational resources.
|
||||
We can make use of [jupyterhub-idle-culler](https://github.com/jupyterhub/jupyterhub-idle-culler) to manage this for us.
|
||||
**We can make use of [jupyterhub-idle-culler](https://github.com/jupyterhub/jupyterhub-idle-culler) to manage this for us.**
|
||||
Below follows a short tutorial on how to add a cull-idle service in the RBAC system.
|
||||
|
||||
1. Install the cull-idle server script with `pip install jupyterhub-idle-culler`.
|
||||
|
@@ -6,10 +6,10 @@ Only do this if you are very sure you must.
|
||||
|
||||
## Overview
|
||||
|
||||
There are many Authenticators and Spawners available for JupyterHub. Some, such
|
||||
as DockerSpawner or OAuthenticator, do not need any elevated permissions. This
|
||||
There are many [Authenticators](./authenticators-users-basics) and [Spawners](./spawners-basics) available for JupyterHub. Some, such
|
||||
as [DockerSpawner](https://github.com/jupyterhub/dockerspawner) or [OAuthenticator](https://github.com/jupyterhub/oauthenticator), do not need any elevated permissions. This
|
||||
document describes how to get the full default behavior of JupyterHub while
|
||||
running notebook servers as real system users on a shared system without
|
||||
running notebook servers as real system users on a shared system, without
|
||||
running the Hub itself as root.
|
||||
|
||||
Since JupyterHub needs to spawn processes as other users, the simplest way
|
||||
@@ -90,7 +90,7 @@ $ adduser -G jupyterhub newuser
|
||||
Test that the new user doesn't need to enter a password to run the sudospawner
|
||||
command.
|
||||
|
||||
This should prompt for your password to switch to rhea, but _not_ prompt for
|
||||
This should prompt for your password to switch to `rhea`, but _not_ prompt for
|
||||
any password for the second switch. It should show some help output about
|
||||
logging options:
|
||||
|
||||
@@ -119,7 +119,7 @@ the shadow password database.
|
||||
|
||||
### Shadow group (Linux)
|
||||
|
||||
**Note:** On Fedora based distributions there is no clear way to configure
|
||||
**Note:** On [Fedora-based distributions](https://fedoraproject.org/wiki/List_of_Fedora_remixes), there is no clear way to configure
|
||||
the PAM database to allow sufficient access for authenticating with the target user's password
|
||||
from JupyterHub. As a workaround, we recommend using an
|
||||
[alternative authentication method](https://github.com/jupyterhub/jupyterhub/wiki/Authenticators).
|
||||
@@ -150,7 +150,7 @@ We want our new user to be able to read the shadow passwords, so add it to the s
|
||||
$ sudo usermod -a -G shadow rhea
|
||||
```
|
||||
|
||||
If you want jupyterhub to serve pages on a restricted port (such as port 80 for http),
|
||||
If you want jupyterhub to serve pages on a restricted port (such as port 80 for HTTP),
|
||||
then you will need to give `node` permission to do so:
|
||||
|
||||
```bash
|
||||
@@ -226,7 +226,7 @@ And try logging in.
|
||||
## Troubleshooting: SELinux
|
||||
|
||||
If you still get a generic `Permission denied` `PermissionError`, it's possible SELinux is blocking you.
|
||||
Here's how you can make a module to allow this.
|
||||
Here's how you can make a module to resolve this.
|
||||
First, put this in a file named `sudo_exec_selinux.te`:
|
||||
|
||||
```bash
|
||||
@@ -253,6 +253,6 @@ $ semodule -i sudo_exec_selinux.pp
|
||||
## Troubleshooting: PAM session errors
|
||||
|
||||
If the PAM authentication doesn't work and you see errors for
|
||||
`login:session-auth`, or similar, considering updating to a more recent version
|
||||
`login:session-auth`, or similar, consider updating to a more recent version
|
||||
of jupyterhub and disabling the opening of PAM sessions with
|
||||
`c.PAMAuthenticator.open_sessions=False`.
|
||||
|
@@ -1,26 +1,26 @@
|
||||
# JupyterHub and OAuth
|
||||
|
||||
JupyterHub uses OAuth 2 internally as a mechanism for authenticating users.
|
||||
JupyterHub uses [OAuth 2](https://oauth.net/2/) as an internal mechanism for authenticating users.
|
||||
As such, JupyterHub itself always functions as an OAuth **provider**.
|
||||
More on what that means [below](oauth-terms).
|
||||
You can find out more about what that means [below](oauth-terms).
|
||||
|
||||
Additionally, JupyterHub is _often_ deployed with [oauthenticator](https://oauthenticator.readthedocs.io),
|
||||
Additionally, JupyterHub is _often_ deployed with [OAuthenticator](https://oauthenticator.readthedocs.io),
|
||||
where an external identity provider, such as GitHub or KeyCloak, is used to authenticate users.
|
||||
When this is the case, there are _two_ nested oauth flows:
|
||||
an _internal_ oauth flow where JupyterHub is the **provider**,
|
||||
and and _external_ oauth flow, where JupyterHub is a **client**.
|
||||
When this is the case, there are _two_ nested OAuth flows:
|
||||
an _internal_ OAuth flow where JupyterHub is the **provider**,
|
||||
and an _external_ OAuth flow, where JupyterHub is the **client**.
|
||||
|
||||
This means that when you are using JupyterHub, there is always _at least one_ and often two layers of OAuth involved in a user logging in and accessing their server.
|
||||
|
||||
Some relevant points:
|
||||
The following points are noteworthy:
|
||||
|
||||
- Single-user servers _never_ need to communicate with or be aware of the upstream provider configured in your Authenticator.
|
||||
As far as they are concerned, only JupyterHub is an OAuth provider,
|
||||
As far as the servers are concerned, only JupyterHub is an OAuth provider,
|
||||
and how users authenticate with the Hub itself is irrelevant.
|
||||
- When talking to a single-user server,
|
||||
- When interacting with a single-user server,
|
||||
there are ~always two tokens:
|
||||
a token issued to the server itself to communicate with the Hub API,
|
||||
and a second per-user token in the browser to represent the completed login process and authorized permissions.
|
||||
first, a token issued to the server itself to communicate with the Hub API,
|
||||
and second, a per-user token in the browser to represent the completed login process and authorized permissions.
|
||||
More on this [later](two-tokens).
|
||||
|
||||
(oauth-terms)=
|
||||
@@ -28,64 +28,64 @@ Some relevant points:
|
||||
## Key OAuth terms
|
||||
|
||||
Here are some key definitions to keep in mind when we are talking about OAuth.
|
||||
You can also read more detail [here](https://www.oauth.com/oauth2-servers/definitions/).
|
||||
You can also read about them in more detail [here](https://www.oauth.com/oauth2-servers/definitions/).
|
||||
|
||||
- **provider**: The entity responsible for managing identity and authorization,
|
||||
- **provider**: The entity responsible for managing identity and authorization;
|
||||
always a web server.
|
||||
JupyterHub is _always_ an oauth provider for JupyterHub's components.
|
||||
When OAuthenticator is used, an external service, such as GitHub or KeyCloak, is also an oauth provider.
|
||||
- **client**: An entity that requests OAuth **tokens** on a user's behalf,
|
||||
JupyterHub is _always_ an OAuth provider for JupyterHub's components.
|
||||
When OAuthenticator is used, an external service, such as GitHub or KeyCloak, is also an OAuth provider.
|
||||
- **client**: An entity that requests OAuth **tokens** on a user's behalf;
|
||||
generally a web server of some kind.
|
||||
OAuth **clients** are services that _delegate_ authentication and/or authorization
|
||||
to an OAuth **provider**.
|
||||
JupyterHub _services_ or single-user _servers_ are OAuth **clients** of the JupyterHub **provider**.
|
||||
When OAuthenticator is used, JupyterHub is itself _also_ an OAuth **client** for the external oauth **provider**, e.g. GitHub.
|
||||
When OAuthenticator is used, JupyterHub is itself _also_ an OAuth **client** for the external OAuth **provider**, e.g. GitHub.
|
||||
- **browser**: A user's web browser, which makes requests and stores things like cookies.
|
||||
- **token**: The secret value used to represent a user's authorization. This is the final product of the OAuth process.
|
||||
- **code**: A short-lived temporary secret that the **client** exchanges
|
||||
for a **token** at the conclusion of oauth,
|
||||
in what's generally called the "oauth callback handler."
|
||||
for a **token** at the conclusion of OAuth,
|
||||
in what's generally called the "OAuth callback handler."
|
||||
|
||||
## One OAuth flow
|
||||
|
||||
OAuth **flow** is what we call the sequence of HTTP requests involved in authenticating a user and issuing a token, ultimately used for authorized access to a service or single-user server.
|
||||
OAuth **flow** is what we call the sequence of HTTP requests involved in authenticating a user and issuing a token, ultimately used for authorizing access to a service or single-user server.
|
||||
|
||||
A single oauth flow generally goes like this:
|
||||
A single OAuth flow typically goes like this:
|
||||
|
||||
### OAuth request and redirect
|
||||
|
||||
1. A **browser** makes an HTTP request to an oauth **client**.
|
||||
2. There are no credentials, so the client _redirects_ the browser to an "authorize" page on the oauth **provider** with some extra information:
|
||||
- the oauth **client id** of the client itself.
|
||||
- the **redirect uri** to be redirected back to after completion.
|
||||
1. A **browser** makes an HTTP request to an OAuth **client**.
|
||||
2. There are no credentials, so the client _redirects_ the browser to an "authorize" page on the OAuth **provider** with some extra information:
|
||||
- the OAuth **client ID** of the client itself.
|
||||
- the **redirect URI** to be redirected back to after completion.
|
||||
- the **scopes** requested, which the user should be presented with to confirm.
|
||||
This is the "X would like to be able to Y on your behalf. Allow this?" page you see on all the "Login with ..." pages around the Internet.
|
||||
3. During this authorize step,
|
||||
the browser must be _authenticated_ with the provider.
|
||||
This is often already stored in a cookie,
|
||||
but if not the provider webapp must begin its _own_ authentication process before serving the authorization page.
|
||||
This _may_ even begin another oauth flow!
|
||||
This _may_ even begin another OAuth flow!
|
||||
4. After the user tells the provider that they want to proceed with the authorization,
|
||||
the provider records this authorization in a short-lived record called an **oauth code**.
|
||||
5. Finally, the oauth provider redirects the browser _back_ to the oauth client's "redirect uri"
|
||||
(or "oauth callback uri"),
|
||||
with the oauth code in a url parameter.
|
||||
the provider records this authorization in a short-lived record called an **OAuth code**.
|
||||
5. Finally, the OAuth provider redirects the browser _back_ to the OAuth client's "redirect URI"
|
||||
(or "OAuth callback URI"),
|
||||
with the OAuth code in a URL parameter.
|
||||
|
||||
That's the end of the requests made between the **browser** and the **provider**.
|
||||
That marks the end of the requests made between the **browser** and the **provider**.
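
To make step 2 concrete, here is a minimal sketch of how a client might assemble the "authorize" redirect. The query parameter names follow the standard OAuth 2 authorization code flow; the endpoint URL, client ID, redirect URI, and scope name are hypothetical values chosen for illustration, and `build_authorize_url` is not a JupyterHub function.

```python
from urllib.parse import urlencode


def build_authorize_url(authorize_endpoint, client_id, redirect_uri, scopes):
    """Return the URL the client redirects the browser to (step 2 above)."""
    query = urlencode(
        {
            "response_type": "code",  # ask the provider for an OAuth code
            "client_id": client_id,  # identifies the client itself
            "redirect_uri": redirect_uri,  # where the browser is sent back to
            "scope": " ".join(scopes),  # permissions the user is asked to confirm
        }
    )
    return f"{authorize_endpoint}?{query}"


# Hypothetical values, for illustration only
url = build_authorize_url(
    "https://hub.example.org/hub/api/oauth2/authorize",
    client_id="jupyterhub-user-rhea",
    redirect_uri="/user/rhea/oauth_callback",
    scopes=["access:servers"],
)
print(url)
```

Real clients typically add further parameters (for example, a value to recover the original URL after the callback); they are left out here to keep the sketch close to the steps listed above.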
|
||||
|
||||
### State after redirect
|
||||
|
||||
At this point:
|
||||
|
||||
- The browser is authenticated with the _provider_.
|
||||
- The user's authorized permissions are recorded in an _oauth code_.
|
||||
- The _provider_ knows that the given oauth client's requested permissions have been granted, but the client doesn't know this yet.
|
||||
- All requests so far have been made directly by the browser.
|
||||
No requests have originated at the client or provider.
|
||||
- The user's authorized permissions are recorded in an _OAuth code_.
|
||||
- The _provider_ knows that the permissions requested by the OAuth client have been granted, but the client doesn't know this yet.
|
||||
- All the requests so far have been made directly by the browser.
|
||||
No requests have originated from the client or provider.
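
As a small illustration of this state, the only thing the client has received so far is the callback request, with the OAuth code in a URL parameter. The sketch below shows one way to read it; `extract_oauth_code` and the example URL are hypothetical and not part of JupyterHub.

```python
from urllib.parse import parse_qs, urlsplit


def extract_oauth_code(callback_url):
    """Return the OAuth code carried in the callback URL's query string."""
    params = parse_qs(urlsplit(callback_url).query)
    codes = params.get("code", [])
    if len(codes) != 1:
        # A missing or repeated code means the callback cannot be trusted
        raise ValueError("expected exactly one 'code' parameter")
    return codes[0]


# Hypothetical callback request, for illustration only
code = extract_oauth_code(
    "https://hub.example.org/user/rhea/oauth_callback?code=abc123"
)
```

The code on its own grants nothing; the client still has to exchange it for a token with the provider.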
|
||||
|
||||
### OAuth Client Handles Callback Request
|
||||
|
||||
Now we get to finish the OAuth process.
|
||||
At this stage, we get to finish the OAuth process.
|
||||
Let's dig into what the OAuth client does when it handles
|
||||
the OAuth callback request.
|
||||
|
||||
@@ -95,12 +95,12 @@ the OAuth callback request.
|
||||
makes a second API request to the _provider_
|
||||
to retrieve information about the owner of the token (the user).
|
||||
This is the step where behavior diverges for different OAuth providers.
|
||||
Up to this point, all oauth providers are the same, following the oauth specification.
|
||||
However, oauth does not define a standard for exchanging tokens for information about their owner or permissions ([OpenID Connect](https://openid.net/connect/) does that),
|
||||
Up to this point, all OAuth providers are the same, following the OAuth specification.
|
||||
However, OAuth does not define a standard for exchanging tokens for information about their owner or permissions ([OpenID Connect](https://openid.net/connect/) does that),
|
||||
so this step may be different for each OAuth provider.
|
||||
- Finally, the oauth client stores its own record that the user is authorized in a cookie.
|
||||
- Finally, the OAuth client stores its own record that the user is authorized in a cookie.
|
||||
This could be the token itself, or any other appropriate representation of successful authentication.
|
||||
- Last of all, now that credentials have been established,
|
||||
- Now that credentials have been established,
|
||||
the browser can be redirected to the _original_ URL where it started,
|
||||
to try the request again.
|
||||
If the client wasn't able to keep track of the original URL all this time
|
||||
@@ -114,7 +114,7 @@ So that's _one_ OAuth process.
|
||||
## Full sequence of OAuth in JupyterHub
|
||||
|
||||
Let's go through the above OAuth process in JupyterHub,
|
||||
with specific examples of each HTTP request and what information is contained.
|
||||
with specific examples of each HTTP request and what information it contains.
|
||||
For bonus points, we are using the double-OAuth example of JupyterHub configured with GitHubOAuthenticator.
|
||||
|
||||
To disambiguate, we will call the OAuth process where JupyterHub is the **provider** "internal OAuth,"
|
||||
@@ -184,7 +184,7 @@ The first:
|
||||
|
||||
- JupyterHub->GitHub
|
||||
- `POST https://github.com/login/oauth/access_token`
|
||||
- request made with oauth **code** from url parameter
|
||||
- request made with OAuth **code** from URL parameter
|
||||
- response includes an access **token**
|
||||
|
||||
The second:
|
||||
@@ -271,15 +271,15 @@ To handle this, OAuth tokens and the various places they are stored can _expire_
|
||||
which should have the same effect as no credentials,
|
||||
and trigger the authorization process again.
|
||||
|
||||
In JupyterHub's internal oauth, we have these layers of information that can go stale:
|
||||
In JupyterHub's internal OAuth, we have these layers of information that can go stale:
|
||||
|
||||
- The oauth client has a **cache** of Hub responses for tokens,
|
||||
- The OAuth client has a **cache** of Hub responses for tokens,
|
||||
so it doesn't need to make API requests to the Hub for every request it receives.
|
||||
This cache has an expiry of five minutes by default,
|
||||
and is governed by the configuration `HubAuth.cache_max_age` in the single-user server.
|
||||
- The internal oauth token is stored in a cookie, which has its own expiry (default: 14 days),
|
||||
- The internal OAuth token is stored in a cookie, which has its own expiry (default: 14 days),
|
||||
governed by `JupyterHub.cookie_max_age_days`.
|
||||
- The internal oauth token can also itself expire,
|
||||
- The internal OAuth token itself can also expire,
|
||||
which is by default the same as the cookie expiry,
|
||||
since it makes sense for the token itself and the place it is stored to expire at the same time.
|
||||
This is governed by `JupyterHub.cookie_max_age_days` first,
|
||||
@@ -317,9 +317,9 @@ triggering the external login process anew before letting a user proceed.
|
||||
- If the token has expired, but is still in the cookie:
|
||||
when the token response cache expires,
|
||||
the next time the server asks the hub about the token,
|
||||
no user will be identified and the internal oauth process begins again.
|
||||
no user will be identified and the internal OAuth process begins again.
|
||||
- If the token _cookie_ expires, the next browser request will be made with no credentials,
|
||||
and the internal oauth process will begin again.
|
||||
and the internal OAuth process will begin again.
|
||||
This will usually have the form of a transparent redirect browsers won't notice.
|
||||
However, if this occurs on an API request in a long-lived page visit
|
||||
such as a JupyterLab session, the API request may fail and require
|
||||
@@ -352,7 +352,7 @@ Logging out of JupyterHub means clearing and revoking many of these credentials:
|
||||
### A tale of two tokens
|
||||
|
||||
**TODO**: discuss API token issued to server at startup ($JUPYTERHUB_API_TOKEN)
|
||||
and oauth-issued token in the cookie,
|
||||
and OAuth-issued token in the cookie,
|
||||
and some details of how JupyterLab currently deals with that.
|
||||
They are different, and JupyterLab should be making requests using the token from the cookie,
|
||||
not the token from the server,
|
||||
|
@@ -2,7 +2,7 @@
|
||||
|
||||
## Background
|
||||
|
||||
The thing which users directly connect to is the proxy, by default
|
||||
The thing which users directly connect to is the proxy, which by default is
|
||||
`configurable-http-proxy`. The proxy either redirects users to the
|
||||
hub (for login and managing servers), or to their own single-user
|
||||
servers. Thus, as long as the proxy stays running, access to existing
|
||||
@@ -10,16 +10,15 @@ servers continues, even if the hub itself restarts or goes down.
|
||||
|
||||
When you first configure the hub, you may not even realize this
|
||||
because the proxy is automatically managed by the hub. This is great
|
||||
for getting started and even most use, but everytime you restart the
|
||||
hub, all user connections also get restarted. But it's also simple to
|
||||
for getting started and even for most use cases, although every time you restart the
|
||||
hub, all user connections are also restarted. However, it is also simple to
|
||||
run the proxy as a service separate from the hub, so that you are free
|
||||
to reconfigure the hub while only interrupting users who are currently
|
||||
actively starting the hub.
|
||||
|
||||
The default JupyterHub proxy is
|
||||
[configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy),
|
||||
and that page has some docs. If you are using a different proxy, such
|
||||
as Traefik, these instructions are probably not relevant to you.
|
||||
[configurable-http-proxy](https://github.com/jupyterhub/configurable-http-proxy). If you are using a different proxy, such
|
||||
as [Traefik](https://github.com/traefik/traefik), these instructions are probably not relevant to you.
|
||||
|
||||
## Configuration options
|
||||
|
||||
@@ -40,9 +39,14 @@ set to the URL which the hub uses to connect _to the proxy's API_.
|
||||
## Proxy configuration
|
||||
|
||||
You need to configure a service to start the proxy. An example
|
||||
command line for this is `configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error`. (Details for how to
|
||||
do this is out of scope for this tutorial - for example it might be a
|
||||
systemd service on within another docker cotainer). The proxy has no
|
||||
command line argument for this is:
|
||||
|
||||
```bash
|
||||
$ configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error
|
||||
```
|
||||
|
||||
(Details on how to do this are out of the scope of this tutorial. For example, it might be a
|
||||
systemd service configured within another docker container). The proxy has no
|
||||
configuration files, all configuration is via the command line and
|
||||
environment variables.
|
||||
|
||||
@@ -57,9 +61,9 @@ match the token given to `c.ConfigurableHTTPProxy.auth_token`.
|
||||
|
||||
You should check the [configurable-http-proxy
|
||||
options](https://github.com/jupyterhub/configurable-http-proxy) to see
|
||||
what other options are needed, for example SSL options. Note that
|
||||
these are configured in the hub if the hub is starting the proxy - you
|
||||
need to move the options to here.
|
||||
what other options are needed, for example, SSL options. Note that
|
||||
these options are configured in the hub if the hub is starting the proxy, so you
|
||||
need to configure the options there.
|
||||
|
||||
## Docker image
|
||||
|
||||
|
@@ -161,7 +161,7 @@ your server again.
|
||||
|
||||
##### Proxy settings (403 GET)
|
||||
|
||||
When your whole JupyterHub sits behind an organization proxy (_not_ a reverse proxy like NGINX as part of your setup and _not_ the configurable-http-proxy) the environment variables `HTTP_PROXY`, `HTTPS_PROXY`, `http_proxy`, and `https_proxy` might be set. This confuses the Jupyterhub single-user servers: When connecting to the Hub for authorization they connect via the proxy instead of directly connecting to the Hub on localhost. The proxy might deny the request (403 GET). This results in the single-user server thinking it has the wrong auth token. To circumvent this you should add `<hub_url>,<hub_ip>,localhost,127.0.0.1` to the environment variables `NO_PROXY` and `no_proxy`.
|
||||
When your whole JupyterHub sits behind an organization proxy (_not_ a reverse proxy like NGINX as part of your setup and _not_ the configurable-http-proxy) the environment variables `HTTP_PROXY`, `HTTPS_PROXY`, `http_proxy`, and `https_proxy` might be set. This confuses the JupyterHub single-user servers: When connecting to the Hub for authorization they connect via the proxy instead of directly connecting to the Hub on localhost. The proxy might deny the request (403 GET). This results in the single-user server thinking it has the wrong auth token. To circumvent this you should add `<hub_url>,<hub_ip>,localhost,127.0.0.1` to the environment variables `NO_PROXY` and `no_proxy`.
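
As a minimal sketch of the workaround described above, the proxy exclusions could be set in the environment that launches the Hub and the single-user servers. The hostname `hub.example.org` and the address `10.0.0.5` are hypothetical placeholders for `<hub_url>` and `<hub_ip>`, not values from this documentation.

```bash
# Hypothetical placeholders: substitute your own Hub hostname and IP address.
HUB_HOST="hub.example.org"
HUB_IP="10.0.0.5"

# Append the Hub and loopback addresses to any existing exclusions,
# so requests to the Hub bypass the organization proxy.
export NO_PROXY="${NO_PROXY:+${NO_PROXY},}${HUB_HOST},${HUB_IP},localhost,127.0.0.1"

# Some tools read only the lowercase variable, so keep both in sync.
export no_proxy="${NO_PROXY}"
```

Where these variables need to be set depends on how the Hub and the single-user servers are started, so they may have to be passed through to the spawned environment as well.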
|
||||
|
||||
### Launching Jupyter Notebooks to run as an externally managed JupyterHub service with the `jupyterhub-singleuser` command returns a `JUPYTERHUB_API_TOKEN` error
|
||||
|
||||
@@ -324,8 +324,7 @@ Or use syslog:
|
||||
|
||||
### Toree integration with HDFS rack awareness script
|
||||
|
||||
The Apache Toree kernel will have an issue when running with JupyterHub if the standard HDFS
|
||||
rack awareness script is used. This will materialize in the logs as a repeated WARN:
|
||||
The Apache Toree kernel will have an issue when running with JupyterHub if the standard HDFS rack awareness script is used. This will materialize in the logs as a repeated WARN:
|
||||
|
||||
```bash
|
||||
16/11/29 16:24:20 WARN ScriptBasedMapping: Exception running /etc/hadoop/conf/topology_script.py some.ip.address
|
||||
|