Merge pull request #5056 from minrk/forced-login

add forced login example
Min RK
2025-04-29 08:52:36 +02:00
committed by GitHub
7 changed files with 372 additions and 1 deletions


@@ -0,0 +1,130 @@
# Logging users in via URL
Sometimes, JupyterHub is integrated into an existing application that has already handled user login, etc.
It is often preferable in these applications to link users to their running JupyterHub server without _prompting_ them to log in again with the Hub, since the Hub should really be an implementation detail
and not part of the user experience.
One way to do this has been to use [API only mode](#howto:api-only), issue tokens for users, and redirect users to a URL like `/user/name/?token=abc123`.
This is [disabled by default](#HubAuth.allow_token_in_url) in JupyterHub 5, because it lets users craft links that let _other_ users log in as them, which can lead to inter-user attacks.
But that leaves the question: how do I as an _application developer_ embedding JupyterHub link users to their own running server without triggering another login prompt?
The problem with `?token=...` in the URL is specifically that _users_ can get and create these tokens, and share URLs.
This wouldn't be an issue if only authorized applications could issue tokens that behave this way.
The single-user server doesn't exactly have the hooks to manage this easily, but the [Authenticator](#Authenticator) API does.
## Problem statement
We want our external application to be able to:
1. authenticate users
2. (maybe) create JupyterHub users
3. start JupyterHub servers
4. redirect users into running servers _without_ any login prompts/loading pages from JupyterHub, and without any prior JupyterHub credentials
Step 1 is up to the application and not JupyterHub's problem.
Steps 2 and 3 use the JupyterHub [REST API](#jupyterhub-rest-API).
The service would need the scopes:
```
admin:users # creating users
servers # start/stop servers
```
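As a rough sketch of steps 2 and 3, the role and the API requests might look like the following. The service name `external-app`, the role name, and the helper functions are assumptions for illustration. The endpoints `POST /hub/api/users/{name}` and `POST /hub/api/users/{name}/server` come from the JupyterHub REST API. The requests are only built here, not sent.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical role, to be loaded in jupyterhub_config.py with:
#   c.JupyterHub.load_roles = [external_app_role]
external_app_role = {
    "name": "external-app-role",  # assumed role name
    "scopes": ["admin:users", "servers"],
    "services": ["external-app"],  # assumed service name
}


def hub_api_request(hub_url, path, token, method="POST"):
    """Build (but do not send) an authenticated Hub REST API request."""
    url = hub_url.rstrip("/") + "/hub/api/" + path.lstrip("/")
    return urllib.request.Request(
        url, method=method, headers={"Authorization": f"token {token}"}
    )


def create_user_request(hub_url, token, username):
    """Step 2: request creation of a JupyterHub user."""
    return hub_api_request(hub_url, f"users/{quote(username, safe='')}", token)


def start_server_request(hub_url, token, username):
    """Step 3: request launch of the user's default server."""
    return hub_api_request(hub_url, f"users/{quote(username, safe='')}/server", token)
```

Passing either request to `urllib.request.urlopen()` would send it. Starting a server can return before the server is ready, so an application that wants to avoid the spawn pending page would likely need to wait for the server to be ready before redirecting the user.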
That leaves the last step: sending users to their running server with credentials, without prompting login.
This is where things can get tricky!
### Ideal case: oauth
_Ideally_, the best way to set this up is with the external service as an OAuth provider,
though in some cases it works best to use proxy-based authentication like Shibboleth / [REMOTE_USER](https://github.com/cwaldbieser/jhub_remote_user_authenticator).
The main things to know are:
- Links to `/hub/user-redirect/some/path` will ultimately land users at `/user/username/some/path` after completing login, ensuring the server is running, etc.
- Setting `Authenticator.auto_login = True` allows beginning the login process without JupyterHub's "Login with..." prompt
_If_ your OAuth provider allows logging in to external services without prompting, this is enough.
Not all do, though.
If you've already ensured the server is running, this will _appear_ to the user as if they are being sent directly to their running server.
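A minimal sketch of building such a link from the external application, assuming a placeholder Hub address of `https://hub.example.org` and a hypothetical helper name:

```python
from urllib.parse import quote


def user_redirect_url(hub_url, path):
    """Build a /hub/user-redirect/ link for a path inside the user's own server."""
    # quote() keeps "/" by default, so only unsafe characters are escaped
    return hub_url.rstrip("/") + "/hub/user-redirect/" + quote(path.lstrip("/"))


link = user_redirect_url("https://hub.example.org", "/lab/tree/my notebook.ipynb")
```

The same link works for every user, because the Hub resolves `user-redirect` to whichever user is logged in.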
But what _actually_ happens is quite a series of redirects, state checks, and cookie-setting:
1. visiting `/hub/user-redirect/some/path` checks if the user is logged in
   1. if not, begin the login process (`/hub/login?next=/hub/user-redirect/...`)
   2. redirects to your oauth provider to authenticate the user
   3. redirects back to `/hub/oauth_callback` to complete login
   4. redirects back to `/hub/user-redirect/...`
2. once authenticated, checks that the user's server is running
   1. if not running, begins launch of the server
   2. redirects to `/hub/spawn-pending/?next=...`
3. once the server is running, redirects to the actual user server `/user/username/some/path`
Now we're done, right? Actually, no, because the browser doesn't have credentials for their user server!
This sequence of redirects happens all the time in JupyterHub launch, and is usually totally transparent.
4. at the user server, check for a token in cookie
   1. if not present or not valid, begin oauth with the Hub (redirect to `/hub/api/oauth2/authorize/...`)
   2. hub redirects back to `/user/username/oauth_callback` to complete oauth
   3. redirect again to the URL that started this internal oauth
5. finally, arrive at `/user/username/some/path`, the ultimate destination, with valid JupyterHub credentials
The steps that will show users something other than the page you want them to see are:
- Step 1.1 will be a prompt e.g. with "Login with..." unless you set `c.Authenticator.auto_login = True`
- Step 1.2 _may_ be a prompt from your oauth provider. This isn't controlled by JupyterHub, and may not be avoidable.
- Step 2.2 will show the spawn pending page only if the server is not already running
Otherwise, this is all transparent redirects to the final destination.
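The three conditions above can be summarized as a toy model. This is only an illustration of the rules in this section, not JupyterHub code; the function and parameter names are made up, and it assumes the user is not yet logged in to the Hub.

```python
def visible_pages(auto_login, provider_prompts, server_running):
    """Return the pages a user may see before reaching their server (toy model)."""
    pages = []
    if not auto_login:
        # step 1.1: the Hub's own login page
        pages.append('step 1.1: JupyterHub "Login with..." prompt')
    if provider_prompts:
        # step 1.2: outside JupyterHub's control
        pages.append("step 1.2: OAuth provider prompt")
    if not server_running:
        # step 2.2: only when the server still has to start
        pages.append("step 2.2: spawn pending page")
    return pages
```

With `auto_login` enabled, a provider that does not prompt, and a server that is already running, the list is empty and the user sees only redirects.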
#### Using an authentication proxy (REMOTE_USER)
If you use an authentication proxy like Shibboleth that sets e.g. the REMOTE_USER header,
you can use an Authenticator like [RemoteUserAuthenticator](https://github.com/cwaldbieser/jhub_remote_user_authenticator) to automatically log in users based on headers in the request.
The same process will work, but instead of step 1.2 redirecting to the oauth provider, it logs in immediately.
If you do use an auth proxy, you also need to be extremely sure that requests only come from the auth proxy, and that you don't accept requests setting the REMOTE_USER header from any other source.
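To illustrate the last point, here is a sketch of the kind of check involved: only trust the header when the request's peer address belongs to a known proxy. The function name, the plain-dict `headers`, and the `trusted_proxies` list are hypothetical and not part of RemoteUserAuthenticator. In practice this is usually enforced at the network level, so that nothing but the proxy can reach the Hub at all.

```python
import ipaddress


def remote_user_from_request(headers, peer_ip, trusted_proxies):
    """Return the REMOTE_USER value only if the request came from a trusted proxy."""
    peer = ipaddress.ip_address(peer_ip)
    trusted = any(peer in ipaddress.ip_network(net) for net in trusted_proxies)
    if not trusted:
        # anyone can set a header; ignore it unless the sender is the proxy
        return None
    return headers.get("REMOTE_USER") or None
```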
### Custom case
But let's say you can't use OAuth or REMOTE_USER, and you still want to hide JupyterHub implementation details.
All you really want is a way to write a URL that will take users to their servers without any login prompts.
You can do this if you create an Authenticator with `auto_login=True` that logs users in based on something in the _request_, e.g. a query parameter.
We have an _example_ in the JupyterHub repo in `examples/forced-login` that does this.
It is a sample 'external service' where you type in a username and a destination path.
When you 'login' with this username:
1. a token is issued
2. the token is stored and associated with the username
3. redirect to `/hub/login?login_token=...&next=/hub/user-redirect/destination/path`
Then on the JupyterHub side, there is the `ForcedLoginAuthenticator`.
This class implements `authenticate`, which:
1. has `auto_login = True` so visiting `/hub/login` calls `authenticate()` directly instead of serving a page
2. gets the token from the `login_token` URL parameter
3. makes a POST request to the external application with the token, requesting a username
4. the external application returns the username and deletes the token, so it cannot be re-used
5. Authenticator returns the username
This doesn't _bypass_ JupyterHub authentication, as some deployments have done, but it does _hide_ it.
If your service launches servers via the API, you could run this in [API only mode](#howto:api-only) by adding `/hub/login` as well:
```python
c.JupyterHub.hub_routespec = "/hub/api/"
c.Proxy.additional_routes = {"/hub/login": "http://hub:8080"}
```
```{literalinclude} ../../../examples/forced-login/jupyterhub_config.py
:language: python
:start-at: class ForcedLoginAuthenticator
:end-before: c = get_config()
```
**Why does this work?**
This is still logging in with a token in the URL, right?
Yes, but the key difference is that users cannot issue these tokens.
The sample application is still technically vulnerable, because the token link should really be non-transferable, even if it can only be used once.
The only defense the sample application has against this is rapidly expiring tokens (they expire after 30 seconds).
You can use state cookies, etc. to manage that more rigorously, as done in OAuth (at which point, maybe implement OAuth itself, why not?).
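One way such a state check could look is sketched below. This is not part of the example in the repo. It assumes the external application sets the `state` value as a cookie that the Hub can read during `authenticate` (for instance because both are served from the same domain) and that the Hub forwards it along with the token. The class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets
import time


class StateBoundTokens:
    """Single-use login tokens that are only valid together with a browser-held state."""

    def __init__(self, lifetime=30):
        self.lifetime = lifetime
        # sha256(token) -> (username, sha256(state), expires_at)
        self._tokens = {}

    @staticmethod
    def _hash(value):
        return hashlib.sha256(value.encode("utf8")).hexdigest()

    def issue(self, username):
        """Return (token, state): the token goes in the URL, the state in a cookie."""
        token = secrets.token_urlsafe(32)
        state = secrets.token_urlsafe(32)
        self._tokens[self._hash(token)] = (
            username,
            self._hash(state),
            time.monotonic() + self.lifetime,
        )
        return token, state

    def redeem(self, token, state):
        """Consume the token; return the username, or None if anything is wrong."""
        entry = self._tokens.pop(self._hash(token), None)
        if entry is None:
            return None
        username, state_hash, expires_at = entry
        if expires_at < time.monotonic():
            return None
        if not hmac.compare_digest(state_hash, self._hash(state)):
            return None
        return username
```

A link copied to another browser carries the token but not the cookie, so `redeem` fails there. In this sketch a failed attempt also consumes the token, which is a deliberate choice: a leaked link cannot be retried.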


@@ -14,7 +14,7 @@ separate-proxy
templates
upgrading
log-messages
forced-login
```
(config-examples)=


@@ -0,0 +1,51 @@
# Forced login example
Example for forcing user login via URL without disabling token-in-url protection.
An external application issues tokens associated with usernames.
A JupyterHub Authenticator only allows login via these tokens in a URL parameter (`/hub/login?login_token=....`).
Each token is exchanged for a username, which is used to log in the user.
Each token can be used for login only once, and must be used within 30 seconds of issue.
To run:
in one shell:
```
python3 external_app.py
```
in another:
```
jupyterhub
```
Then visit http://127.0.0.1:9000
Sometimes, JupyterHub is integrated into an existing application,
which has already handled login, etc.
It is often preferable in these applications to be able to link users to their running JupyterHub server without _prompting_ the user for login to the Hub when the Hub should really be an implementation detail.
One way to do this has been to use "API only mode", issue tokens for users, and redirect users to a URL like `/user/name/?token=abc123`.
This is [disabled by default]() in JupyterHub 5, because it lets users craft links that let _other_ users log in as them, which can lead to inter-user attacks.
But that leaves the question: how do I as an _application developer_ generate a link that can log in a user?
_Ideally_, the best way to set this up is with the external service as an OAuth provider,
though in some cases it works best to use proxy-based authentication like Shibboleth / [REMOTE_USER]().
If your service is an OAuth provider, sharing links to `/hub/user-redirect/lab/tree/path/to/notebook...` should work just fine.
JupyterHub will:
1. authenticate the user
2. redirect to your identity provider via oauth (you can set `Authenticator.auto_login = True` if you want to skip prompting the user)
3. complete oauth
4. start their single-user server if it's not running (show the launch progress page while it's waiting)
5. redirect to their server once it's up
6. oauth (again), this time between the single-user server and the Hub
If your application chooses to launch the server and wait for it to be ready before redirecting,
[API only mode]() is sometimes useful.


@@ -0,0 +1,100 @@
"""An external app for launching JupyterHub with specified usernames

This one serves a form with a single username input field
After entering the username, generate a token and redirect to hub login with that token,
which is then exchanged for a username.

Users cannot login to JupyterHub directly, only via this app.
"""

import hashlib
import logging
import os
import secrets
import time
from pathlib import Path
from typing import Annotated

from fastapi import Body, FastAPI, Form, status
from fastapi.responses import HTMLResponse, JSONResponse, RedirectResponse
from yarl import URL

from jupyterhub.utils import url_path_join

app_dir = Path(__file__).parent.resolve()
index_html = app_dir / "index.html"
app = FastAPI()

log = logging.getLogger("uvicorn.error")

_tokens_to_username = {}

jupyterhub_url = URL(os.environ.get("JUPYTERHUB_URL", "http://127.0.0.1:8000/"))

# how many seconds do they have to complete the exchange before the token expires?
token_lifetime = 30


def _hash(token):
    """Hash a token for storage"""
    return hashlib.sha256(token.encode("utf8", "replace")).hexdigest()


@app.get("/")
async def get():
    with index_html.open() as f:
        return HTMLResponse(f.read())


@app.post("/")
async def launch(username: Annotated[str, Form()], path: Annotated[str, Form()]):
    """Begin login

    1. issue token for login
    2. associate token with username
    3. redirect to /hub/login?login_token=...
    """
    token = secrets.token_urlsafe(32)
    hashed_token = _hash(token)
    log.info(f"Creating token for {username}, redirecting to {path}")
    _tokens_to_username[hashed_token] = (username, time.monotonic() + token_lifetime)
    login_url = (jupyterhub_url / "hub/login").extend_query(
        login_token=token, next=url_path_join("/hub/user-redirect", path)
    )
    log.info(login_url)

    return RedirectResponse(login_url, status_code=status.HTTP_303_SEE_OTHER)


@app.post("/login", response_class=JSONResponse)
async def login(token: Annotated[str, Body(embed=True)]):
    """
    Callback to exchange a token for a username

    token is consumed, can only be used once
    """
    now = time.monotonic()
    hashed_token = _hash(token)
    if hashed_token not in _tokens_to_username:
        return JSONResponse(
            status_code=status.HTTP_404_NOT_FOUND, content={"message": "invalid token"}
        )
    username, expires_at = _tokens_to_username.pop(hashed_token)
    if expires_at < now:
        return JSONResponse(
            status_code=status.HTTP_400_BAD_REQUEST,
            content={"message": "token expired"},
        )
    return {"name": username}


def main():
    """Launches the application on port 9000 with uvicorn"""

    import uvicorn

    uvicorn.run(app, port=9000)


if __name__ == "__main__":
    main()


@@ -0,0 +1,22 @@
<!doctype html>
<html>
<head>
<title>External Service Login</title>
</head>
<body>
<h1>Login to JupyterHub</h1>
<form action="" method="POST">
<label for="username">
Username:
<input type="text" name="username" autocomplete="off" />
</label>
<br />
<label for="path">
Redirect path:
<input type="text" name="path" autocomplete="off" value="/lab" />
</label>
<br />
<button>Login</button>
</form>
</body>
</html>


@@ -0,0 +1,65 @@
import json

from tornado import web
from tornado.httpclient import AsyncHTTPClient, HTTPClientError
from traitlets import Unicode

from jupyterhub.auth import Authenticator
from jupyterhub.utils import url_path_join


class ForcedLoginAuthenticator(Authenticator):
    """Authenticator to force login with a token provided by an external service

    The external service issues tokens, which are exchanged for a username.
    Visiting `/hub/login?login_token=...` logs in a user
    Each token can be used only once.
    """

    auto_login = True  # begin login without prompt (token is in url)
    allow_all = True  # external login app controls this

    token_provider_url = Unicode(
        config=True, help="""The URL of the token/username provider"""
    )

    async def authenticate(self, handler, data):
        token = handler.get_argument("login_token", None)
        if not token:
            raise web.HTTPError(
                400, f"Login with external provider at {self.token_provider_url}"
            )
        client = AsyncHTTPClient()
        try:
            response = await client.fetch(
                url_path_join(self.token_provider_url, "/login"),
                method="POST",
                headers={"Content-Type": "application/json"},
                body=json.dumps({"token": token}),
            )
        except HTTPClientError as e:
            self.log.info(
                "Error exchanging token for username: %s",
                e.response.body.decode("utf8", "replace"),
            )
            if e.code == 404:
                raise web.HTTPError(
                    403,
                    f"Invalid token. Login with external provider at {self.token_provider_url}",
                )
            else:
                raise

        # pass through the response
        return json.loads(response.body.decode())


c = get_config()  # noqa

# use our Authenticator
c.JupyterHub.authenticator_class = ForcedLoginAuthenticator

# tell it where the external launch app is
c.ForcedLoginAuthenticator.token_provider_url = "http://127.0.0.1:9000/"

# local testing config (fake spawner, localhost only)
c.JupyterHub.ip = "127.0.0.1"
c.JupyterHub.spawner_class = "simple"


@@ -0,0 +1,3 @@
fastapi
jupyterhub
yarl