mirror of
https://github.com/jupyterhub/jupyterhub.git
synced 2025-10-17 15:03:02 +00:00
add docs on custom authenticators and spawners
78
docs/authenticators.md
Normal file
@@ -0,0 +1,78 @@
|
||||
# Writing a custom Authenticator
|
||||
|
||||
The [Authenticator][] is the mechanism for authenticating users.
|
||||
Basic authenticators use simple username and password authentication.
|
||||
JupyterHub ships only with a [PAM][]-based Authenticator,
|
||||
for logging in with local user accounts.
|
||||
|
||||
You can use custom Authenticator subclasses to enable authentication via other systems.
|
||||
One such example is using [GitHub OAuth][].
|
||||
|
||||
Because the username is passed from the Authenticator to the Spawner,
|
||||
a custom Authenticator and Spawner are often used together.
|
||||
|
||||
|
||||
## Basics of Authenticators
|
||||
|
||||
A basic Authenticator has one central method:
|
||||
|
||||
|
||||
### Authenticator.authenticate
|
||||
|
||||
    Authenticator.authenticate(handler, data)
|
||||
|
||||
This method is passed the tornado RequestHandler and the POST data from the login form.
|
||||
Unless the login form has been customized, `data` will have two keys:
|
||||
|
||||
- `username` (self-explanatory)
|
||||
- `password` (also self-explanatory)
|
||||
|
||||
`authenticate`'s job is simple:
|
||||
|
||||
- return a username (non-empty str)
|
||||
of the authenticated user if authentication is successful
|
||||
- return `None` otherwise
|
||||
|
||||
Writing an Authenticator that looks up passwords in a dictionary
|
||||
requires only overriding this one method:
|
||||
|
||||
```python
from tornado import gen
from traitlets import Dict  # formerly IPython.utils.traitlets

from jupyterhub.auth import Authenticator


class DictionaryAuthenticator(Authenticator):
    """Authenticate against a configured dict of username: password."""

    passwords = Dict(
        config=True,
        help="""dict of username:password for authentication""",
    )

    @gen.coroutine
    def authenticate(self, handler, data):
        # Return the username on a match; otherwise fall through and return None.
        if self.passwords.get(data['username']) == data['password']:
            return data['username']
```
|
||||
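The return contract of `authenticate` can be exercised without a running Hub. The following is a minimal sketch, not JupyterHub code: `check_login` and the `PASSWORDS` table are hypothetical names, and the constant-time comparison with `hmac.compare_digest` is an addition that the example above does not make.

```python
import hmac

# Hypothetical credential table, standing in for DictionaryAuthenticator.passwords.
PASSWORDS = {"alice": "wonderland", "bob": "builder"}


def check_login(passwords, data):
    """Return the username if the password matches, else None."""
    username = data.get("username", "")
    expected = passwords.get(username)
    if expected is None:
        # Unknown (or empty) username: authentication fails.
        return None
    # compare_digest avoids leaking how many leading characters matched.
    if hmac.compare_digest(expected.encode(), data.get("password", "").encode()):
        return username
    return None


print(check_login(PASSWORDS, {"username": "alice", "password": "wonderland"}))
print(check_login(PASSWORDS, {"username": "alice", "password": "wrong"}))
```

The first call returns the username and the second returns `None`, which are the two outcomes `authenticate` is expected to produce.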
|
||||
|
||||
### Authenticator.whitelist
|
||||
|
||||
Authenticators can specify a whitelist of usernames to allow authentication.
|
||||
For local user authentication (e.g. PAM), this lets you limit which users
|
||||
can log in.
|
||||
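As a rough illustration of how a whitelist check might behave, here is a standalone sketch. The function name `check_whitelist` and the rule that an empty whitelist allows everyone are assumptions made for this example, not a statement of what the Hub does.

```python
def check_whitelist(username, whitelist):
    """Return True if username may log in under the given whitelist."""
    # Assumption: an empty whitelist means "no restriction".
    if not whitelist:
        return True
    return username in whitelist


whitelist = {"alice", "bob"}  # made-up usernames
print(check_whitelist("alice", whitelist))
print(check_whitelist("mallory", whitelist))
print(check_whitelist("mallory", set()))
```

Under these assumptions, a user who passes password authentication but is not in a non-empty whitelist would still be refused.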
|
||||
|
||||
## OAuth and other non-password logins
|
||||
|
||||
Some login mechanisms, such as [OAuth][], don't map onto username+password.
|
||||
For these, you can override the login handlers.
|
||||
|
||||
You can see an example implementation of an Authenticator that uses [GitHub OAuth][]
|
||||
at [OAuthenticator][].
|
||||
|
||||
|
||||
[Authenticator]: ../jupyterhub/auth.py
|
||||
[PAM]: http://en.wikipedia.org/wiki/Pluggable_authentication_module
|
||||
[OAuth]: http://en.wikipedia.org/wiki/OAuth
|
||||
[GitHub OAuth]: https://developer.github.com/v3/oauth/
|
||||
[OAuthenticator]: https://github.com/jupyter/oauthenticator
|
||||
|
75
docs/howitworks.md
Normal file
@@ -0,0 +1,75 @@
|
||||
# How JupyterHub works
|
||||
|
||||
JupyterHub is a multi-user server that manages and proxies multiple instances of the single-user Jupyter notebook server.
|
||||
|
||||
There are three basic processes involved:
|
||||
|
||||
- multi-user Hub (Python/Tornado)
|
||||
- configurable http proxy (nodejs)
|
||||
- multiple single-user IPython notebook servers (Python/IPython/Tornado)
|
||||
|
||||
The proxy is the only process that listens on a public interface.
|
||||
The Hub sits behind the proxy at `/hub`.
|
||||
Single-user servers sit behind the proxy at `/user/[username]`.
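
The path layout above can be pictured as a small routing table. This is a sketch of prefix-based routing in general, not the proxy's actual implementation: the `resolve` helper, the addresses, and the longest-prefix rule are all assumptions chosen for illustration.

```python
def resolve(path, routes):
    """Return the target for the longest route prefix matching path."""
    best = None
    for prefix in routes:
        # Match whole path segments only, so /user/alicia does not match /user/alice.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None


# Hypothetical routing table: hosts and ports are made up.
routes = {
    "/hub": "http://127.0.0.1:8081",
    "/user/alice": "http://127.0.0.1:50001",
    "/user/bob": "http://127.0.0.1:50002",
}

print(resolve("/hub/login", routes))
print(resolve("/user/alice/tree", routes))
print(resolve("/user/alicia", routes))
```

The last lookup returns `None` because no route covers that path.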
|
||||
|
||||
|
||||
## Logging in
|
||||
|
||||
When a new browser logs in to JupyterHub, the following events take place:
|
||||
|
||||
- Login data is handed to the [Authenticator](#authentication) instance for validation
|
||||
- The Authenticator returns the username, if login information is valid
|
||||
- A single-user server instance is [Spawned](#spawning) for the logged-in user
|
||||
- When the server starts, the proxy is notified to forward `/user/[username]/*` to the single-user server
|
||||
- Two cookies are set, one for `/hub/` and another for `/user/[username]`,
|
||||
containing an encrypted token.
|
||||
- The browser is redirected to `/user/[username]`, which is handled by the single-user server
|
||||
|
||||
Logging into a single-user server is authenticated via the Hub:
|
||||
|
||||
- On request, the single-user server forwards the encrypted cookie to the Hub for verification
|
||||
- The Hub replies with the username if it is a valid cookie
|
||||
- If the user is the owner of the server, access is allowed
|
||||
- If it is the wrong user or an invalid cookie, the browser is redirected to `/hub/login`
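
The verification steps above amount to a small decision function. The sketch below uses plain dictionaries in place of real cookies and HTTP requests; `hub_verify`, `authorize`, and the token values are hypothetical names introduced only for this example.

```python
HUB_LOGIN_URL = "/hub/login"


def hub_verify(token, tokens):
    """Hub side: map a cookie token to a username, or None if it is invalid."""
    return tokens.get(token)


def authorize(token, owner, tokens):
    """Single-user server side: allow the owner, redirect everyone else."""
    username = hub_verify(token, tokens)
    if username is not None and username == owner:
        return ("allow", username)
    # Wrong user or invalid cookie: send the browser back to the Hub login page.
    return ("redirect", HUB_LOGIN_URL)


# Made-up tokens known to the Hub.
tokens = {"token-a": "alice", "token-b": "bob"}

print(authorize("token-a", "alice", tokens))
print(authorize("token-b", "alice", tokens))
print(authorize("bogus", "alice", tokens))
```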
|
||||
|
||||
|
||||
## Customizing JupyterHub
|
||||
|
||||
There are two basic extension points for JupyterHub: how users are authenticated,
|
||||
and how their server processes are started.
|
||||
Each is governed by a customizable class,
|
||||
and JupyterHub ships with just the most basic version of each.
|
||||
|
||||
To enable custom authentication and/or spawning,
|
||||
subclass Authenticator or Spawner,
|
||||
and override the relevant methods.
|
||||
|
||||
|
||||
### Authentication
|
||||
|
||||
Authentication is customizable via the Authenticator class.
|
||||
Authentication can be replaced by any mechanism,
|
||||
such as OAuth, Kerberos, etc.
|
||||
|
||||
JupyterHub only ships with [PAM](http://en.wikipedia.org/wiki/Pluggable_authentication_module) authentication,
|
||||
which requires the server to be run as root,
|
||||
or at least with access to the PAM service,
|
||||
which regular users typically do not have
|
||||
(on Ubuntu, this requires being added to the `shadow` group).
|
||||
|
||||
[More info on custom Authenticators](authenticators.md).
|
||||
|
||||
|
||||
### Spawning
|
||||
|
||||
Each single-user server is started by a Spawner.
|
||||
The Spawner represents an abstract interface to a process,
|
||||
and needs to be able to take three actions:
|
||||
|
||||
1. start the process
|
||||
2. poll whether the process is still running
|
||||
3. stop the process
|
||||
|
||||
[More info on custom Spawners](spawners.md).
|
||||
|
||||
[An example using Docker](https://github.com/jupyter/dockerspawner).
|
86
docs/spawners.md
Normal file
@@ -0,0 +1,86 @@
|
||||
# Writing a custom Spawner
|
||||
|
||||
Each single-user server is started by a [Spawner][].
|
||||
The Spawner represents an abstract interface to a process,
|
||||
and a custom Spawner needs to be able to take three actions:
|
||||
|
||||
1. start the process
|
||||
2. poll whether the process is still running
|
||||
3. stop the process
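
To make the three actions concrete, here is a minimal sketch built on the standard library's `subprocess` module. `SketchSpawner` is a hypothetical class, not the JupyterHub base class, and it is synchronous for brevity, whereas a real Spawner would implement these methods as tornado coroutines where required.

```python
import subprocess
import sys


class SketchSpawner:
    """Hypothetical stand-in for a Spawner that manages one local process."""

    def __init__(self, cmd):
        self.cmd = cmd
        self.proc = None

    def start(self):
        # Launch the child process; a real Spawner would start a notebook server.
        self.proc = subprocess.Popen(self.cmd)

    def poll(self):
        # None while running, integer exit status once the process has exited.
        if self.proc is None:
            return 0
        return self.proc.poll()

    def stop(self):
        # Ask the process to exit and wait until it has actually gone.
        if self.proc is not None and self.proc.poll() is None:
            self.proc.terminate()
            self.proc.wait()


# A long-sleeping child stands in for the single-user server.
spawner = SketchSpawner([sys.executable, "-c", "import time; time.sleep(60)"])
spawner.start()
print(spawner.poll())  # None while the child is still running
spawner.stop()
print(spawner.poll())  # an integer exit status; the value is platform-dependent
```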
|
||||
|
||||
## Spawner.start
|
||||
|
||||
`Spawner.start` should start the single-user server for a single user.
|
||||
Information about the user can be retrieved from `self.user`,
|
||||
an object encapsulating the user's name, authentication, and server info.
|
||||
|
||||
When `Spawner.start` returns, it should have stored the IP and port
|
||||
of the single-user server in `self.user.server`.
|
||||
|
||||
**NOTE:** when writing coroutines, *never* `yield` in between a db change and a commit.
|
||||
Most `Spawner.start` implementations should look something like:
|
||||
|
||||
```python
from tornado import gen

@gen.coroutine
def start(self):
    self.user.server.ip = 'localhost'  # or other host or IP address, as seen by the Hub
    self.user.server.port = 1234  # port selected somehow
    self.db.commit()  # always commit before yield, if modifying db values
    yield self._actually_start_server_somehow()
```
|
||||
|
||||
When `Spawner.start` returns, the single-user server process should actually be running,
|
||||
not just requested. JupyterHub can handle `Spawner.start` being very slow
|
||||
(such as PBS-style batch queues, or instantiating whole AWS instances)
|
||||
by raising the `Spawner.start_timeout` config value.
|
||||
|
||||
|
||||
## Spawner.poll
|
||||
|
||||
`Spawner.poll` should check whether the spawned process is still running.
|
||||
It should return `None` if it is still running,
|
||||
and an integer exit status otherwise.
|
||||
|
||||
For the local process case, this uses `os.kill(PID, 0)`
|
||||
to check if the process is still around.
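
A standalone sketch of that check might look like the following. The helper name `poll_pid` is hypothetical, and returning `0` for a missing process is an assumption made here because the real exit status cannot be recovered from the PID alone. The probe is POSIX-only: on Windows, `os.kill` does not treat signal 0 as a harmless check.

```python
import os


def poll_pid(pid):
    """Return None if a process with this PID exists, else 0 (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks for the PID
    except ProcessLookupError:
        return 0  # no such process; its real exit status is unknown
    except PermissionError:
        return None  # the process exists but belongs to another user
    return None


print(poll_pid(os.getpid()))  # the current process is certainly running
```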
|
||||
|
||||
|
||||
## Spawner.stop
|
||||
|
||||
`Spawner.stop` should stop the process. It must be a tornado coroutine,
|
||||
and should return when the process has finished exiting.
|
||||
|
||||
|
||||
## Spawner state
|
||||
|
||||
JupyterHub should be able to stop and restart without having to tear down
|
||||
single-user servers. This means that a Spawner may need to persist
|
||||
some information so that it can be restored.
|
||||
A dictionary of JSON-able state can be used to store this information.
|
||||
|
||||
Unlike start/stop/poll, the state methods must not be coroutines.
|
||||
|
||||
In the single-process case, this is only the process ID of the server:
|
||||
|
||||
```python
def get_state(self):
    """get the current state"""
    state = super().get_state()
    if self.pid:
        state['pid'] = self.pid
    return state

def load_state(self, state):
    """load state from the database"""
    super().load_state(state)
    if 'pid' in state:
        self.pid = state['pid']

def clear_state(self):
    """clear any state (called after shutdown)"""
    super().clear_state()
    self.pid = 0
```
|
||||
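The round trip through JSON-able state can be tried in isolation. In this sketch, `PidState` is a hypothetical stand-in for a Spawner, the PID is made up, and `json.dumps`/`json.loads` stand in for whatever storage the Hub actually uses.

```python
import json


class PidState:
    """Hypothetical minimal object with the three state methods."""

    def __init__(self):
        self.pid = 0

    def get_state(self):
        state = {}
        if self.pid:
            state["pid"] = self.pid
        return state

    def load_state(self, state):
        if "pid" in state:
            self.pid = state["pid"]

    def clear_state(self):
        self.pid = 0


before = PidState()
before.pid = 4242  # made-up PID for illustration
saved = json.dumps(before.get_state())  # what would be persisted

after = PidState()  # a fresh object, as after a Hub restart
after.load_state(json.loads(saved))
print(after.pid)
```

Keeping the state to plain JSON types is what lets it survive the serialization step unchanged.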
|
||||
|
||||
|
||||
[Spawner]: ../jupyterhub/spawner.py
|