404 is also used to identify that a particular resource
(like a kernel or terminal) is not present, maybe because
it has been deleted. That response comes from the notebook server, while
here we are responding from JupyterHub. Saying that the
user server they are trying to request the resource (kernel, etc.)
from does not exist seems right.
Non-running user servers making requests is a fairly
common occurrence - user servers get culled while their
browser tabs are left open. So we now have a background level
of 503 responses on the hub *all* the time, making it
very difficult to detect *real* 503s, which should ideally
be closely monitored and alerted on.
I *think* 404 is a more appropriate response, as the resource
(API) being requested is no longer present.
Issue background
As I understand it, registering an external service means it won't be run as a process by JupyterHub, and such an external service may be registered only to get a /services/<service_name> route registered with JupyterHub's configured proxy, rather than to actually use an api_token and speak with JupyterHub.
In the past, it was okay for an external service to be registered without an api_token, but now it isn't. This PR fixes that.
I run into this when I register grafana as an external service like this (in reality via a z2jh config with slightly different syntax):
```python
c.JupyterHub.services = [
    {
        "name": "grafana",
        "url": "http://grafana",
    }
]
```
JupyterHub has [documentation about Services](https://jupyterhub.readthedocs.io/en/stable/reference/services.html#properties-of-a-service), where one can see that the default value of api_token is None.
Issue details
This is an error GeorgianaElena and I have run into using JupyterHub 1.4.1. I'm not sure at what point the regression was introduced, only that it was present in 1.4.1.
I wrote some notes tracking this issue down. This is the summary I wrote.
```
This test was made to reproduce an error like this:
ValueError: Tokens must be at least 8 characters, got ''
The error had the following stack trace in 1.4.1:
jupyterhub/app.py:2213: in init_api_tokens
await self._add_tokens(self.service_tokens, kind='service')
jupyterhub/app.py:2182: in _add_tokens
obj.new_api_token(
jupyterhub/orm.py:424: in new_api_token
return APIToken.new(token=token, service=self, **kwargs)
jupyterhub/orm.py:699: in new
cls.check_token(db, token)
This test also makes _add_tokens receive a token_dict that is buggy:
{"": "external_2"}
It turned out that whatever passed token_dict to _add_tokens failed to
ignore services' api_tokens that were None, and instead passed them as
blank strings.
It turned out that init_api_tokens was passing self.service_tokens, and that
self.service_tokens had been populated with blank string tokens for external
services registered with JupyterHub.
```
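The fix's idea, roughly (an illustrative sketch, not the exact patch): skip external services without an api_token when populating service_tokens.

```python
# illustrative sketch: only register tokens for services that have one
for spec in service_specs:  # hypothetical name for the configured services
    api_token = spec.get("api_token")
    if not api_token:
        # external service registered only for routing; nothing to add
        continue
    service_tokens[api_token] = spec["name"]
```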
Signed-off-by: Erik Sundell <erik.i.sundell@gmail.com>
**Environment**
* image: k8s-hub (`jupyterhub/k8s-hub:0.11.1`);
* `authenticator_class: dummy`;
* db: CockroachDB (`sqlalchemy-cockroachdb`).
**Description:**
The `save_bearer_token` method (`provider.py`) passes a float value to the `expires_at` field (an int column).
A user can create a notebook, it gets successfully scheduled, and then, once the pod is up and ready, the user is unable to enter the notebook because JupyterHub cannot save a token. In the logs, we can see the following:
```
[I 2021-05-29 14:45:04.302 JupyterHub log:181] 302 GET /hub/api/oauth2/authorize?client_id=jupyterhub-user-user2&redirect_uri=%2Fuser%2Fuser2%2Foauth_callback&response_type=code&state=[secret] -> /user/user2/oauth_callback?code=[secret]&state=[secret] (user2 40.113.125.116) 73.98ms
[E 2021-05-29 14:45:04.424 JupyterHub web:1789] Uncaught exception POST /hub/api/oauth2/token (10.42.80.10)
HTTPServerRequest(protocol='http', host='hub:8081', method='POST', uri='/hub/api/oauth2/token', version='HTTP/1.1', remote_ip='10.42.80.10')
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1702, in _execute
result = method(*self.path_args, **self.path_kwargs)
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/apihandlers/auth.py", line 324, in post
headers, body, status = self.oauth_provider.create_token_response(
File "/usr/local/lib/python3.8/dist-packages/oauthlib/oauth2/rfc6749/endpoints/base.py", line 116, in wrapper
return f(endpoint, uri, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/oauthlib/oauth2/rfc6749/endpoints/token.py", line 118, in create_token_response
return grant_type_handler.create_token_response(
File "/usr/local/lib/python3.8/dist-packages/oauthlib/oauth2/rfc6749/grant_types/authorization_code.py", line 313, in create_token_response
self.request_validator.save_token(token, request)
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/oauth/provider.py", line 281, in save_token
return self.save_bearer_token(token, request, *args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/jupyterhub/oauth/provider.py", line 354, in save_bearer_token
self.db.commit()
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 1042, in commit
self.transaction.commit()
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 504, in commit
self._prepare_impl()
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
self.session.flush()
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2536, in flush
self._flush(objects)
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2678, in _flush
transaction.rollback(_capture_exception=True)
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/session.py", line 2638, in _flush
flush_context.execute()
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/unitofwork.py", line 586, in execute
persistence.save_obj(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 239, in save_obj
_emit_insert_statements(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/orm/persistence.py", line 1135, in _emit_insert_statements
result = cached_connections[connection].execute(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
util.raise_(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DatatypeMismatch) value type decimal doesn't match type int of column "expires_at"
HINT: you will need to rewrite or cast the expression
[SQL: INSERT INTO oauth_access_tokens (client_id, grant_type, expires_at, refresh_token, refresh_expires_at, user_id, session_id, hashed, prefix, created, last_activity) VALUES (%(client_id)s, %(grant_type)s, %(expires_at)s, %(refresh_token)s, %(refresh_expires_at)s, %(user_id)s, %(session_id)s, %(hashed)s, %(prefix)s, %(created)s, %(last_activity)s) RETURNING oauth_access_tokens.id]
[parameters: {'client_id': 'jupyterhub-user-user2', 'grant_type': 'authorization_code', 'expires_at': 1622303104.418992, 'refresh_token': 'FVJ8S4is0367LlEMnxIiEIoTOeoxhf', 'refresh_expires_at': None, 'user_id': 662636890939424770, 'session_id': '4e041a2bfcb34a34a00033a281bc1236', 'hashed': 'sha512:1:3b18deae37fbf50a:03df035736960af14e19196e1d13fd74f55c21f17405119f80e75817ff37c7567fab089a3d40b97a57f94b54065ee56f7260895352516b9facb989d656f05be8', 'prefix': 't11z', 'created': datetime.datetime(2021, 5, 29, 14, 45, 4, 421305), 'last_activity': None}]
(Background on this error at: http://sqlalche.me/e/13/f405)
[W 2021-05-29 14:45:04.430 JupyterHub base:110] Rolling back session due to database error (psycopg2.errors.DatatypeMismatch) value type decimal doesn't match type int of column "expires_at"
HINT: you will need to rewrite or cast the expression
[SQL: INSERT INTO oauth_access_tokens (client_id, grant_type, expires_at, refresh_token, refresh_expires_at, user_id, session_id, hashed, prefix, created, last_activity) VALUES (%(client_id)s, %(grant_type)s, %(expires_at)s, %(refresh_token)s, %(refresh_expires_at)s, %(user_id)s, %(session_id)s, %(hashed)s, %(prefix)s, %(created)s, %(last_activity)s) RETURNING oauth_access_tokens.id]
[parameters: {'client_id': 'jupyterhub-user-user2', 'grant_type': 'authorization_code', 'expires_at': 1622303104.418992, 'refresh_token': 'FVJ8S4is0367LlEMnxIiEIoTOeoxhf', 'refresh_expires_at': None, 'user_id': 662636890939424770, 'session_id': '4e041a2bfcb34a34a00033a281bc1236', 'hashed': 'sha512:1:3b18deae37fbf50a:03df035736960af14e19196e1d13fd74f55c21f17405119f80e75817ff37c7567fab089a3d40b97a57f94b54065ee56f7260895352516b9facb989d656f05be8', 'prefix': 't11z', 'created': datetime.datetime(2021, 5, 29, 14, 45, 4, 421305), 'last_activity': None}]
(Background on this error at: http://sqlalche.me/e/13/f405)
[E 2021-05-29 14:45:04.443 JupyterHub log:173] {
"Host": "hub:8081",
"User-Agent": "python-requests/2.25.1",
"Accept-Encoding": "gzip, deflate",
"Accept": "*/*",
"Connection": "keep-alive",
"Content-Type": "application/x-www-form-urlencoded",
"Authorization": "token [secret]",
"Content-Length": "190"
}
[E 2021-05-29 14:45:04.444 JupyterHub log:181] 500 POST /hub/api/oauth2/token (user2 10.42.80.10) 63.28ms
```
Everything worked when I changed:
`expires_at=orm.OAuthAccessToken.now() + token['expires_in'],`
to:
`expires_at=int(orm.OAuthAccessToken.now() + token['expires_in']),`
That's what this PR is about.
As a side note, the `black` formatter adjusted the `orm_client = orm.OAuthClient(identifier=client_id,)` line, but I guess that should be fine. Please feel free to revert this change if needed.
(Upd): added the missing `int` conversion.
Signed-off-by: Erik Sundell <erik.i.sundell@gmail.com>
- update references to default branch name in docs, workflows
- use HEAD in github urls, which always works regardless of default branch name
- fix petstore URLs since the old petstore links seem to have stopped working
to merge, in order:
- [x] approve this PR
- [x] rename the default branch to main in settings
- [x] merge this PR
Related tangent: I've been using [this git default-branch](https://github.com/minrk/git-stuff/blob/main/bin/git-default-branch) to help with my aliases and friends working with repos with different branch names.
Signed-off-by: Min RK <benjaminrk@gmail.com>
...where I thought it already was! Instead of on the test class.
and fix the logic for when it is called a bit:
- call on *all* Spawners, not just the default
- call on named server deletion when remove=True
closes 3451, finishes 3337
Signed-off-by: Min RK <benjaminrk@gmail.com>
should never occur in real applications where only one loop is run, but may occur in tests if the Proxy object lives longer than the loop that is running when it's created (imported?).
I *suspect* this is the source of our intermittent test failures with:
> got Future <Future pending> attached to a different loop
But since they are intermittent, it's hard to be sure, even if this PR passes.
The issue: we were allocating an asyncio.Lock(), which in turn grabs a handle on the current event loop, at *method definition time* in the decorator, instead of *call time*.
The solution: allocate the lock at call time *and* double-check that we never use a lock across event loops by storing the locks per-loop.
This should change nothing for 'real' hub instances, where only one loop is ever running, only tests where we start and stop loops a bunch.
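A minimal sketch of the per-loop pattern (illustrative; not JupyterHub's exact decorator):

```python
import asyncio
from functools import wraps

def with_per_loop_lock(method):
    # illustrative decorator: the lock is created at call time, one per loop
    locks = {}  # event loop -> asyncio.Lock

    @wraps(method)
    async def locked_method(*args, **kwargs):
        loop = asyncio.get_running_loop()  # call time, inside the running loop
        if loop not in locks:
            locks[loop] = asyncio.Lock()  # never shared across loops
        async with locks[loop]:
            return await method(*args, **kwargs)

    return locked_method
```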
Signed-off-by: Min RK <benjaminrk@gmail.com>
and clarify warning when a base handler isn't patched that auth is still being applied
- reorganize patch steps into functions for easier re-use
- patch notebook and jupyter_server handlers if they are already imported
- run patch after initialize to ensure extensions have done their importing before we check what's present
- apply class-level patch even when instance-level patch is happening to avoid triggering patch on every request
This change isn't as big as it looks, because it's mostly moving some re-used code to a couple of functions.
closes https://github.com/jupyter-server/jupyter_server/issues/488
Signed-off-by: Min RK <benjaminrk@gmail.com>
Pin references to github actions we rely on in workflows with jobs that reference GitHub secrets that could get exposed.
Signed-off-by: Min RK <benjaminrk@gmail.com>
When the hub is running in API-only mode, it's
very useful to have the proxy know where to send
URLs that would normally be serviced by the hub.
For example, / might go to a service that renders
a home page, while `/user` might go to a service that
tells the user their server is dead.
Right now, this happens 'out of band', with a process
that has to talk to the proxy directly. This is a
bit messy - the routes need to be re-added when the
proxy restarts, the hub might try to remove them, etc.
By adding support for this in the hub itself, all
this complexity is removed and the hub continues
to own all the routes in the proxy.
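A sketch of what that could look like (assuming the proxy trait is exposed as `c.Proxy.extra_routes`; check the docs for the exact name and owner class):

```python
# jupyterhub_config.py (illustrative service targets)
c.Proxy.extra_routes = {
    "/": "http://home-page-service:8000",  # custom landing page
    "/user/": "http://server-notice-service:8000",  # 'your server is not running'
}
```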
This allows for more flexible customization of the login page,
since an extending template can now re-use the login form
by overriding the new block.
This was not cleanly possible before since the main container
was part of the very same block as the form code.
fixes #3414
so changing cookie age changes oauth token expiry,
since these are what are stored in those cookies anyway,
it makes sense for them to expire at the same time
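For example (a sketch using the existing cookie-age setting):

```python
# jupyterhub_config.py: cookie age, and now also oauth token expiry, of one day
c.JupyterHub.cookie_max_age_days = 1
```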
When an oauth client changes, we delete all the tokens
associated with that client. This invalidates all user sessions
for that oauth client, and the oauth client's users will need to
go through the OAuth workflow again after the cache period (specified
by cache_max_age in HubAuth, 5min by default). This is fine in theory,
since oauth client information doesn't change frequently.
However, we were deleting and re-adding all oauth clients each time
the hub started! This was unnecessary, since the data was going to
be the same 99% of the time. The rest of the time, we should just update,
preventing unnecessary churn.
This PR does that.
Ref https://github.com/yuvipanda/jupyterhub-configurator/issues/2
Ref https://github.com/berkeley-dsep-infra/datahub/issues/2284
get_current_user returns a User model instead of a dict.
using cookies for Hub auth is deprecated, so removed
that option and refactored get_current_user
makes testing a PR even easier since we build an sdist and wheel for every PR and push.
Since artifacts are double-archived, it's not quite as simple as giving a URL to install from,
but this at least makes it available. To use:
- download and unpack zip
- `pip install path/to/whl`
apply patch directly to BaseHandler instead of each handler instance
so that overrides can still take effect (i.e. APIHandler raising 403 instead of redirecting)
Merge request #3257 fixed #3256 only in getting-started/services-basics.md.
There is still a reference to the JupyterHub cull-idle example in reference/services.md.
setting per_page in constructor resolves before max_per_page limit is updated from config,
preventing max_per_page from being increased beyond the default limit
we already loaded these values anyway in the first instance,
so remove the redundant Pagination object
Running the curl command as-is returns a 500 with `json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)`. This converts the payload to proper JSON.
The feature is disabled by default.
If enabled (by setting `login_term_url`), users will have to check a
checkbox to accept the terms and conditions in order to log in.
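A sketch of enabling it (assuming `login_term_url` is exposed as a JupyterHub trait; check the implementation for the exact owner class):

```python
# jupyterhub_config.py (illustrative URL)
c.JupyterHub.login_term_url = "https://example.org/terms"
```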
jinja2's async support requires Python 3.6+. That should
be an implementation detail - so we render it in the main
thread (current behavior) but pretend we did not
write_error is a synchronous method called by an async
method from inside the event loop. This means we can't just
schedule an async render_templates in the same loop and wait
for it - that would deadlock.
jinja2 compiles your templates differently based on whether you
enable async support or not. Templates compiled with async
support can't be used in cases like ours, where we already
have an event loop running and are calling a sync function. So
we maintain two almost identical jinja2 environments.
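A minimal sketch of the two-environment approach (variable names are illustrative):

```python
from jinja2 import Environment, FileSystemLoader

loader = FileSystemLoader("templates")
# sync environment: safe to call from write_error inside a running event loop
template_env = Environment(loader=loader)
# async environment: used for normal (async) page rendering
template_env_async = Environment(loader=loader, enable_async=True)
```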
testing other branches is useful, and there's little cost to removing the conditions:
- we don't run PRs from our repo, so test runs aren't duplicated on the repo
- testing on a fork without opening a PR is still useful (I use this often)
- if we push a branch, it should probably be tested (e.g. a backport branch), and branch filters make that extra work
- the cost of running a few extra tests is low, especially given actions' current quotas and parallelism
The jupyterhub/tests/test_spawner.py::test_spawner_routing[has~x] test
failed on py37+ but not on py36, and I think it is due to a change in
Python's socket library.
This is a stacktrace from Python/3.7.9/x64/lib/python3.7/site-packages/urllib3/util/connection.py:61
```
> for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
E socket.gaierror: [Errno -2] Name or service not known
```
Here is the relevant documentation about socket.getaddrinfo:
https://docs.python.org/3.7/library/socket.html#socket.getaddrinfo
- Fixes typo (eolving -> evolving)
- re-use the word current instead of momentary for comprehensibility
- references JupyterHubs current state with its instead of the for comprehensibility
Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
allows selecting users based on the 'ready', 'active', or 'inactive' states of their servers
- ready: users who have any servers in the 'ready' state
- active: users who have any servers in the 'active' state (i.e. ready OR pending)
- inactive: users who have *no* servers in the 'active' state (inactive + active = all users, no overlap)
Does not change the user model, so a user with *any* ready servers will still return all their servers
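For example, a sketch of requesting only 'ready' users via the REST API (hub URL and token are placeholders):

```python
import requests

api_token = "<api-token>"  # placeholder
r = requests.get(
    "https://hub.example.org/hub/api/users",
    params={"state": "ready"},  # or "active" / "inactive"
    headers={"Authorization": f"token {api_token}"},
)
ready_users = r.json()
```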
Fixes issues with OAuth flows when internal_ssl is enabled.
When internal_ssl was enabled requests to non-internal endpoints failed
because the system CAs were not being loaded.
This caused failures with public OAuth providers with public CAs since
they would fail to validate.
We are users of the napoleon sphinx extension, which helps us parse our
Google Style Python Docstrings, and its syntax suggest we should use
indentation when we use more than one line for an entry in an
Arguments: or Returns: list.
For more details, see: https://github.com/jupyterhub/jupyterhub/pull/3151#issuecomment-676186565
- Explicitly mention min-8-char constraint
- Connect the api_token in the configuration with the one mentioned in auth requests
Co-authored-by: Mike Situ <msitu@ceresimaging.net>
for easier reuse with jupyter_server
mixins have a lot of assumptions about the NotebookApp structure.
Need to make sure these are met by jupyter_server (that's what tests are for!)
When using the `KubeSpawner` it is typical to disable the
`slow_spawn_timeout` by setting it to 0. `zero-to-jupyterhub-k8s`
does this by default [1]. However, this causes an immediate `TimeoutError`
which gets logged as a warning like this:
>User hub-stress-test-123 is slow to start (timeout=0)
This avoids the warning by checking the value and if disabled simply
returns without logging the warning.
[1] https://github.com/jupyterhub/zero-to-jupyterhub-k8s/commit/b4738edc5

Closes #3126
- Related issue: #3120. Closes: #3120.
- I realized that spawner.clear_state() is called before
spawner.post_stop_hook(). This was a bit surprising to me,
and caused some issues.
- I tried the naive strategy of moving clear_state to later and
setting the orm_state to `{}` at the point where it used to be
cleared.
- This tries to maintain the exception behavior of clear_state and
post_stop_hook, but it is not exactly identical.
- To review:
- I'm not sure this is a good idea!
- Carefully consider the implications of this. I am not at all sure
about unintended side-effects or what intended semantics are.
This was added in PR #2721 and by default results in just printing
out "10" without any context when starting the hub service. This
simply removes the orphan print statement.
I'm open to changing this to a debug log statement with context if
someone finds that useful, e.g.:
`self.log.debug('Effective init_spawners_timeout: %s', init_spawners_timeout)`
Bug #2852 describes an issue where templates cannot be found by
JupyterHub when using the Docker images built out of this repo. The
issue turned out to be due to missing node_modules at the time of build.
There is a hook in the `package.json` that causes node_modules to be
copied to the static/components directory post-install. If this is not
run, those components are not in the static directory and thus are not
included in the wheel when it is built.
Fix #2905 fixed one problem--the `bower-lite` hook script wasn't copied
to the Docker image, and so the hook couldn't run, but the other issue
is that the client dependencies are never explicitly built. They must be
built prior to the wheel build, and the hook script must have run so
they are copied to the ./static folder, which is included in the wheel
build thanks to [MANIFEST.in][1]
.. note::
This removes the verbose flag from the wheel build command. The
reason is that it generates a lot of writes to stdout. It seems that
wheel can switch (or always switches) to non-blocking mode, which can cause
EAGAIN to be raised, which leads to fun errors like:
BlockingIOError(.., 'write could not complete without blocking', ..)
The wheels fail to build if this error is raised. Removing the verbosity
flag is a quick solution (it drastically reduces writes to STDOUT), but
comes at the cost of more trouble debugging a failed wheel build. Adding
the "-v" back in the Dockerfile when debugging a build failure is still
possible. [Credit: @vbraun][2]
.. note::
This commit also removes some extraneous COPY operations during the
Docker build, in particular the /src/jupyterhub/share directory is
not used unless users have explicitly overridden their
jupyterhub_config.py to include it somehow. If the default
data_files_path behavior is used, JupyterHub should find the proper
static directory when the application loads.
Fixes: #2852
[1]: https://packaging.python.org/guides/using-manifest-in/
[2]:
https://github.com/travis-ci/travis-ci/issues/4704#issuecomment-348435959
- base Expiring class
- ensures expiring values (OAuthCode, OAuthAccessToken, APIToken) are not returned from `find`
- all expire appropriately via purge_expired (see the sketch below)
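A minimal sketch of the idea (illustrative, not the actual ORM code):

```python
from datetime import datetime

class Expiring:
    """Illustrative mixin for ORM classes with an `expires_at` timestamp."""

    expires_at = None  # subclasses define this as a datetime column

    @property
    def expired(self):
        return self.expires_at is not None and self.expires_at < datetime.utcnow()

    @classmethod
    def purge_expired(cls, db):
        # delete all expired instances of this class
        now = datetime.utcnow()
        for obj in db.query(cls).filter(cls.expires_at < now):
            db.delete(obj)
        db.commit()
```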
behaves more like one would expect (same as try get-key, except: return default)
without relying on cache presence or underlying key type (integer only)
This runs some of the tests with the latest traitlets.
We are looking into making a 5.0 release and would like to have some
confidence that it does not break too many things.
They are less relevant than other requests and could very well end up
cluttering the logs. It is not uncommon for these requests to be made
every second or every other second.
In case there are multiple singleuser notebooks at different
versions, we want to log each of those mismatches as a warning,
so this changes the global _version_mismatch_warning_logged flag
from a bool to a dict keyed by the hub/singleuser version-mismatch
combination. A test wrinkle is added for that scenario.
Part of #2970
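A sketch of the flag change (illustrative; the real check compares versions more carefully):

```python
# was: _version_mismatch_warning_logged = False
_version_mismatch_warning_logged = {}  # version-pair key -> already warned?

def _check_version(hub_version, singleuser_version, log):
    key = f"{hub_version}-{singleuser_version}"
    if hub_version != singleuser_version and not _version_mismatch_warning_logged.get(key):
        log.warning(
            "jupyterhub version %s != jupyterhub-singleuser version %s",
            hub_version,
            singleuser_version,
        )
        _version_mismatch_warning_logged[key] = True
```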
As a new contributor to jupyterhub, it took a while to get
up and running locally, mainly because I didn't have sqlite
installed, but also because I was flipping between README,
CONTRIBUTING and the actual contributing docs, which are all
a little bit different.
This does a few things:
- Updates the contributor sphinx docs to mention that how
one chooses to isolate their development environment is
up to them with a link to the detailed forum thread on
that topic.
- Updates the contributor sphinx docs to mention sqlite and
database setup in general. While in here, some trailing
whitespace is cleaned up.
- Leave a comment in CONTRIBUTING.md about the redundant
information in the docs on getting a development environment
setup. Long-term we should really get those merged so there
is a single authoritative document on how to get a dev env
setup for contributing to jupyterhub.
- Link to the jupyterhub gitter channel for asking questions.
If your jupyterhub and jupyterhub-singleuser instances
are running at different minor or greater versions a
warning gets logged per active server which can be a lot
when you have hundreds of active servers.
This adds a flag to that version mismatch logging logic
such that the warning is only logged once per restart
of the hub server.
Closes issue #2970
APIHandler.server_model unconditionally returns the Spawner's
user_options dict but it wasn't mentioned in the API reference
so it's added here. The description is taken from the docstring
on Spawner.user_options.
Closes issue #2965
Authorization header has the form "<type> <credentials>"
rather than checking for "token" only, preserve type value, which could be Bearer, Basic, etc.
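A sketch of the parsing (illustrative helper, not the actual code):

```python
def split_authorization(handler):
    # split "Authorization: <type> <credentials>", preserving <type>
    auth_header = handler.request.headers.get("Authorization", "")
    auth_type, _, credentials = auth_header.partition(" ")
    # auth_type may be "token", "Bearer", "Basic", ...
    return auth_type, credentials
```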
query on Server objects instead of User objects
avoids lots of ORM work on startup since there are typically a small number of running servers
relative to the total number of users
this also means that the users dict is not fully populated. Is that okay? I hope so.
Not exactly all, though, as some will be ignored by the .dockerignore
file. This change ensures we don't get future issues caused by a failure
to update what needs to be copied to the build stage, like we've
had recently.
This fixes #2852 by adding a script to package.json. But is this
enough? Should we perhaps look in MANIFEST.in and copy some more files
listed there?
This is all thanks to people coming together and helping out figuring
out the issue in https://github.com/jupyterhub/jupyterhub/issues/2852.
Thank you @shingo78 for spotting that we missed bower-lite and its role
and all others who reported and helped debug this!
Updated capitalisation of names. Addressed revisions.
Fleshed out the prerequisites and explanation of access control.
Added part of configuration section to set JupyterLab as the default interface.
Corrected need for sudo.
Added warning to reverse-proxy section to recommend use of HTTPS and a firewall.
- A trivial bug caused by my last change to #2397 - made possible by
the fact we didn't have a way to reliably test PAM stuff.
- Thanks to @narnish for noticing.
- Closes: #2875
- We now default to ubuntu bionic (18.04) and try once with ubuntu xenial
(16.04).
- We now always test Python 3.8 but allow it to fail, as compared to not
allowing it to fail and only testing it on tagged commits. This is a
bugfix I'd say.
- We now no longer test Python 3.5 and Python 3.6 dedicatedly without
any custom configuration like usage of subdomain, which allows us to
reduce the number of build jobs in a way that I think is a sensible
compromise.
Some notes:
- Added a conda-forge and DockerHub badge
- Added logo's and made us conform with the team-compass badges section
as can be found here:
https://jupyterhub-team-compass.readthedocs.io/en/latest/building-blocks/readme-badges.html
- Concluded that our CircleCI badge is good because it lets us overview
the repo's build systems, but bad because it only concerns the
documentation preview in PRs, which isn't that useful in a README's
header.
- Noted there was a CircleCI token in the badge, which I believe is meant
to be used with private repo access rather than public repo access. I'm
not sure we need that, but I made it a markdown/html comment for now.
- Decided to not manually add a line break between badges. I figured it
could make sense to break manually before the social badges instead of
automatically letting it wrap at some point, but we don't really know
the size of the window viewing so it felt like a bad idea to hardcode
that.
- When the Dockerfile was turned into a multi-stage build, it seems
the share/ directory was not copied to the final image. This
resulted in certain components (static/components/, static/css/)
being missing, which resulted in the JupyterHub share directory not
being findable (in jupyterhub/_data.py). This led to all kinds of
weird havoc, like templates not being findable (#2852).
- I am still unsure if this is the right fix, please check this well.
- Closes: #2852
- While debugging another problem, I noticed some failures to build
the C extensions in the logs. Adding build-essential should fix
that (also as mentioned in the logs themselves).
- Extensions failed for tornado, sqlalchemy, and pyrsistent (pvectorc)
and can be found by searching the previous output for "fail".
Closes #2819 by exiting JupyterHub directly with an error if a config
file has been specified for the config_file traitlet, for example
through the -f or --config flag, but isn't available on the file
system.
- In the cull script, the max_age and inactive_limit are used from the
outer scope. In the case that you add extra logic, one may want to
modify these values.
- In that case, you either have to rename them locally, or access the
outer scope with "nonlocal", the first of which is too much work,
the second of which has a high chance of introducing bugs (as it did
for me).
- This change introduces a fix for everyone. It doesn't change basic
functionality, but makes local modifications simpler (see the sketch below).
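A sketch of the mechanism (illustrative values, not the script's exact code): the outer values become default arguments, so they can be reassigned as plain locals.

```python
max_age = 8 * 3600  # outer-scope values (illustrative)
inactive_limit = 1800

def should_cull(inactive_seconds, age_seconds,
                max_age=max_age, inactive_limit=inactive_limit):
    # defaults are bound from the outer scope at definition time, so extra
    # logic can simply reassign these locals; no `nonlocal` needed
    return inactive_seconds > inactive_limit or age_seconds > max_age
```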
- Pass in user object & request object only explicitly.
Much better interface that is harder to break by internal
refactoring. We can always add more parameters if needed?
/user-redirect/ is used to help link to a particular url
in the logged in user's authenticated notebook. For example,
if I'm logged in as user 'yuvipanda' and hit the URL
/hub/user-redirect/git-pull, it'll redirect me to
/user/yuvipanda/git-pull. This is extremely useful in
connecting hub links to notebook server extensions, such
as nbgitpuller.
Admins might want to customize how this redirection is done -
for example, redirect users to different running servers
based on the nbgitpuller repository they are linking from.
Adding a hook here helps accomplish that.
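A sketch of such a hook (the signature shown here, path / request / user / base_url, is an assumption to verify against the docs):

```python
async def user_redirect_hook(path, request, user, base_url):
    # illustrative: send nbgitpuller links to a dedicated named server
    if path.startswith("git-pull"):
        return f"{base_url}user/{user.name}/workshop/{path}"
    return None  # None falls back to the default /user/<name>/<path> redirect

c.JupyterHub.user_redirect_hook = user_redirect_hook
```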
allows services to be explicitly blessed to skip the extra oauth confirmation page
added in 1.0
This confirmation page is unhelpful for many admin-managed services,
and is mainly intended for cross-user access.
The default behavior is unchanged, but services can now opt-out of confirmation
(as is done already for the user's own servers).
Use with caution, as this eliminates users' ability to confirm that a service
should be able to authenticate them.
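For example (a sketch; service options other than `oauth_no_confirm` are illustrative):

```python
# jupyterhub_config.py: bless an admin-managed service
c.JupyterHub.services = [
    {
        "name": "grafana",
        "url": "http://grafana:3000",
        "oauth_no_confirm": True,  # skip the oauth confirmation page
    }
]
```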
- API requests to non-running servers are not uncommon when you cull
servers and people leave tabs open and active. It returns with 503
and logs all headers, which can take up half of our total log lines
- This avoids logging headers for all 502 and 503 return statuses.
#2747 presented an alternative (more complex) implementation, but this
simpler approach turned out to be appropriate.
- Closes: #2747
In current versions of MySQL and MariaDB, `innodb_file_format`
and `innodb_large_prefix` have been removed. This allows them to not
exist and makes sure the row format is `Dynamic` (the default
in current versions).
If init_spawners takes too long (default: 10 seconds) to complete,
app start will be allowed to continue while finishing in the background.
Adds new `check` pending state for the initial check.
Checking lots of spawners can take a long time,
so allowing this to be async limits the impact on startup time
at the expense of starting the Hub in a not-quite-fully-ready state.
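For example (sketch):

```python
# jupyterhub_config.py: wait up to 10 seconds for init_spawners before
# continuing startup in the background (illustrative value)
c.JupyterHub.init_spawners_timeout = 10
```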
- Introduce the EventLog class from BinderHub for emitting
structured event data
- Instrument server starts and stops to emit events
- Defaults to not saving any events anywhere
The flask example in the documentation was still using the
input argument `cookie_cache_max_age` when instantiating a
`HubAuth` object. `cookie_cache_max_age` has been deprecated since
JupyterHub 0.8 and should be replaced by `cache_max_age`.
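A sketch of the corrected instantiation:

```python
from jupyterhub.services.auth import HubAuth

auth = HubAuth(
    api_token="<service-api-token>",  # placeholder
    cache_max_age=60,  # seconds; replaces the deprecated cookie_cache_max_age
)
```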
- Install pip in the docs conda env (or conda complains).
- Do not override page.html; the next/previous buttons are now handled by
alabaster_jupyterhub (this actually removes the duplicated next/prev
buttons).
- Use alabaster_jupyterhub when building locally; this makes it easy for
new contributors to get the _exact_ same appearance as on
readthedocs.
- cull_idle_servers.py gets the full server state, so is capable of
doing any kind of arbitrary logic on the profile in order to be more
flexible in culling.
- This patch does not change anything, but gives an embedded
(commented-out) example of how you can easily add custom logic to
the script.
- This was added as a template/demo for #2598.
* Add missing responses (doesn't include all possible responses yet)
* Refactor invalid multi in body parameters into a single parameter
* Change form type into valid formData
* Fix use of required fields
* Apply a few other minor fixes
Fixes https://github.com/jupyterhub/jupyterhub/issues/2566 to some
degree by making the announcement stand out using twitter-bootstrap
classes `alert` and `alert-warning`. Perhaps we could theme twitter
bootstrap or this alert specifically with jupyter related colors as well
though?
Windows doesn't have support for signal handling so it can't use the
signal handling capabilities of asyncio. Use the previous atexit
strategy on the Windows case instead.
Signed-off-by: Alejandro Del Castillo <alejandro.delcastillo@ni.com>
Big thanks to Erik, Tim, and Min for the great comments!
Change names to be more clear, add function doc comments,
change scoping on some functions, add handle_logout to let
people take custom logout actions, extract
render_logout_page from get method, add TODO.
AS A developer of a Logout handler
I WANT to be able to call a function to kill spawners and
do other backend logout stuff and a separate function to
forward the user along the logout chain.
I believe this PR adds (moderately private) methods to the
Logout Handler to do just that.
If you are reporting an issue with JupyterHub, please use the [GitHub issue](https://github.com/jupyterhub/jupyterhub/issues) search feature to check if your issue has been asked already. If it has, please add your comments to the existing issue.
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]
**Additional context**
Add any other context about the problem here.
- Running `jupyter troubleshoot` from the command line, if possible, and posting
its output would also be helpful.
- Running in `--debug` mode can also be helpful for troubleshooting.
[](https://github.com/jupyterhub/jupyterhub/actions)
# see me at: http://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyter/jupyterhub/master/docs/rest-api.yml#/default
swagger: '2.0'
# see me at: https://petstore3.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/HEAD/docs/rest-api.yml#/default
swagger: "2.0"
info:
title: JupyterHub
description: The REST API for JupyterHub
version: 0.9.0dev
version: 1.5.0
license:
name: BSD-3-Clause
schemes:
- [http, https]
schemes: [http, https]
securityDefinitions:
token:
type: apiKey
@@ -28,7 +27,7 @@ paths:
This endpoint is not authenticated for the purpose of clients and user
to identify the JupyterHub version before setting up authentication.
responses:
'200':
"200":
description: The JupyterHub version
schema:
type: object
@@ -44,7 +43,7 @@ paths:
JupyterHub's version and executable path,
and which Authenticator and Spawner are active.
responses:
'200':
"200":
description: Detailed JupyterHub info
schema:
type: object
@@ -79,13 +78,28 @@ paths:
/users:
get:
summary: List users
parameters:
- name: state
in: query
required: false
type: string
enum: ["inactive", "active", "ready"]
description: |
Return only users who have servers in the given state.
If unspecified, return all users.
active: all users with any active servers (ready OR pending)
ready: all users who have any ready servers (running, not pending)
inactive: all users who have *no* active servers (complement of active)
Added in JupyterHub 1.3
responses:
'200':
"200":
description: The Hub's user list
schema:
type: array
items:
$ref: '#/definitions/User'
$ref: "#/definitions/User"
post:
summary: Create multiple users
parameters:
@@ -104,13 +118,13 @@ paths:
description: whether the created users should be admins
type: boolean
responses:
'201':
"201":
description: The users have been created
schema:
type: array
description: The created users
items:
$ref: '#/definitions/User'
$ref: "#/definitions/User"
/users/{name}:
get:
summary: Get a user by name
@@ -121,10 +135,10 @@ paths:
required: true
type: string
responses:
'200':
"200":
description: The User model
schema:
$ref: '#/definitions/User'
$ref: "#/definitions/User"
post:
summary: Create a single user
parameters:
@@ -134,10 +148,10 @@ paths:
required: true
type: string
responses:
'201':
"201":
description: The user has been created
schema:
$ref: '#/definitions/User'
$ref: "#/definitions/User"
patch:
summary: Modify a user
description: Change a user's name or admin status
@@ -161,10 +175,10 @@ paths:
type: boolean
description: update admin (optional, if another key is updated i.e. name)
responses:
'200':
"200":
description: The updated user info
schema:
$ref: '#/definitions/User'
$ref: "#/definitions/User"
delete:
summary: Delete a user
parameters:
@@ -174,14 +188,12 @@ paths:
required: true
type: string
responses:
'204':
"204":
description: The user has been deleted
/users/{name}/activity:
post:
summary:
Notify Hub of activity for a given user.
description:
Notify the Hub of activity by the user,
summary: Notify Hub of activity for a given user.
description: Notify the Hub of activity by the user,
e.g. accessing a service or (more likely)
actively using a server.
parameters:
@@ -190,7 +202,7 @@ paths:
in: path
required: true
type: string
- body:
- name: body
in: body
schema:
type: object
@@ -202,34 +214,37 @@ paths:
Timestamp of last-seen activity for this user.
Only needed if this is not activity associated
with using a given server.
required:false
servers:
description:|
Register activity for specific servers by name.
The keys of this dict are the names of servers.
The default server has an empty name ('').
required:false
type:object
properties:
'<server name>':
"<server name>":
description:|
Activity for a single server.
type:object
required:
- last_activity
properties:
last_activity:
required:true
type:string
format:date-time
description:|
Timestamp of last-seen activity on this server.
example:
last_activity:'2019-02-06T12:54:14Z'
last_activity:"2019-02-06T12:54:14Z"
servers:
'':
last_activity:'2019-02-06T12:54:14Z'
"":
last_activity:"2019-02-06T12:54:14Z"
gpu:
last_activity:'2019-02-06T12:54:14Z'
last_activity:"2019-02-06T12:54:14Z"
responses:
"401":
$ref:"#/responses/Unauthorized"
"404":
description:Nosuch user
/users/{name}/server:
post:
summary:Start a user's single-user notebook server
@@ -239,19 +254,23 @@ paths:
in: path
required: true
type: string
- options:
- name: options
description: |
Spawn options can be passed as a JSON body
when spawning via the API instead of spawn form.
The structure of the options
will depend on the Spawner's configuration.
The body itself will be available as `user_options` for the
Spawner.
in: body
required: false
schema:
type: object
responses:
'201':
"201":
description: The user's notebook server has started
'202':
"202":
description: The user's notebook server has not yet started, but has been requested
delete:
summary: Stop a user's server
@@ -262,9 +281,9 @@ paths:
required: true
type: string
responses:
'204':
"204":
description: The user's notebook server has stopped
'202':
"202":
description: The user's notebook server has not yet stopped as it is taking a while to stop
/users/{name}/servers/{server_name}:
post:
@@ -276,11 +295,14 @@ paths:
required: true
type: string
- name: server_name
description: name given to a named-server
description: |
name given to a named-server.
Note that depending on your JupyterHub infrastructure there are character size limitations on `server_name`. Default spawner with K8s pod will not allow Jupyter Notebooks to be spawned with a name that contains more than 253 characters (keep in mind that the pod will be spawned with extra characters to identify the user and hub).
in: path
required: true
type: string
- options:
- name: options
description: |
Spawn options can be passed as a JSON body
when spawning via the API instead of spawn form.
@@ -288,11 +310,12 @@ paths:
will depend on the Spawner's configuration.
in: body
required: false
schema:
type: object
responses:
'201':
"201":
description: The user's notebook named-server has started
'202':
"202":
description: The user's notebook named-server has not yet started, but has been requested
delete:
summary: Stop a user's named-server
@@ -307,76 +330,106 @@ paths:
in: path
required: true
type: string
- name: remove
- name: body
in: body
required: false
schema:
type: object
properties:
remove:
type: boolean
description: |
Whether to fully remove the server, rather than just stop it.
Removing a server deletes things like the state of the stopped server.
in: body
required: false
type: boolean
Default: false.
responses:
'204':
"204":
description: The user's notebook named-server has stopped
'202':
"202":
description: The user's notebook named-server has not yet stopped as it is taking a while to stop
/users/{name}/tokens:
parameters:
- name: name
description: username
in: path
required: true
type: string
get:
summary: List tokens for the user
responses:
'200':
"200":
description: The list of tokens
schema:
type: array
items:
$ref: '#/definitions/Token'
$ref: "#/definitions/Token"
"401":
$ref: "#/responses/Unauthorized"
"404":
description: No such user
post:
summary: Create a new token for the user
parameters:
- name: expires_in
- name: token_params
in: body
required: false
schema:
type: object
properties:
expires_in:
type: number
required: false
in: body
description: lifetime (in seconds) after which the requested token will expire.
- name: note
note:
type: string
required: false
in: body
description: A note attached to the token for future bookkeeping
responses:
'201':
"201":
description: The newly created token
schema:
$ref: '#/definitions/Token'
$ref: "#/definitions/Token"
"400":
description: Body must be a JSON dict or empty
/users/{name}/tokens/{token_id}:
parameters:
- name: name
description: username
in: path
required: true
type: string
- name: token_id
in: path
required: true
type: string
get:
summary: Get the model for a token by id
responses:
'200':
"200":
description: The info for the new token
schema:
$ref: '#/definitions/Token'
$ref: "#/definitions/Token"
delete:
summary: Delete (revoke) a token by id
responses:
'204':
"204":
description: The token has been deleted
/user:
get:
summary: Return authenticated user's model
description:
parameters:
responses:
'200':
"200":
description: The authenticated user's model is returned.
schema:
$ref: "#/definitions/User"
/groups:
get:
summary: List groups
responses:
'200':
"200":
description: The list of groups
schema:
type: array
items:
$ref: '#/definitions/Group'
$ref: "#/definitions/Group"
/groups/{name}:
get:
summary: Get a group by name
@@ -387,10 +440,10 @@ paths:
required: true
type: string
responses:
'200':
"200":
description: The group model
schema:
$ref: '#/definitions/Group'
$ref: "#/definitions/Group"
post:
summary: Create a group
parameters:
@@ -400,10 +453,10 @@ paths:
required: true
type: string
responses:
'201':
"201":
description: The group has been created
schema:
$ref: '#/definitions/Group'
$ref: "#/definitions/Group"
delete:
summary: Delete a group
parameters:
@@ -413,7 +466,7 @@ paths:
required: true
type: string
responses:
'204':
"204":
description: The group has been deleted
/groups/{name}/users:
post:
@@ -437,10 +490,10 @@ paths:
items:
type: string
responses:
'200':
"200":
description: The users have been added to the group
schema:
$ref: '#/definitions/Group'
$ref: "#/definitions/Group"
delete:
summary: Remove users from a group
parameters:
@@ -462,18 +515,18 @@ paths:
items:
type: string
responses:
'200':
"200":
description: The users have been removed from the group
/services:
get:
summary: List services
responses:
'200':
"200":
description: The service list
schema:
type: array
items:
$ref: '#/definitions/Service'
$ref: "#/definitions/Service"
/services/{name}:
get:
summary: Get a service by name
@@ -484,16 +537,16 @@ paths:
required: true
type: string
responses:
'200':
"200":
description: The Service model
schema:
$ref: '#/definitions/Service'
$ref: "#/definitions/Service"
/proxy:
get:
summary: Get the proxy's routing table
description: A convenience alias for getting the routing table directly from the proxy
responses:
'200':
"200":
description: Routing table
schema:
type: object
@@ -501,7 +554,7 @@ paths:
post:
summary: Force the Hub to sync with the proxy
responses:
'200':
"200":
description: Success
patch:
summary: Notify the Hub about a new proxy
@@ -527,7 +580,7 @@ paths:
type: string
description: CONFIGPROXY_AUTH_TOKEN for the new proxy
responses:
'200':
"200":
description: Success
/authorizations/token:
post:
@@ -539,16 +592,17 @@ paths:
Logging in via this method is only available when the active Authenticator
accepts passwords (e.g. not OAuth).
parameters:
- name: username
- name: credentials
in: body
required: false
schema:
type: object
properties:
username:
type: string
- name: password
in: body
required: false
password:
type: string
responses:
'200':
"200":
description: The new API token
schema:
type: object
@@ -556,7 +610,7 @@ paths:
token:
type: string
description: The new API token.
'403':
"403":
description: The user can not be authenticated.
/authorizations/token/{token}:
get:
@@ -567,9 +621,9 @@ paths:
required: true
type: string
responses:
'200':
"200":
description: The user or service identified by the API token
Redirect users to this URL to begin the OAuth process.
It is not an API endpoint.
@@ -618,6 +672,11 @@ paths:
in: query
required: true
type: string
responses:
"200":
description: Success
"400":
description: OAuth2Error
/oauth2/token:
post:
summary: Request an OAuth2 token
@@ -629,31 +688,31 @@ paths:
parameters:
- name: client_id
description: The client id
in: form
in: formData
required: true
type: string
- name: client_secret
description: The client secret
in: form
in: formData
required: true
type: string
- name: grant_type
description: The grant type (always 'authorization_code')
in: form
in: formData
required: true
type: string
- name: code
description: The code provided by the authorization redirect
in: form
in: formData
required: true
type: string
- name: redirect_uri
description: The redirect url
in: form
in: formData
required: true
type: string
responses:
'200':
"200":
description: JSON response including the token
schema:
type: object
@@ -668,14 +727,28 @@ paths:
post:
summary: Shutdown the Hub
parameters:
- name: proxy
- name: body
in: body
schema:
type: object
properties:
proxy:
type: boolean
description: Whether the proxy should be shutdown as well (default from Hub config)
- name: servers
in: body
servers:
type: boolean
description: Whether users' notebook servers should be shutdown as well (default from Hub config)
responses:
"202":
description: Shutdown successful
"400":
description: Unexpected value for proxy or servers
# Descriptions of common responses
responses:
NotFound:
description: The specified resource was not found
Unauthorized:
description: Authentication/Authorization error
definitions:
User:
type: object
@@ -703,11 +776,10 @@ definitions:
format: date-time
description: Timestamp of last-seen activity from the user
servers:
type: object
type: array
description: The active servers for this user.
items:
schema:
$ref: '#/definitions/Server'
$ref: "#/definitions/Server"
Server:
type: object
properties:
@@ -745,6 +817,9 @@ definitions:
state:
type: object
description: Arbitrary internal state from this server's spawner. Only available on the hub's users list or get-user-by-name method, and only if a hub admin. None otherwise.
user_options:
type: object
description: User specified options for the user's spawned instance of a single-user server.
Group:
type: object
properties:
@@ -798,7 +873,7 @@ definitions:
description: The user that owns a token (undefined if owned by a service)
service:
type: string
description: The service that owns the token (undefined of owned by a user)
description: The service that owns the token (undefined if owned by a user)
note:
type: string
description: A note about the token, typically describing what it was created for.
The same JupyterHub API spec, as found here, is available in an interactive form
`here (on swagger's petstore) <http://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/master/docs/rest-api.yml#!/default>`__.
`here (on swagger's petstore) <https://petstore3.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/HEAD/docs/rest-api.yml#!/default>`__.
The `OpenAPI Initiative`_ (fka Swagger™) is a project used to describe
JupyterHub can be configured to record structured events from a running server using Jupyter's `Telemetry System`_. The types of events that JupyterHub emits are defined by `JSON schemas`_ listed at the bottom of this page_.
- [Press release on Jupyter and Cori](http://www.nersc.gov/news-publications/nersc-news/nersc-center-news/2016/jupyter-notebooks-will-open-up-new-possibilities-on-nerscs-cori-supercomputer/)
- [Moving and sharing data](https://www.nersc.gov/assets/Uploads/03-MovingAndSharingData-Cholia.pdf)
@@ -28,7 +30,7 @@ Please submit pull requests to update information or to add new institutions or
### University of California Davis
- [Spinning up multiple Jupyter Notebooks on AWS for a tutorial](https://github.com/mblmicdiv/course2017/blob/master/exercises/sourmash-setup.md)
- [Spinning up multiple Jupyter Notebooks on AWS for a tutorial](https://github.com/mblmicdiv/course2017/blob/HEAD/exercises/sourmash-setup.md)
Although not technically a JupyterHub deployment, this tutorial setup
may be helpful to others in the Jupyter community.
@@ -67,6 +69,7 @@ easy to do with RStudio too.
### University of Colorado Boulder
- (CU Research Computing) CURC
- [JupyterHub User Guide](https://www.rc.colorado.edu/support/user-guide/jupyterhub.html)
- Slurm job dispatched on Crestone compute cluster
- log troubleshooting
@@ -77,13 +80,17 @@ easy to do with RStudio too.
- Earth Lab at CU
- [Tutorial on Parallel R on JupyterHub](https://earthdatascience.org/tutorials/parallel-r-on-jupyterhub/)
### George Washington University
- [Jupyter Hub](http://go.gwu.edu/jupyter) with university single-sign-on. Deployed early 2017.
### HTCondor
- [HTCondor Python Bindings Tutorial from HTCondor Week 2017 includes information on their JupyterHub tutorials](https://research.cs.wisc.edu/htcondor/HTCondorWeek2017/presentations/TueBockelman_Python.pdf)
- [nbgraderutils](https://github.com/dice-group/nbgraderutils): Use JupyterHub + nbgrader + iJava kernel for online Java exercises. Used in lecture Statistical Natural Language Processing.
### Penn State University
- [Press release](https://news.psu.edu/story/523093/2018/05/24/new-open-source-web-apps-available-students-and-faculty): "New open-source web apps available for students and faculty" (but Hub is currently down; checked 04/26/19)
- [Deploy JupyterHub on a Supercomputer with SSH](https://zonca.github.io/2017/05/jupyterhub-hpc-batchspawner-ssh.html)
- [Run Jupyterhub on a Supercomputer](https://zonca.github.io/2015/04/jupyterhub-hpc.html)
- [Deploy JupyterHub on a VM for a Workshop](https://zonca.github.io/2016/04/jupyterhub-sdsc-cloud.html)
@@ -134,7 +146,10 @@ easy to do with RStudio too.
- Kristen Thyng - Oceanography
- [Teaching with JupyterHub and nbgrader](http://kristenthyng.com/blog/2016/09/07/jupyterhub+nbgrader/)
### Elucidata
- What's new in Jupyter Notebooks @[Elucidata](https://elucidata.io/):
- Using Jupyter Notebooks with Jupyterhub on GCP, managed by GKE - https://medium.com/elucidata/why-you-should-be-using-a-jupyter-notebook-8385a4ccd93d
In short, where you see `/user/name/notebooks/foo.ipynb` use `/hub/user-redirect/notebooks/foo.ipynb` (replace `/user/name` with `/hub/user-redirect`).
Sharing links to notebooks is a common activity,
and can look different based on what you mean.
Your first instinct might be to copy the URL you see in the browser,
e.g. `hub.jupyter.org/user/yourname/notebooks/coolthing.ipynb`.
However, let's break down what this URL means:
`hub.jupyter.org/user/yourname/` is the URL prefix handled by _your server_,
which means that sharing this URL is asking the person you share the link with
to come to _your server_ and look at the exact same file.
In most circumstances, this is forbidden by permissions because the person you share with does not have access to your server.
What actually happens when someone visits this URL will depend on whether your server is running and other factors.
But what is our actual goal?
A typical situation is that you have some shared or common filesystem,
such that the same path corresponds to the same document
(either the exact same document or another copy of it).
Typically, what folks want when they do sharing like this
is for each visitor to open the same file _on their own server_,
so Breq would open `/user/breq/notebooks/foo.ipynb` and
Seivarden would open `/user/seivarden/notebooks/foo.ipynb`, etc.
JupyterHub has a special URL that does exactly this!
It's called `/hub/user-redirect/...`.
So if you replace `/user/yourname` in your URL bar
with `/hub/user-redirect` any visitor should get the same
URL on their own server, rather than visiting yours.
In JupyterLab 2.0, this should also be the result of the "Copy Shareable Link"
2. If you need to allow for even more users, a dynamic number of servers can be used on a cloud;
take a look at the `Zero to JupyterHub with Kubernetes <https://github.com/jupyterhub/zero-to-jupyterhub-k8s>`__ .
Four subsystems make up JupyterHub:
* a **Hub** (tornado process) that is the heart of JupyterHub
* a **configurable http proxy** (node-http-proxy) that receives the requests from the client's browser
* multiple **single-user Jupyter notebook servers** (Python/IPython/tornado) that are monitored by Spawners
* an **authentication class** that manages how users can access the system
Besides these central pieces, you can add optional configurations through a `config.py` file and manage users' kernels from an admin panel. A simplification of the whole system can be seen in the figure below:
There are two broad categories of user environments that depend on what
@@ -141,8 +137,8 @@ When JupyterHub uses **container-based** Spawners (e.g. KubeSpawner or
DockerSpawner), the 'system-wide' environment is really the container image
which you are using for users.
In both cases, you want to *avoid putting configuration in user home
directories* because users can change those configuration settings. Also,
In both cases, you want to _avoid putting configuration in user home
directories_ because users can change those configuration settings. Also,
home directories typically persist once they are created, so they are
difficult for admins to update later.
@@ -179,3 +175,13 @@ The number of named servers per user can be limited by setting
```python
c.JupyterHub.named_server_limit_per_user = 5
```
## Switching to Jupyter Server
[Jupyter Server](https://jupyter-server.readthedocs.io/en/latest/) is a new Tornado Server backend for Jupyter web applications (e.g. JupyterLab 3.0 uses this package as its default backend).
By default, the single-user notebook server uses the (old) `NotebookApp` from the [notebook](https://github.com/jupyter/notebook) package. You can switch to using Jupyter Server's `ServerApp` backend (this will likely become the default in future releases) by setting the `JUPYTERHUB_SINGLEUSER_APP` environment variable to:
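`jupyter_server.serverapp.ServerApp` (the `ServerApp` import string; verify against the current docs).

A minimal sketch of setting this through spawner config:

```python
# jupyterhub_config.py: pass the variable to every spawned single-user server
c.Spawner.environment = {
    "JUPYTERHUB_SINGLEUSER_APP": "jupyter_server.serverapp.ServerApp",
}
```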
The thing which users directly connect to is the proxy, by default
@@ -22,7 +21,6 @@ The default JupyterHub proxy is
and that page has some docs. If you are using a different proxy, such
as Traefik, these instructions are probably not relevant to you.
## Configuration options
`c.JupyterHub.cleanup_servers = False` should be set, which tells the
`c.ConfigurableHTTPProxy.should_start = False` should be set, which tells the hub that the proxy should not be started by the hub (because you start it yourself).
`c.ConfigurableHTTPProxy.auth_token` should be set to a
token for authenticating communication with the proxy.
`c.ConfigurableHTTPProxy.api_url = 'http://localhost:8001'` should be
set to the URL which the hub uses to connect _to the proxy's API_.
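Putting the options above together, a minimal sketch of the hub side of this setup might look like the following (the token value is a placeholder):
```python
# jupyterhub_config.py -- hub configured to use an externally started proxy
c.JupyterHub.cleanup_servers = False          # leave servers running across hub restarts
c.ConfigurableHTTPProxy.should_start = False  # the proxy is started externally
c.ConfigurableHTTPProxy.auth_token = "generate-a-random-token"  # shared with the proxy
c.ConfigurableHTTPProxy.api_url = "http://localhost:8001"       # the proxy's API endpoint
```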
## Proxy configuration
You need to configure a service to start the proxy. An example command line for this is `configurable-http-proxy --ip=127.0.0.1 --port=8000 --api-ip=127.0.0.1 --api-port=8001 --default-target=http://localhost:8081 --error-target=http://localhost:8081/hub/error`. (Details for how to
do this are out of scope for this tutorial - for example it might be a
systemd service or run within another docker container). The proxy has no
configuration files; all configuration is via the command line and
environment variables.
`--api-ip` and `--api-port` (which tells the proxy where to listen) should match the hub's `ConfigurableHTTPProxy.api_url`.
`--ip`, `--port`, and other options configure the _user_ connections to the proxy.
`--default-target` and `--error-target` should point to the hub, and are used when users first navigate to the proxy.
what other options are needed, for example SSL options. Note that
these are configured in the hub if the hub is starting the proxy - you
need to move those options here.
## Docker image
You can use the [jupyterhub/configurable-http-proxy docker image](https://hub.docker.com/r/jupyterhub/configurable-http-proxy/).
A Service may have the following properties:
If a service is also to be managed by the Hub, it has a few extra options:
- `command: (str/Popen list)` - Command for JupyterHub to spawn the service.
  - Only use this if the service should be a subprocess.
  - If command is not specified, the Service is assumed to be managed externally.
  - If a command is specified for launching the Service, the Service will be started and managed by the Hub.
- `environment: dict` - additional environment variables for the Service.
- `user: str` - the name of a system user to manage the Service. If unspecified, run as the same user as the Hub.
This example would be configured as follows in `jupyterhub_config.py`:
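(The original example is not reproduced here; the following is a minimal sketch using the properties above, with illustrative names and values.)
```python
# jupyterhub_config.py -- a Hub-managed service spawned as a subprocess
c.JupyterHub.services = [
    {
        "name": "my-service",                        # route: /services/my-service/
        "command": ["python3", "-m", "my_service"],  # command makes it Hub-managed
        "environment": {"MY_SERVICE_SETTING": "value"},
    }
]
```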
### Authenticating tornado services with JupyterHub
Since most Jupyter services are written with tornado,
we include a mixin class, [`HubAuthenticated`][hubauthenticated],
for quickly authenticating your own tornado services with JupyterHub.
Tornado's `@web.authenticated` method calls a Handler's `.get_current_user`
method to identify the user. Mixing in `HubAuthenticated` defines `.get_current_user` to use Hub authentication:
```python
class MyHandler(HubAuthenticated, web.RequestHandler):
    ...
```
The HubAuth will automatically load the desired configuration from the Service
environment variables.
If you want to limit user access, you can specify allowed users through either the
`.hub_users` attribute or `.hub_groups`. These are sets that check against the
username and user group list, respectively. If a user matches neither the user
list nor the group list, they will not be allowed access. If both are left
undefined, then any user will be allowed.
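For example, a minimal sketch (user names are illustrative):
```python
class MyHandler(HubAuthenticated, web.RequestHandler):
    # only these JupyterHub users may access this handler
    hub_users = {"inara", "kaylee"}

    @web.authenticated
    def get(self):
        ...
```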
### Implementing your own Authentication with JupyterHub
If you don't want to use the reference implementation
(e.g. you find the implementation a poor fit for your Flask app),
you can implement authentication via the Hub yourself.
We recommend looking at the [`HubAuth`][hubauth] class implementation for reference,
and taking note of the following process:
1. Retrieve the cookie `jupyterhub-services` from the request.
2. Make an API request `GET /hub/api/authorizations/cookie/jupyterhub-services/cookie-value`,
where cookie-value is the url-encoded value of the `jupyterhub-services` cookie.
This request must be authenticated with a Hub API token in the `Authorization` header,
for example using the `api_token` from your [external service's configuration](#externally-managed-services).
For example, with [requests][]:
```python
# encrypted_cookie comes from the incoming request;
# api_token comes from the service's configuration (see above)
r = requests.get(
    '/'.join([
        "http://127.0.0.1:8081/hub/api",
        "authorizations/cookie/jupyterhub-services",
        quote(encrypted_cookie, safe=''),
    ]),
    headers={'Authorization': 'token %s' % api_token},
)
r.raise_for_status()
user = r.json()
```
3. On success, the reply will be a JSON model describing the user:
```json
{
"name": "inara",
"groups": ["serenity", "guild"],
"groups": ["serenity", "guild"]
}
```
An example of using an Externally-Managed Service and authentication is
in [nbviewer README][nbviewer example] section on securing the notebook viewer,
and an example of its configuration is found [here](https://github.com/jupyter/nbviewer/blob/ed942b10a52b6259099e2dd687930871dc8aac22/nbviewer/providers/base.py#L95).
nbviewer can also be run as a Hub-Managed Service as described in the [nbviewer README][nbviewer example].
A custom Spawner needs to be able to take three actions:
- start the process
- poll whether the process is still running
- stop the process
## Examples
Custom Spawners for JupyterHub can be found on the [JupyterHub wiki](https://github.com/jupyterhub/jupyterhub/wiki/Spawners).
Some examples include:
- [DockerSpawner](https://github.com/jupyterhub/dockerspawner) for spawning user servers in Docker containers
  - `dockerspawner.DockerSpawner` for spawning identical Docker containers for each user
  - `dockerspawner.SystemUserSpawner` for spawning Docker containers with an environment and home directory for each user
  - both `DockerSpawner` and `SystemUserSpawner` also work with Docker Swarm for launching containers on remote machines
- [SudoSpawner](https://github.com/jupyterhub/sudospawner) enables JupyterHub to
run without being root, by spawning an intermediate process via `sudo`
- [BatchSpawner](https://github.com/jupyterhub/batchspawner) for spawning remote
servers using batch systems
- [RemoteSpawner](https://github.com/zonca/remotespawner) to spawn notebooks
  on a remote server and tunnel the port via SSH
- [YarnSpawner](https://github.com/jupyterhub/yarnspawner) for spawning notebook
servers in YARN containers on a Hadoop cluster
- [SSHSpawner](https://github.com/NERSC/sshspawner) to spawn notebooks
on a remote server using SSH
## Spawner control methods
### Spawner.start
`Spawner.start` should start the single-user server for a single user. Information about the user can be retrieved from `self.user`,
an object encapsulating the user's name, authentication, and server info.
The return value of `Spawner.start` should be the (ip, port) of the running server.
**NOTE:** When writing coroutines, _never_ `yield` in between a database change and a commit.
Most `Spawner.start` functions will look similar to this example:
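The full example is not shown here, but a minimal sketch has this shape (`random_port` is `jupyterhub.utils.random_port`; `_start_process_somehow` is a hypothetical helper standing in for your actual launch mechanism):
```python
async def start(self):
    self.ip = "127.0.0.1"
    self.port = random_port()
    # env includes the JUPYTERHUB_* variables the single-user server requires
    env = self.get_env()
    cmd = []
    cmd.extend(self.cmd)        # typically ['jupyterhub-singleuser']
    cmd.extend(self.get_args())
    await self._start_process_somehow(cmd, env)  # hypothetical launch helper
    return (self.ip, self.port)
```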
### Spawner.poll
`Spawner.poll` should check if the spawner is still running.
It should return `None` if it is still running,
and an integer exit status, otherwise.
For the local process case, `Spawner.poll` uses `os.kill(PID, 0)`
to check if the local process is still running. On Windows, it uses `psutil.pid_exists`.
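As a standalone sketch of that check (outside any Spawner class):
```python
import errno
import os

def pid_is_running(pid: int) -> bool:
    """Return True if a local process with this PID appears to be running."""
    try:
        os.kill(pid, 0)  # signal 0 performs an existence check only
    except OSError as e:
        if e.errno == errno.ESRCH:  # no such process
            return False
        # EPERM etc.: the process exists but belongs to another user
    return True
```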
### Spawner.stop
`Spawner.stop` should stop the process. It must be a tornado coroutine, which should return when the process has finished exiting.
## Spawner state
JupyterHub should be able to stop and restart without tearing down single-user notebook servers.
```python
def clear_state(self):
    super().clear_state()
    self.pid = 0
```
## Spawner options form
(new in 0.4)
If the `Spawner.options_form` is defined, when a user tries to start their server, they will see a page rendering this form.
If `Spawner.options_form` is undefined, the user's server is spawned directly, and no spawn page is rendered.
See [this example](https://github.com/jupyterhub/jupyterhub/blob/HEAD/examples/spawn-form/jupyterhub_config.py) for a form that allows custom CLI args for the local spawner.
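A hypothetical `options_form` is just a string of HTML rendered inside the spawn page's form, for example:
```python
c.Spawner.options_form = """
<label for="args">Extra notebook CLI arguments</label>
<input name="args" class="form-control">
"""
```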
### `Spawner.options_from_form`
which would return the parsed options as a dictionary.
When `Spawner.start` is called, this dictionary is accessible as `self.user_options`.
### How can I kill ports from JupyterHub managed services that have been orphaned?
I started JupyterHub + nbgrader on the same host without containers. When I try to restart JupyterHub + nbgrader with this configuration, errors appear that the service accounts cannot start because the ports are being used.
How can I kill the processes that are using these ports?
Run the following command:
```
sudo kill -9 $(sudo lsof -t -i:<service_port>)
```
Where `<service_port>` is the port used by the nbgrader course service. This configuration is specified in `jupyterhub_config.py`.
### Why am I getting a Spawn failed error message?
After successfully logging in to JupyterHub with a compatible authenticator, I get a 'Spawn failed' error message in the browser. The JupyterHub logs contain `jupyterhub KeyError: "getpwnam(): name not found: <my_user_name>"`.
This issue occurs when the authenticator requires a local system user to exist. In these cases, you need to use a spawner
that does not require an existing system user account, such as `DockerSpawner` or `KubeSpawner`.
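For example, a minimal sketch of switching the spawner in `jupyterhub_config.py` (requires the `dockerspawner` package):
```python
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
```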
### How can I run JupyterHub with sudo but use my current env vars and virtualenv location?
When launching JupyterHub with `sudo jupyterhub` I get import errors and my environment variables don't work.
When launching services with `sudo ...` the shell won't have the same environment variables or `PATH`s in place. The most direct way to solve this issue is to use the full path to your python environment and add environment variables. For example:
```bash
sudo MY_ENV=abc123 \
/home/foo/venv/bin/python3 \
/srv/jupyterhub/jupyterhub
```
### How can I view the logs for JupyterHub or the user's Notebook servers when using the DockerSpawner?
Use `docker logs <container>` where `<container>` is the container name defined within `docker-compose.yml`. For example, to view the logs of the JupyterHub container use:
```
docker logs hub
```
By default, the user's notebook server is named `jupyter-<username>` where `username` is the user's username within JupyterHub's db. So if you wanted to see the logs for user `foo` you would use:
```
docker logs jupyter-foo
```
You can also tail logs to view them in real time using the `-f` option:
```
docker logs -f hub
```
## Errors
There are two likely reasons for this:
1. The single-user server cannot connect to the Hub's API (networking
configuration problems)
2. The single-user server cannot _authenticate_ its requests (invalid token)
#### Symptoms
The main symptom is a failure to load _any_ page served by the single-user
server, met with a 500 error. This is typically the first page at `/user/<your_name>`
after logging in or clicking "Start my server". When a single-user notebook server
receives a request, the notebook server makes an API request to the Hub to
You should see a similar 200 message, as above, in the Hub log when you first
visit your single-user notebook server. If you don't see this message in the log, it
may mean that your single-user notebook server isn't connecting to your Hub.
If you see 403 (forbidden) like this, it's likely a token problem:
```
403 GET /hub/api/authorizations/cookie/jupyterhub-token-name/[secret] (@10.0.1.4) 4.14ms
```
After this, when you start your server via JupyterHub, it will build a
new container. If this was the underlying cause of the issue, you should see
your server again.
##### Proxy settings (403 GET)
When your whole JupyterHub sits behind an organization proxy (_not_ a reverse proxy like NGINX as part of your setup, and _not_ the configurable-http-proxy), the environment variables `HTTP_PROXY`, `HTTPS_PROXY`, `http_proxy`, and `https_proxy` might be set. This confuses the jupyterhub-singleuser servers: when connecting to the Hub for authorization, they connect via the organization proxy instead of directly to the Hub on localhost. The proxy might deny the request (403 GET), which makes the singleuser server think it has an invalid auth token. To circumvent this, add `<hub_url>,<hub_ip>,localhost,127.0.0.1` to the environment variables `NO_PROXY` and `no_proxy`.
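For example (substitute your actual hub host and IP for the placeholders):
```
export NO_PROXY="<hub_url>,<hub_ip>,localhost,127.0.0.1"
export no_proxy="$NO_PROXY"
```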
### Launching Jupyter Notebooks to run as an externally managed JupyterHub service with the `jupyterhub-singleuser` command returns a `JUPYTERHUB_API_TOKEN` error
[JupyterHub services](https://jupyterhub.readthedocs.io/en/stable/reference/services.html) allow processes to interact with JupyterHub's REST API. Example use-cases include:
- **Secure Testing**: provide a canonical Jupyter Notebook for testing production data to reduce the number of entry points into production systems.
- **Grading Assignments**: provide access to shared Jupyter Notebooks that may be used for management tasks such as grading assignments.
- **Private Dashboards**: share dashboards with certain group members.
If possible, try to run the Jupyter Notebook as an externally managed service with one of the provided [jupyter/docker-stacks](https://github.com/jupyter/docker-stacks).
Standard JupyterHub installations include a [jupyterhub-singleuser](https://github.com/jupyterhub/jupyterhub/blob/9fdab027daa32c9017845572ad9d5ba1722dbc53/setup.py#L116) command which is built from the `jupyterhub.singleuser:main` method. The `jupyterhub-singleuser` command is the default command when JupyterHub launches single-user Jupyter Notebooks. One of the goals of this command is to make sure the version of JupyterHub installed within the Jupyter Notebook coincides with the version of the JupyterHub server itself.
If you launch a Jupyter Notebook with the `jupyterhub-singleuser` command directly from the command line the Jupyter Notebook won't have access to the `JUPYTERHUB_API_TOKEN` and will return:
```
JUPYTERHUB_API_TOKEN env is required to run jupyterhub-singleuser.
Did you launch it manually?
```
If you plan on testing `jupyterhub-singleuser` independently from JupyterHub, then you can set the API token environment variable. For example, if you were to run the single-user Jupyter Notebook on the host:
```
export JUPYTERHUB_API_TOKEN=my_secret_token
jupyterhub-singleuser
```
With a docker container, pass in the environment variable with the run command:
```
docker run -d \
  -p 8888:8888 \
  -e JUPYTERHUB_API_TOKEN=my_secret_token \
  jupyter/datascience-notebook:latest
```
[This example](https://github.com/jupyterhub/jupyterhub/tree/HEAD/examples/service-notebook/external) demonstrates how to combine the use of the `jupyterhub-singleuser` environment variables when launching a Notebook as an externally managed service.
## How do I...?
You would then set in your `jupyterhub_config.py` file the `ssl_key` and `ssl_cert` options:
```python
c.JupyterHub.ssl_cert = "your_host-chained.crt"
c.JupyterHub.ssl_key = "your_host.key"
```
#### Example
Your certificate provider gives you the following files: `example_host.crt`,
where `ssl_cert` is example-chained.crt and `ssl_key` is your private key.
Then restart JupyterHub.
See also [JupyterHub SSL encryption](./getting-started/security-basics.html#ssl-encryption).
### Install JupyterHub without a network connection
notebook servers to default to JupyterLab:
### How do I set up JupyterHub for a workshop (when users are not known ahead of time)?
1. Set up JupyterHub using OAuthenticator for GitHub authentication
2. Configure admin list to have workshop leaders be listed with administrator privileges.
Users will need a GitHub account to login and be authenticated by the Hub.
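A minimal sketch of the corresponding `jupyterhub_config.py` (the authenticator class path and user name are illustrative):
```python
c.JupyterHub.authenticator_class = "oauthenticator.GitHubOAuthenticator"
# workshop leaders get administrator privileges
c.Authenticator.admin_users = {"workshop-leader"}
```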
Or use syslog:
```
jupyterhub | logger -t jupyterhub
```
## Troubleshooting commands
The following commands provide additional detail about installed packages,
[FastAPI](https://fastapi.tiangolo.com/) is a popular new web framework attractive for its type hinting, async support, automatic doc generation (Swagger), and more. Their [Feature highlights](https://fastapi.tiangolo.com/features/) sum it up nicely.
# Swagger UI with OAuth demo
# Try it out locally
1. Install `fastapi` and other dependencies, then launch JupyterHub:
```
pip install -r requirements.txt
jupyterhub --ip=127.0.0.1
```
2. Visit http://127.0.0.1:8000/services/fastapi/docs. When going through the OAuth flow or getting a token from the control panel, you can log in with `testuser` / `passwd`.
3. Try interacting programmatically. If you create a new token in your control panel or pull out the `JUPYTERHUB_API_TOKEN` from the single-user environment, you can skip the third request below.
```
$ curl -X GET http://127.0.0.1:8000/services/fastapi/
{"Hello":"World"}
$ curl -X GET http://127.0.0.1:8000/services/fastapi/me
{"detail":"Must login with token parameter, cookie, or header"}
$ curl -X POST http://127.0.0.1:8000/hub/api/authorizations/token \
       -d '{"username": "testuser", "password": "passwd"}'
```
# PUBLIC_HOST
If you are running your service behind a proxy, or on a Docker / Kubernetes infrastructure, you might run into an error during OAuth that says `Mismatching redirect URI`. In the JupyterHub logs, there will be a warning along the lines of: `[W 2021-04-06 23:40:06.707 JupyterHub provider:498] Redirect uri https://jupyterhub.my.cloud/services/fastapi/oauth_callback != /services/fastapi/oauth_callback`. This happens because Swagger UI adds the request host, as seen in the browser, to the Authorization URL.
To solve that problem, the `oauth_redirect_uri` value in the service initialization needs to match what Swagger will auto-generate and what the service will use when POST'ing to `/oauth2/token`. In this example, setting the `PUBLIC_HOST` environment variable to your public-facing Hub domain (e.g. `https://jupyterhub.my.cloud`) should make it work.
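For example:
```
export PUBLIC_HOST=https://jupyterhub.my.cloud
```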
# Notes on security.py
FastAPI has a concept of [dependency injection](https://fastapi.tiangolo.com/tutorial/dependencies) using a `Depends` object (and a subclass, `Security`) that is automatically instantiated/executed when it is a parameter for your endpoint routes. You can utilize a `Depends` object for reusable common parameters or authentication mechanisms like the [`get_user`](https://fastapi.tiangolo.com/tutorial/security/get-current-user) pattern.
JupyterHub OAuth has three ways to authenticate: a `token` URL parameter; an `Authorization: Bearer <token>` header; and a (deprecated) `jupyterhub-services` cookie. FastAPI has helper functions that let us create `Security` (dependency injection) objects for each of those. When you need to allow multiple / optional authentication dependencies (`Security` objects), you can use the argument `auto_error=False` and it will return `None` instead of raising an `HTTPException`.
Endpoints that need authentication (`/me` and `/debug` in this example) can leverage the `get_user` pattern and effectively pull the user model from the Hub API when a request has authenticated with cookie / token / header, all using a simple dependency syntax (sketched here assuming a `get_current_user` dependency from `security.py` and a `User` model):
```python
@router.get("/me")
async def me(user: User = Depends(get_current_user)):
    "Function that needs to work with an authenticated user"
    return {"Hello": user.name}
```
# Notes on client.py
FastAPI is designed to be an asynchronous web server, so the interactions with the Hub API should be made asynchronously as well. Instead of using `requests` to get user information from a token/cookie, this example uses [`httpx`](https://www.python-httpx.org/). `client.py` defines a small function that creates a `Client` (the equivalent of a `requests.Session`) with the Hub API URL as its `base_url`, adding the `JUPYTERHUB_API_TOKEN` to every header.
Consider this a very minimal alternative to using `jupyterhub.services.auth.HubOAuth`.
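A minimal sketch of that pattern (names assumed, not the example's exact code):
```python
import os

import httpx

def get_client() -> httpx.AsyncClient:
    # client bound to the Hub API, sending the service's token on every request
    return httpx.AsyncClient(
        base_url=os.environ["JUPYTERHUB_API_URL"],
        headers={"Authorization": f"Bearer {os.environ['JUPYTERHUB_API_TOKEN']}"},
    )
```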