Of 18355 lines of logs in a 5-day-old hub instance,
8228 are just this message. That's 44% of the logs! We now
have prometheus metrics to monitor performance of this if
needed, and people can always turn on debug logging.
- CircleCI no longer used
- ubuntu/debian nodejs may be too old (12.0+ required)
- remove mention of mailing list
- Python 3.6 required
- Emphasise JupyterLab over notebook
so it can be applied to all cookie-authenticated POST requests
also parse the content-type header to handle e.g. `Content-Type: application/json; charset`
The content-type of Hub API requests used for user management, specifically for creating a user
is not validated and so the ‘text/plain’ type is accepted, where it must be ‘application/json’.
This commit adds validation for `Content-type` header for the /hub/api/users endpoint to only
allow requests with content-type as `application/json`
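
A minimal sketch of this kind of check, assuming a hypothetical helper `is_json_content_type` (not the actual handler code). The media type is split from any parameters before comparison, so `application/json; charset=utf-8` passes while `text/plain` is rejected:

```python
def is_json_content_type(header_value):
    """Return True if a Content-Type header value names application/json.

    Hypothetical helper for illustration; parameters after ';' are ignored.
    """
    if not header_value:
        return False
    # "application/json; charset=utf-8" -> "application/json"
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/json"
```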
improves backward compatibility for clients that haven't implemented pagination
by requesting the max page size by default instead of the new default page size
use `Accept: application/jupyterhub-pagination+json` to opt-in to the new response format
With a paginated API, we need to return pagination info (next page arguments, whether a next page exists, etc.),
but a simple list response doesn't give a good way to do that.
We can follow precedents and use a dict with an `items` field for the actual items,
and a `_pagination` field for info about pagination, including offset, limit, url for the next request
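
For illustration only, a sketch of what such a response could look like. The `items` and `_pagination` keys come from the description above; the helper name `paginate` and the `total` and `next` keys are assumptions, not the final API:

```python
def paginate(all_items, offset=0, limit=50):
    """Wrap one page of results in a dict that also carries pagination info."""
    page = all_items[offset:offset + limit]
    next_offset = offset + limit
    has_next = next_offset < len(all_items)
    return {
        "items": page,
        "_pagination": {
            "offset": offset,
            "limit": limit,
            "total": len(all_items),
            # arguments for the next request, or None on the last page
            "next": {"offset": next_offset, "limit": limit} if has_next else None,
        },
    }
```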
and govern GET /users|groups|services endpoints with these
Greatly simplifies filtering and pagination,
because these filters can be expressed in db filters,
unlike the potentially complex `read:users`.
Now the query itself will never return a model that should be excluded.
While writing the tests, I added more cleanup between tests.
We now ensure cleanup of all users and groups after each test,
which required updating some group tests which relied on this state leaking
distutils is slated for deprecation in the stdlib
we can use packaging for version parsing and setuptools in setup.py
packaging is technically an extra dependency, but rarely missing because it's so widespread
- ensure create_certs is called for managed services
- wait for services with http, which checks ssl connections (without http, only tcp was checked, which doesn't verify it works!)
adding `read:` to everything isn't right because not everything has a `read:` counterpart and not every `read:` has a write counterpart
includes a test verifying that every scope has a definition
- access:services for services
- access:users:servers for servers
- tokens automatically have access to their issuing client (if their owner does, too)
- Check access scope in HubAuth integration
check_db_locks checks for db lock state after the end of a function,
but wasn't properly waiting when it wrapped an async function,
meaning it would run the check while the async function was still outstanding,
causing possible spurious failures
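
A simplified sketch of the fix, assuming a generic decorator factory `check_after` (a hypothetical stand-in for check_db_locks). The point is that an async function must be awaited before the check runs:

```python
import asyncio
import functools
import inspect


def check_after(check):
    """Decorator factory: run `check()` once the wrapped function has finished."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                # await first, so the check runs after the coroutine completes
                result = await func(*args, **kwargs)
                check()
                return result

            return async_wrapper

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            check()
            return result

        return wrapper

    return decorator
```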
and fix warning condition for intersection overlap
- only warn when there's a group only on one side and a user or server only on the other,
otherwise there is no lost information to warn about (group and/or defined on both sides)
- correctly resolve servers as sub-scopes of user
- update references to default branch name in docs, workflows
- use HEAD in github urls, which always works regardless of default branch name
- fix petstore URLs since the old petstore links seem to have stopped working
should never occur in real applications where only one loop is run,
but may occur in tests if the Proxy object lives longer than the loop in which it runs
I suspect this is the source of our intermittent test failures with
> got Future <Future pending> attached to a different loop
- remove long-deprecated `POST /api/authorizations/token` for creating tokens
- deprecate but do not remove `GET /api/authorizations/token/:token` in favor of GET /api/user
- remove shared-cookie auth for services from HubAuth, rely on OAuth for browser-auth instead
- use `/hub/api/user` to resolve user instead of `/authorizations/token` which is now deprecated
instead of on the test class
and fix the logic for when it is called a bit:
- call on *all* Spawners, not just the default
- call on named server deletion when remove=True
and clarify warning when a base handler isn't patched
- reorganize patch steps into functions for easier re-use
- patch notebook and jupyter_server handlers if they are already imported
- run patch after initialize to ensure extensions have done their importing before we check
- Attach role limit to OAuthClient
- Attach authorized roles to OAuthCode
- pass roles from code to API token on completion
standard 'scopes' in oauth process are matched against our 'roles' instead of our low-level scopes
These only affected servers upgrading directly from 0.8 or earlier with still-running servers
0.8 was a long time ago, it's okay to require restarting servers for an upgrade that long
- 3-255 characters
- ascii lowercase, numbers, -
- must start with letter
- must not end with -
this lets us avoid url escaping issues in e.g. oauth params
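
The rules above can be expressed as a single regular expression. This is a sketch with a hypothetical helper name, not necessarily the pattern used in the code:

```python
import re

# 3-255 characters: a leading letter, then lowercase ASCII letters, digits,
# or '-', ending with a letter or digit (so never '-')
_name_pattern = re.compile(r"[a-z][a-z0-9\-]{1,253}[a-z0-9]")


def is_valid_name(name):
    """Return True if `name` satisfies the constraints listed above."""
    return _name_pattern.fullmatch(name) is not None
```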
When the hub is running in API-only mode, it's
very useful to have the proxy know where to send
URLs that would normally be serviced by the hub.
For example, / might go to a service that renders
a home page, while `/user` might go to a service that
tells the user their server is dead.
Right now, this happens 'out of band', with a process
that has to talk to the proxy directly. This is a
bit messy - the routes need to be re-added when the
proxy restarts, the hub might try to remove them, etc.
By adding support for this in the hub itself, all
this complexity is now removed and the hub continues
to own all the routes in the proxy
This allows for more flexible customization of the login page,
since the login form can now be reused in an extending template
via the new block.
This was not cleanly possible before since the main container
was part of the very same block as the form code.
fixes #3414
- merge oauth token fields into APITokens
- create oauth client 'jupyterhub' which owns current API tokens
- db upgrade is currently to drop both token tables, and force recreation on next start
so changing cookie age changes oauth token expiry,
since these are what are stored in those cookies anyway,
it makes sense for them to expire at the same time
When an oauth client changes, we delete all the tokens
associated with that client. This invalidates all user sessions
for that oauth client, and the oauth client's users will need to
go through the OAuth workflow again after the cache period (specified
by cache_max_age in HubAuth, 5min by default). This is fine in theory,
since oauth client information doesn't change frequently.
However, we were deleting and re-adding all oauth clients each time
the hub started! This was unnecessary, since the data was going to
be the same 99% of the time. The rest of the time, we should just update,
preventing unnecessary churn.
This PR does that.
Ref https://github.com/yuvipanda/jupyterhub-configurator/issues/2
Ref https://github.com/berkeley-dsep-infra/datahub/issues/2284
get_current_user returns a User model instead of a dict.
using cookies for Hub auth is deprecated, so removed
that option and refactored get_current_user
makes testing a PR even easier since we build an sdist and wheel for every PR and push
since artifacts are double-archived, it's not quite as simple as giving a URL to install from,
but this at least makes it available. To use:
- download and unpack zip
- `pip install path/to/whl`
apply patch directly to BaseHandler instead of each handler instance
so that overrides can still take effect (i.e. APIHandler raising 403 instead of redirecting)
Merge request #3257 fixed #3256 only in getting-started/services-basics.md
There is still a reference to the jupyterhub cull-idle example in reference/services.md
setting per_page in constructor resolves before max_per_page limit is updated from config,
preventing max_per_page from being increased beyond the default limit
we already loaded these values anyway in the first instance,
so remove the redundant Pagination object
Running the curl command as-is returns a 500 with
```
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
```
Convert the payload to proper JSON.
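
The same failure can be reproduced with Python's `json` module. The payload contents here are hypothetical; what matters is single versus double quotes:

```python
import json

bad_payload = "{'usernames': ['user1']}"    # single quotes: not valid JSON
good_payload = '{"usernames": ["user1"]}'   # double quotes: valid JSON


def try_parse(payload):
    """Return the parsed payload, or None if it is not valid JSON."""
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None
```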
The feature is disabled by default.
If enabled (by setting `login_term_url`), users will have to check the
checkbox to accept the terms and conditions in order to log in.
jinja2's async support requires Python 3.6+. That should
be an implementation detail - so we render it in the main
thread (current behavior) but pretend we did not
write_error is a synchronous method called by an async
method from inside the event loop. This means we can't just
schedule an async render_templates in the same loop and wait
for it - that would deadlock.
jinja2 compiles your code differently based on whether you
enable async support or not. Templates compiled with async
support can't be used in cases like ours, where we already
have an event loop running and calling a sync function. So
we maintain two almost identical jinja2 environments
testing other branches is useful, and there's little cost to removing the conditions:
- we don't run PRs from our repo, so test runs aren't duplicated on the repo
- testing on a fork without opening a PR is still useful (I use this often)
- if we push a branch, it should probably be tested (e.g. backport branch), and filters make this extra work
- the cost of running a few extra tests is low, especially given actions' current quotas and parallelism
The jupyterhub/tests/test_spawner.py::test_spawner_routing[has~x] test
failed in py37+ but not in py36, and I think something fundamental in
Python's socket library has changed.
This is a stacktrace from Python/3.7.9/x64/lib/python3.7/site-packages/urllib3/util/connection.py:61
```
> for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
E socket.gaierror: [Errno -2] Name or service not known
```
Here is relevant documentation about socket.getaddrinfo.
https://docs.python.org/3.7/library/socket.html#socket.getaddrinfo
- Fixes typo (eolving -> evolving)
- re-use the word current instead of momentary for comprehensibility
- references JupyterHub's current state with "its" instead of "the" for comprehensibility
Co-authored-by: Erik Sundell <erik.i.sundell@gmail.com>
allows selecting users based on the 'ready' 'active' or 'inactive' states of their servers
- ready: users who have any servers in the 'ready' state
- active: users who have any servers in the 'active' state (i.e. ready OR pending)
- inactive: users who have *no* servers in the 'active' state (inactive + active = all users, no overlap)
Does not change the user model, so a user with *any* ready servers will still return all their servers
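
A sketch of the selection logic in plain Python, assuming a hypothetical mapping of user names to server dicts with `ready` and `pending` flags. The real filter is applied at the database level:

```python
def select_users(users, state):
    """Return names of users whose servers match the requested state."""
    if state not in ("ready", "active", "inactive"):
        raise ValueError(f"unknown state: {state!r}")
    selected = []
    for name, servers in users.items():
        any_ready = any(s["ready"] for s in servers)
        # active means ready OR pending
        any_active = any(s["ready"] or s["pending"] for s in servers)
        if (
            (state == "ready" and any_ready)
            or (state == "active" and any_active)
            or (state == "inactive" and not any_active)
        ):
            selected.append(name)
    return selected
```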
Fixes issues with OAuth flows when internal_ssl is enabled.
When internal_ssl was enabled requests to non-internal endpoints failed
because the system CAs were not being loaded.
This caused failures with public OAuth providers with public CAs since
they would fail to validate.
We are users of the napoleon sphinx extension, which helps us parse our
Google Style Python Docstrings, and its syntax suggests we should use
indentation when we use more than one string for an entry in an
Arguments: or Returns: list.
For more details, see: https://github.com/jupyterhub/jupyterhub/pull/3151#issuecomment-676186565
- Explicitly mention min-8-char constraint
- Connect the api_token in the configuration with the one mentioned in auth requests
Co-authored-by: Mike Situ <msitu@ceresimaging.net>
for easier reuse with jupyter_server
mixins have a lot of assumptions about the NotebookApp structure.
Need to make sure these are met by jupyter_server (that's what tests are for!)
When using the `KubeSpawner` it is typical to disable the
`slow_spawn_timeout` by setting it to 0. `zero-to-jupyterhub-k8s`
does this by default [1]. However, this causes an immediate `TimeoutError`
which gets logged as a warning like this:
>User hub-stress-test-123 is slow to start (timeout=0)
This avoids the warning by checking the value and if disabled simply
returns without logging the warning.
[1] https://github.com/jupyterhub/zero-to-jupyterhub-k8s/commit/b4738edc5
Closes #3126
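
Roughly, the check looks like the following sketch. The function name `wait_for_spawn` and the `warn` callback are assumptions for illustration, not the actual handler code:

```python
import asyncio


async def wait_for_spawn(spawn_future, slow_spawn_timeout, warn):
    """Wait up to slow_spawn_timeout seconds; return True if the spawn finished."""
    if slow_spawn_timeout <= 0:
        # timeout disabled: return right away without logging a warning
        return spawn_future.done()
    try:
        # shield() keeps the spawn running if the wait times out
        await asyncio.wait_for(asyncio.shield(spawn_future), slow_spawn_timeout)
    except asyncio.TimeoutError:
        warn(f"Server is slow to start (timeout={slow_spawn_timeout})")
        return False
    return True
```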
- Related issue: #3120. Closes: #3120.
- I realized that spawner.clear_state() is called before
spawner.post_stop_hook(). This was a bit surprising to me,
and caused some issues.
- I tried the naive strategy of moving clear_state to later and
setting the orm_state to `{}` at the point where it used to be
cleared.
- This tries to maintain the exception behavior of clear_state and
post_stop_hook, but is exactly identical.
- To review:
- I'm not sure this is a good idea!
- Carefully consider the implications of this. I am not at all sure
about unintended side-effects or what intended semantics are.
This was added in PR #2721 and by default results in just printing
out "10" without any context when starting the hub service. This
simply removes the orphan print statement.
I'm open to changing this to a debug log statement with context if
someone finds that useful, e.g.:
`self.log.debug('Effective init_spawners_timeout: %s', init_spawners_timeout)`
Bug #2852 describes an issue where templates cannot be found by
JupyterHub when using the Docker images built out of this repo. The
issue turned out to be due to missing node_modules at the time of build.
There is a hook in the `package.json` that causes node_modules to be
copied to the static/components directory post-install. If this is not
run, those components are not in the static directory and thus are not
included in the wheel when it is built.
Fix #2905 fixed one problem--the `bower-lite` hook script wasn't copied
to the Docker image, and so the hook couldn't run, but the other issue
is that the client dependencies are never explicitly built. They must be
built prior to the wheel build, and the hook script must have run so
they are copied to the ./static folder, which is included in the wheel
build thanks to [MANIFEST.in][1]
.. note::
This removes the verbose flag from the wheel build command. The
reason is that it generates a lot of writes to stdout. It seems that
wheel can switch (or always switches) to non-blocking mode, which can cause
EAGAIN to be raised, which leads to fun errors like:
BlockingIOError(.., 'write could not complete without blocking', ..)
The wheels fail to build if this error is raised. Removing the verbosity
flag is a quick solution (it drastically reduces writes to STDOUT), but
comes at the cost of more trouble debugging a failed wheel build. Adding
the "-v" back in the Dockerfile when debugging a build failure is still
possible. [Credit: @vbraun][2]
.. note::
This commit also removes some extraneous COPY operations during the
Docker build, in particular the /src/jupyterhub/share directory is
not used unless users have explicitly overridden their
jupyterhub_config.py to include it somehow. If the default
data_files_path behavior is used, JupyterHub should find the proper
static directory when the application loads.
Fixes: #2852
[1]: https://packaging.python.org/guides/using-manifest-in/
[2]:
https://github.com/travis-ci/travis-ci/issues/4704#issuecomment-348435959
- base Expiring class
- ensures expiring values (OAuthCode, OAuthAccessToken, APIToken) are not returned from `find`
- all expire appropriately via purge_expired
behaves more like one would expect (same as try get-key, except: return default)
without relying on cache presence or underlying key type (integer only)
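
The intended behavior, sketched on a hypothetical `Registry` container (not the actual class touched by this change), where `get` simply delegates to item lookup:

```python
class Registry:
    """Container that can be indexed by integer id or by name."""

    def __init__(self, records):
        self._by_id = {r["id"]: r for r in records}
        self._by_name = {r["name"]: r for r in records}

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._by_id[key]
        return self._by_name[key]

    def get(self, key, default=None):
        # same as: try self[key], except KeyError: return default
        try:
            return self[key]
        except KeyError:
            return default
```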
This runs some of the tests with the latest traitlets.
We are looking into making a 5.0 release and would like to have some
confidence that it does not break too many things.
They are less relevant than other requests and could very well end up
cluttering the logs. It is not uncommon for these requests to be made
every second or every other second.
In case there are multiple singleuser notebooks at different
versions we want to log each of those mismatches as a warning
so this changes the global _version_mismatch_warning_logged flag
from a bool to a dict keyed by the hub/singleuser version mismatch
combination. A test wrinkle is added for that scenario.
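
In outline, the flag now works like this sketch. It is simplified: it compares full version strings rather than minor-or-greater versions, and `warn_on_mismatch` and the `log` callback are hypothetical names:

```python
# keyed by the hub/singleuser version combination
_version_mismatch_warning_logged = {}


def warn_on_mismatch(hub_version, singleuser_version, log):
    """Log each distinct mismatch once; return True if a warning was logged."""
    if hub_version == singleuser_version:
        return False
    key = f"{hub_version}:{singleuser_version}"
    if _version_mismatch_warning_logged.get(key):
        # this combination was already reported
        return False
    log(f"jupyterhub version {hub_version} != "
        f"jupyterhub-singleuser version {singleuser_version}")
    _version_mismatch_warning_logged[key] = True
    return True
```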
Part of #2970
As a new contributor to jupyterhub it took a while to get
up and running locally mainly because I didn't have sqlite
installed but also because I was flipping between README,
CONTRIBUTING and the actual contributing docs which are all
a little bit different.
This does a few things:
- Updates the contributor sphinx docs to mention that how
one chooses to isolate their development environment is
up to them with a link to the detailed forum thread on
that topic.
- Updates the contributor sphinx docs to mention sqlite and
database setup in general. While in here some trailing
whitespaces are cleaned up.
- Leave a comment in CONTRIBUTING.md about the redundant
information in the docs on getting a development environment
setup. Long-term we should really get those merged so there
is a single authoritative document on how to get a dev env
setup for contributing to jupyterhub.
- Link to the jupyterhub gitter channel for asking questions.
If your jupyterhub and jupyterhub-singleuser instances
are running at different minor or greater versions a
warning gets logged per active server which can be a lot
when you have hundreds of active servers.
This adds a flag to that version mismatch logging logic
such that the warning is only logged once per restart
of the hub server.
Closes issue #2970
APIHandler.server_model unconditionally returns the Spawner's
user_options dict but it wasn't mentioned in the API reference
so it's added here. The description is taken from the docstring
on Spawner.user_options.
Closes issue #2965
Authorization header has the form "<type> <credentials>"
rather than checking for "token" only, preserve type value, which could be Bearer, Basic, etc.
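
A sketch of the parsing, using a hypothetical helper name; the type is kept as given rather than being compared against "token":

```python
def parse_authorization(header_value):
    """Split an Authorization header into (type, credentials).

    Returns None if the header does not have the "<type> <credentials>" form.
    """
    parts = header_value.strip().split(None, 1)
    if len(parts) != 2:
        return None
    auth_type, credentials = parts
    return auth_type, credentials.strip()
```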
query on Server objects instead of User objects
avoids lots of ORM work on startup since there are typically a small number of running servers
relative to the total number of users
this also means that the users dict is not fully populated. Is that okay? I hope so.
Not exactly all though as some will be ignored by the .dockerignore
file. This change ensures we don't get future issues caused by a failure
to update what needs to be copied to the build stage, like the ones we've
had recently.
This fixes #2852 by adding a script part of package.json. But is this
enough? Should we perhaps look in MANIFEST.in and copy some more files
listed there?
This is all thanks to people coming together and helping out figuring
out the issue in https://github.com/jupyterhub/jupyterhub/issues/2852.
Thank you @shingo78 for spotting that we missed bower-lite and its role
and all others who reported and helped debug this!
Updated capitalisation of names. Addressed revisions.
Fleshed out the prerequisites and explanation of access control.
Added part of configuration section to set JupyterLab as the default interface.
Corrected need for sudo.
Added warning to reverse-proxy section to recommend use of HTTPS and firewall.
- A trivial bug caused by my last change to #2397 - made possible by
the fact we didn't have a way to reliably test PAM stuff.
- Thanks to @narnish for noticing.
- Closes: #2875
- We now default to ubuntu bionic (18.04) and try once with ubuntu xenial
(16.04).
- We now always test Python 3.8 but allow it to fail, as compared to not
allowing it to fail and only testing it on tagged commits. This is a
bugfix I'd say.
- We no longer run dedicated Python 3.5 and Python 3.6 jobs without
any custom configuration such as subdomain usage, which reduces the
number of build jobs. I think this is a sensible compromise.
Some notes:
- Added a conda-forge and DockerHub badge
- Added logos and made us conform with the team-compass badges section
as can be found here:
https://jupyterhub-team-compass.readthedocs.io/en/latest/building-blocks/readme-badges.html
- Concluded that our CircleCI badge is good because it lets us overview
the repo's build systems, but bad because it is only about
documentation previews in PRs, which isn't useful in a README's
header.
- Noted there was a CircleCI token in the badge, that I believe is meant
to be used with private repo access rather than public repo access. I'm
not sure we need that but I made it a markdown/html comment for now.
- Decided to not manually add a line break between badges. I figured it
could make sense to break manually before the social badges instead of
automatically letting it wrap at some point, but we don't really know
the size of the window viewing so it felt like a bad idea to hardcode
that.
- When the Dockerfile was turned into a multi-stage build, it seems
the share/ directory was not copied to the final image. This
resulted in certain components (static/components/, static/css/)
being missing, which resulted in the JupyterHub share directory not
being findable (in jupyterhub/_data.py). This led to all kinds of
weird havoc, like templates not being findable (#2852).
- I am still unsure if this is the right fix, please check this well.
- Closes: #2852
- While debugging another problem, I noticed some failures to build
the C extensions in the logs. Adding build-essential should fix
that (also as mentioned in the logs themselves).
- Extensions failed for tornado, sqlalchemy, and pyrsistent(pvectorc)
and can be found by searching the previous output for "fail".
Closes #2819 by exiting JupyterHub directly with an error if a config
file has been specified for the config_file traitlet, for example
through the -f or --config flag, but isn't available on the file
system.
- In the cull script, the max_age and inactive_limit are used from the
outer scope. In the case that you add extra logic, one may want to
modify these values.
- In that case, you either have to rename them locally, or access the
outer scope with "nonlocal", the first of which is too much work,
the second of which has a high chance of introducing bugs (as it did
for me).
- This change introduces a fix for everyone. It doesn't change basic
functionality, but makes local modifications simpler.
- Pass in user object & request object only explicitly.
Much better interface that is harder to break by internal
refactoring. We can always add more parameters if needed?
/user-redirect/ is used to help link to a particular url
in the logged in user's authenticated notebook. For example,
if I'm logged in as user 'yuvipanda' and hit the URL
/hub/user-redirect/git-pull, it'll redirect me to
/user/yuvipanda/git-pull. This is extremely useful in
connecting hub links to notebook server extensions, such
as nbgitpuller.
Admins might want to customize how this redirection is done -
for example, redirect users to different running servers
based on the nbgitpuller repository they are linking from.
Adding a hook here helps accomplish that.
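
As an illustration of what such a hook might do, here is a sketch. The signature `(path, request, user, base_url)`, the `request.query` dict, and the `SERVER_FOR_REPO` mapping are all assumptions made for this example, not the confirmed interface:

```python
# hypothetical mapping from nbgitpuller repository to a named server
SERVER_FOR_REPO = {"https://example.org/course-a": "course-a"}


def user_redirect_hook(path, request, user, base_url):
    """Return the URL to redirect to for a /hub/user-redirect/<path> request."""
    repo = request.query.get("repo", "")
    server_name = SERVER_FOR_REPO.get(repo, "")
    parts = [base_url.rstrip("/"), "user", user.name]
    if server_name:
        # send links from known repositories to a dedicated named server
        parts.append(server_name)
    parts.append(path.lstrip("/"))
    return "/".join(parts)
```

With no matching repository, this falls back to the default behavior described above, e.g. `/user/yuvipanda/git-pull`.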
allows services to be explicitly blessed to skip the extra oauth confirmation page
added in 1.0
This confirmation page is unhelpful for many admin-managed services,
and is mainly intended for cross-user access.
The default behavior is unchanged, but services can now opt-out of confirmation
(as is done already for the user's own servers).
Use with caution, as this eliminates users' ability to confirm that a service
should be able to authenticate them.
- API requests to non-running servers are not uncommon when you cull
servers and people leave tabs open and active. It returns with 503
and logs all headers, which can take up half of our total log lines
- This avoids logging headers for all 502 and 503 return statuses.
#2747 presented an alternative (more complex) implementation, but this
turned out to be appropriate.
- Closes: #2747
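
The resulting rule is small enough to sketch. The helper name is hypothetical, and the `>= 500` threshold for logging headers at all is an assumption:

```python
def should_log_headers(status):
    """Log request headers for server errors, except 502 and 503.

    502/503 are expected when requests hit servers that are not running.
    """
    return status >= 500 and status not in (502, 503)
```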
In current versions of MySQL and MariaDB `innodb_file_format`
and `innodb_large_prefix` have been removed. This allows them to not
exist and makes sure the format for the rows are `Dynamic` (default
for current versions).
If init_spawners takes too long (default: 10 seconds) to complete,
app start will be allowed to continue while finishing in the background.
Adds new `check` pending state for the initial check.
Checking lots of spawners can take a long time,
so allowing this to be async limits the impact on startup time
at the expense of starting the Hub in a not-quite-fully-ready state.
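
The pattern can be sketched with plain asyncio. `start_with_timeout` is a hypothetical name, and the real startup code does more bookkeeping:

```python
import asyncio


async def start_with_timeout(init_spawners, timeout=10):
    """Start init_spawners(); stop waiting after `timeout` seconds.

    Returns the task, which may still be running in the background.
    """
    task = asyncio.ensure_future(init_spawners())
    # asyncio.wait does not cancel the task when the timeout expires
    await asyncio.wait({task}, timeout=timeout)
    return task
```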
- Introduce the EventLog class from BinderHub for emitting
structured event data
- Instrument server starts and stops to emit events
- Defaults to not saving any events anywhere
The flask example in the documentation was still using the
input argument `cookie_cache_max_age` when instantiating
`HubAuth` object. `cookie_cache_max_age` is deprecated since
JupyterHub 0.8 and should be replaced by `cache_max_age`.
- Install pip in the docs conda env (or conda complains).
- Do not override page.html, the next/previous buttons are now handled by
alabaster_jupyterhub (this actually removes the duplicated next/prev
buttons)
- use alabaster_jupyterhub when building locally, this makes it easy for
new contributors to get the _exact_ same appearance as on
readthedocs.
- cull_idle_servers.py gets the full server state, so is capable of
doing any kind of arbitrary logic on the profile in order to be more
flexible in culling.
- This patch does not change anything, but gives an embedded
(commented out) example of how you can easily add custom logic to
the script.
- This was added as a template/demo for #2598.
* Add missing responses (doesn't include all possible responses yet)
* Refactor invalid multi in body parameters into a single parameter
* Change form type into valid formData
* Fix use of required fields
* Apply a few other minor fixes
Fixes https://github.com/jupyterhub/jupyterhub/issues/2566 to some
degree by making the announcement stand out using twitter-bootstrap
classes `alert` and `alert-warning`. Perhaps we could theme twitter
bootstrap or this alert specifically with jupyter related colors as well
though?
Windows doesn't have support for signal handling so it can't use the
signal handling capabilities of asyncio. Use the previous atexit
strategy on the Windows case instead.
Signed-off-by: Alejandro Del Castillo <alejandro.delcastillo@ni.com>
Big thanks to Erik, Tim, and Min for the great comments!
Change names to be more clear, add function doc comments,
change scoping on some functions, add handle_logout to let
people take custom logout actions, extract
render_logout_page from get method, add TODO.
AS A developer of a Logout handler
I WANT to be able to call a function to kill spawners and
do other backend logout stuff and a separate function to
forward the user along the logout chain.
I believe this PR adds (moderately private) methods to the
Logout Handler to do just that.
update several links (html targets don't work anymore)
had to add rest-api redirect so link would resolve,
since there isn't a ref for files in _static
- /user/:name no longer triggers implicit spawn at any point
- add /spawn-pending/:user/:server handler for pending page. This page has no side effects.
- spawn links point to /spawn/:user/:server to finish hooking up links for named servers and options_form handling
- It took me a bit longer than I would have liked for me to figure out
how to run the proxy separate from the hub. When I had to do this a
second time for a different hub, it also took me too long.
- This adds a page dedicated to running the proxy separate from the
hub, since it is relatively easy and has a high usability
improvement.
- Currently work in progress.
TEXT is wrong on Oracle, LargeBinary is wrong everywhere else.
Text seems to be the high-level type that maps to the right thing both places.
This results in no change on supported implementations, as Text == TEXT there.
- We don't need the extra normalization of that function.
- Also add in username_map support here. It probably isn't needed
most of the time with PAM, but it keeps things consistent and is
easier than documenting an exception.
Traitlets require quotes around literals, to avoid interpreting them
as datatypes other than string. However, quotes are problematic on the
notebook_dir case. On Windows, Popen will mis-interpret the quotes and
escape them, which trips the process spawn. To avoid problems, only
quote if necessary.
Signed-off-by: Alejandro del Castillo <alejandro.delcastillo@ni.com>
adds Authenticator.auth_refresh_age and Authenticator.refresh_pre_spawn config
- auth_refresh_age allows auth to expire (default: 5 minutes) before calling Authenticator.refresh_user.
- refresh_pre_spawn forces refresh prior to spawn (in case of auth tokens, etc.)
this introduces a race between the early RuntimeError being tested
and the no_patience causing handlers to return early if async start isn’t complete.
With tornado coroutines, an early RuntimeError could be guaranteed to resolve promptly, but asyncio isn’t as consistent,
possibly causing some of the recent flaky tests.
Windows doesn't have a pwd module. To avoid an import error on Windows,
move import statement inside functions that use pwd.
Signed-off-by: Alejandro del Castillo <alejandro.delcastillo@ni.com>
The current request handler might be needed to determine if the auth
data needs to be refreshed.
Signed-off-by: Alejandro del Castillo <alejandro.delcastillo@ni.com>
Use setuptools console_scripts functionality to create top level jupyter
& jupyterhub-single user entry point scripts on *nix, and executables on
Windows.
Signed-off-by: Alejandro del Castillo <alejandro.delcastillo@ni.com>
when a token doesn't identify a user, the response is None.
These results are cached, but the cache checked for `is None`,
causing failed-auth responses to effectively not be cached.
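
One common way to fix this kind of bug is a sentinel for "not in cache", so that a cached None still counts as a hit. This is a sketch with hypothetical names, not necessarily how the fix is implemented:

```python
_MISSING = object()  # distinguishes "not cached" from a cached None


def cached_lookup(cache, token, fetch):
    """Return the cached response for `token`, fetching it only once."""
    cached = cache.get(token, _MISSING)
    if cached is not _MISSING:
        # a cached None (failed auth) is still a cache hit
        return cached
    result = fetch(token)
    cache[token] = result
    return result
```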
Hoist admin status determination from authentication to a secondary function called by get_authenticated_user
Create mock objects for struct_group and struct_passwd, migrate existing mock group objects to it
Remove old admin mock stuff for authenticate
- trust subdomain_host by default
- JupyterHub.trusted_alt_names is inherited by Spawners by default. Do we need Spawner.ssl_alt_names to be separately configurable?
One of the examples was using quotes instead of backticks.
Backticks are the "older" way of doing things, which has a number of
disadvantages:
http://mywiki.wooledge.org/BashFAQ/082
Here I'm more worried about readability: depending on the font and on
"smart" editors helping on the web, many people may confuse ` with ', and it
could end up modifying formatting on markdown-powered websites, etc.
jupyterhub.authenticators for authenticators, jupyterhub.spawners for spawners
This has the effect that authenticators and spawners can be selected by name instead of full import string (e.g. 'github' or 'dummy' or 'kubernetes')
and, perhaps more importantly, the autogenerated configuration file will include a section for each installed and registered class.
- Expands the previous documentation on upgrading JupyterHub
to include more information.
- Remove specific documentation on 0.7 -> 0.8 upgrade, since
this seems to be a straight copy of the markdown version of
upgrading docs. The important thing about the 0.7 -> 0.8 upgrade
(requiring versions of JupyterHub to match) is now in the
main document.
- Move from markdown to rst
Info on upgrading is important & relevant. This consolidates
the index to be a bit better. Next step is to consolidate the
documentation into one page.
Removes the 'tutorials' index page as well, since that only
had a reference to z2jh (which is now referenced from the
'distribution' section). The distribution section has
better visibility too
Currently, the sections in index.rst are using ** for bold,
rather than true section headers. This prevents them from being
linkable. Since we'd like to link to the 'contributing' section
from CONTRIBUTING.md, we change this by moving everything to
section headers. We also move to the toctree directive, since
it keeps the bullets aligned properly (they were hanging if
we used simple * markers)
This also replaces CONTRIBUTING.md content with a link to
the docs.
- Move from CONTRIBUTING.md to a subdirectory in docs, so
we can expand and add more documentation.
- Move from markdown to reStructuredText
- Add a direct blurb in the JupyterHub docs index page on
how to contribute.
- More prominent link to the Code of Conduct
- Add section on getting in touch with the JupyterHub community
define some pending/ready helpers as static constants on orm.Spawner
allows treating orm.Spawner the same as Spawner wrappers,
as long as `.active` etc. checks are performed first
and generate no events if not pending
Reason: race condition is unavoidable between first pending check and check inside _generate_progress.
In this event, return immediately.
The current list in the docs is out of date. The list
in the wiki is more up-to-date, and easier for folks
to change over time. In the long run, we should decide
where lists like this belong.
- delete oauth clients for servers when they shutdown
- avoid deleting oauth clients for servers still running across an 0.8 -> 0.9 upgrade, when the oauth client ids changed from `user-NAME` to `jupyterhub-user-NAME`
- refresh_user may return True in the common case, identifying that everything is up-to-date
- return False for "needs login"
- return auth_data dict when an update can be performed without logging in again
- `.get_current_user` is called in the `prepare` stage for all handlers
- use `.current_user` to access current user in methods
- adds Authenticator.refresh_user for refreshing user auth (unused at this point)
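The three return values described above can be sketched as a standalone coroutine. This is not the Authenticator method itself: the `AUTH_EXPIRY` store, the `can_refresh_silently` flag, and the one-hour lifetime are assumptions made for illustration.

```python
import asyncio
import time

# Hypothetical in-memory record of when each user's auth data expires.
AUTH_EXPIRY = {}


async def refresh_user(username, can_refresh_silently, now=None):
    """Return True, False, or an auth_data dict, following the contract above."""
    now = time.time() if now is None else now
    if AUTH_EXPIRY.get(username, 0) > now:
        # Common case: everything is up to date.
        return True
    if can_refresh_silently:
        # An update can be performed without logging in again.
        AUTH_EXPIRY[username] = now + 3600
        return {"name": username, "auth_state": {"expires_at": now + 3600}}
    # The user needs to log in again.
    return False


print(asyncio.run(refresh_user("alice", can_refresh_silently=False)))
```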
With changes to CHP requiring a second, different
authority, the complexity of managing trust within
JupyterHub has risen. To solve this, Certipy now
has a feature to specify what components should
trust what and builds trust bundles accordingly.
Mainly small fixes, but the token page could be completely broken
This release will include the spawner.handler addition,
but not the oauthlib change currently in master
To better accommodate external certificate management
as well as building of trust, Certipy was refactored.
This included general improvements to file and
record handling. In the process, some of Certipy's
APIs changed slightly, but should be more stable now
going forward.
Setup general ssl request, not just to api
Basic tests comprised of non-ssl test copies
Create the context only when request is http
Refactor ssl key, cert, ca names
Configure the AsyncHTTPClient at app start
Change tests to import existing ones with ssl on
Override __new__ in MockHub to turn on SSL
Add Localhost to trusted alt names
Update to match refactored certipy names
Add the FQDN to cert alt names for hub
Ensure notebooks do not trust each other
Drop certs in user's home directory
Refactor cert creation and movement
Make alt names configurable
Make attaching alt names more generic
Setup ssl_context for the singleuser hub check
If you are reporting an issue with JupyterHub, please use the [GitHub issue](https://github.com/jupyterhub/jupyterhub/issues) search feature to check if your issue has been asked already. If it has, please add your comments to the existing issue.
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]
**Additional context**
Add any other context about the problem here.
- Running `jupyter troubleshoot` from the command line, if possible, and posting
its output would also be helpful.
- Running in `--debug` mode can also be helpful for troubleshooting.
Welcome! As a [Jupyter](https://jupyter.org) project, we follow the [Jupyter contributor guide](https://jupyter.readthedocs.io/en/latest/contributor/content-contributor.html).
Welcome! As a [Jupyter](https://jupyter.org) project,
you can follow the [Jupyter contributor guide](https://jupyter.readthedocs.io/en/latest/contributing/content-contributor.html).
Make sure to also follow [Project Jupyter's Code of Conduct](https://github.com/jupyter/governance/blob/HEAD/conduct/code_of_conduct.md)
for a friendly and welcoming collaborative environment.
## Set up your development system
## Setting up a development environment
For a development install, clone the [repository](https://github.com/jupyterhub/jupyterhub)
Please note that this repository is participating in a study into the sustainability of open source projects. Data will be gathered about this repository for approximately the next 12 months, starting from 2021-06-11.
Data collected will include the number of contributors, number of PRs, time taken to close/merge these PRs, and issues closed.
For more information, please visit
[our informational page](https://sustainable-open-science-and-software.github.io/) or download our [participant information sheet](https://sustainable-open-science-and-software.github.io/assets/PIS_sustainable_software.pdf).
The same JupyterHub API spec, as found here, is available in an interactive form
`here (on swagger's petstore) <http://petstore.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/master/docs/rest-api.yml#!/default>`__.
`here (on swagger's petstore) <https://petstore3.swagger.io/?url=https://raw.githubusercontent.com/jupyterhub/jupyterhub/HEAD/docs/rest-api.yml#!/default>`__.
The `OpenAPI Initiative`_ (fka Swagger™) is a project used to describe
JupyterHub can be configured to record structured events from a running server using Jupyter's `Telemetry System`_. The types of events that JupyterHub emits are defined by `JSON schemas`_ listed at the bottom of this page_.
- [Press release on Jupyter and Cori](http://www.nersc.gov/news-publications/nersc-news/nersc-center-news/2016/jupyter-notebooks-will-open-up-new-possibilities-on-nerscs-cori-supercomputer/)
- [Moving and sharing data](https://www.nersc.gov/assets/Uploads/03-MovingAndSharingData-Cholia.pdf)
@@ -28,7 +30,7 @@ Please submit pull requests to update information or to add new institutions or
### University of California Davis
- [Spinning up multiple Jupyter Notebooks on AWS for a tutorial](https://github.com/mblmicdiv/course2017/blob/master/exercises/sourmash-setup.md)
- [Spinning up multiple Jupyter Notebooks on AWS for a tutorial](https://github.com/mblmicdiv/course2017/blob/HEAD/exercises/sourmash-setup.md)
Although not technically a JupyterHub deployment, this tutorial setup
may be helpful to others in the Jupyter community.
@@ -59,6 +61,13 @@ easy to do with RStudio too.
- [jupyterhub-deploy-teaching](https://github.com/jupyterhub/jupyterhub-deploy-teaching) based on work by Brian Granger for Cal Poly's Data Science 301 Course
### Chameleon
[Chameleon](https://www.chameleoncloud.org) is an NSF-funded configurable experimental environment for large-scale computer science systems research with [bare metal reconfigurability](https://chameleoncloud.readthedocs.io/en/latest/technical/baremetal.html). Chameleon users utilize JupyterHub to document and reproduce their complex CISE and networking experiments.
- [Shared JupyterHub](https://jupyter.chameleoncloud.org): provides a common "workbench" environment for any Chameleon user.
- [Trovi](https://www.chameleoncloud.org/experiment/share): a sharing portal of experiments, tutorials, and examples, which users can launch as dedicated isolated environments on Chameleon's JupyterHub.
### Clemson University
- Advanced Computing
@@ -67,6 +76,7 @@ easy to do with RStudio too.
### University of Colorado Boulder
- (CU Research Computing) CURC
- [JupyterHub User Guide](https://www.rc.colorado.edu/support/user-guide/jupyterhub.html)
- Slurm job dispatched on Crestone compute cluster
- log troubleshooting
@@ -77,16 +87,25 @@ easy to do with RStudio too.
- Earth Lab at CU
- [Tutorial on Parallel R on JupyterHub](https://earthdatascience.org/tutorials/parallel-r-on-jupyterhub/)
### George Washington University
- [Jupyter Hub](http://go.gwu.edu/jupyter) with university single-sign-on. Deployed early 2017.
### HTCondor
- [HTCondor Python Bindings Tutorial from HTCondor Week 2017 includes information on their JupyterHub tutorials](https://research.cs.wisc.edu/htcondor/HTCondorWeek2017/presentations/TueBockelman_Python.pdf)
- [nbgraderutils](https://github.com/dice-group/nbgraderutils): Use JupyterHub + nbgrader + iJava kernel for online Java exercises. Used in lecture Statistical Natural Language Processing.
### Penn State University
- [Press release](https://news.psu.edu/story/523093/2018/05/24/new-open-source-web-apps-available-students-and-faculty): "New open-source web apps available for students and faculty" (but Hub is currently down; checked 04/26/19)
- [Deploy JupyterHub on a Supercomputer with SSH](https://zonca.github.io/2017/05/jupyterhub-hpc-batchspawner-ssh.html)
- [Run Jupyterhub on a Supercomputer](https://zonca.github.io/2015/04/jupyterhub-hpc.html)
- [Deploy JupyterHub on a VM for a Workshop](https://zonca.github.io/2016/04/jupyterhub-sdsc-cloud.html)
@@ -124,7 +153,10 @@ easy to do with RStudio too.
- Kristen Thyng - Oceanography
- [Teaching with JupyterHub and nbgrader](http://kristenthyng.com/blog/2016/09/07/jupyterhub+nbgrader/)
### Elucidata
- What's new in Jupyter Notebooks @[Elucidata](https://elucidata.io/):
- Using Jupyter Notebooks with Jupyterhub on GCP, managed by GKE - https://medium.com/elucidata/why-you-should-be-using-a-jupyter-notebook-8385a4ccd93d
## Service Providers
@@ -141,7 +173,6 @@ easy to do with RStudio too.
[Everware](https://github.com/everware) Reproducible and reusable science powered by jupyterhub and docker. Like nbviewer, but executable. CERN, Geneva [website](http://everware.xyz/)
In short, where you see `/user/name/notebooks/foo.ipynb` use `/hub/user-redirect/notebooks/foo.ipynb` (replace `/user/name` with `/hub/user-redirect`).
Sharing links to notebooks is a common activity,
and can look different based on what you mean.
Your first instinct might be to copy the URL you see in the browser,
e.g. `hub.jupyter.org/user/yourname/notebooks/coolthing.ipynb`.
However, let's break down what this URL means:
`hub.jupyter.org/user/yourname/` is the URL prefix handled by _your server_,
which means that sharing this URL is asking the person you share the link with
to come to _your server_ and look at the exact same file.
In most circumstances, this is forbidden by permissions because the person you share with does not have access to your server.
What actually happens when someone visits this URL will depend on whether your server is running and other factors.
But what is our actual goal?
A typical situation is that you have some shared or common filesystem,
such that the same path corresponds to the same document
(either the exact same document or another copy of it).
Typically, what folks want when they do sharing like this
is for each visitor to open the same file _on their own server_,
so Breq would open `/user/breq/notebooks/foo.ipynb` and
Seivarden would open `/user/seivarden/notebooks/foo.ipynb`, etc.
JupyterHub has a special URL that does exactly this!
It's called `/hub/user-redirect/...`.
So if you replace `/user/yourname` in your URL bar
with `/hub/user-redirect` any visitor should get the same
URL on their own server, rather than visiting yours.
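The replacement described above can be sketched as a small path rewrite. The function name is hypothetical, and the sketch assumes the Hub is served at the root of the domain (no extra base URL prefix).

```python
import re


def to_user_redirect(path):
    """Rewrite /user/<name>/<rest> to /hub/user-redirect/<rest>."""
    match = re.match(r"^/user/[^/]+(/.*)?$", path)
    if match is None:
        # Not a single-user server path; leave it unchanged.
        return path
    return "/hub/user-redirect" + (match.group(1) or "/")


print(to_user_redirect("/user/yourname/notebooks/coolthing.ipynb"))
```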
In JupyterLab 2.0, this should also be the result of the "Copy Shareable Link"
2. If you need to allow for even more users, a dynamic number of servers can be used on a cloud;
take a look at `Zero to JupyterHub with Kubernetes <https://github.com/jupyterhub/zero-to-jupyterhub-k8s>`__.
Four subsystems make up JupyterHub:
* a **Hub** (tornado process) that is the heart of JupyterHub
* a **configurable http proxy** (node-http-proxy) that receives the requests from the client's browser
* multiple **single-user Jupyter notebook servers** (Python/IPython/tornado) that are monitored by Spawners
* an **authentication class** that manages how users can access the system
Besides these central pieces, you can add optional configurations through a `config.py` file and manage users' kernels on an admin panel. A simplification of the whole system can be seen in the figure below:
Role Based Access Control (RBAC) in JupyterHub serves to provide fine-grained control of access to JupyterHub's API resources.
RBAC is new in JupyterHub 2.0.
## Motivation
The JupyterHub API requires authorization to access its APIs.
This ensures that an arbitrary user, or even an unauthenticated third party, is not allowed to perform such actions.
For instance, the behaviour prior to adoption of RBAC is that creating or deleting users requires _admin rights_.
The prior system is functional, but lacks flexibility. If your Hub serves a number of users in different groups, you might want to delegate permissions to other users or automate certain processes.
Prior to RBAC, appointing a 'group-only admin' or a bot that culls idle servers required granting full admin rights to all actions. This posed a risk of the user or service intentionally or unintentionally accessing and modifying any data within the Hub, and violated the [principle of least privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege).
To remedy situations like this, JupyterHub is transitioning to an RBAC system. By equipping users, groups and services with _roles_ that supply them with a collection of permissions (_scopes_), administrators are able to fine-tune which parties are granted access to which resources.
## Definitions
**Scopes** are specific permissions used to evaluate API requests. For example: the API endpoint `users/servers`, which enables starting or stopping user servers, is guarded by the scope `servers`.
Scopes are not directly assigned to requesters. Rather, when a client performs an API call, their access will be evaluated based on their assigned roles.
**Roles** are collections of scopes that specify the level of what a client is allowed to do. For example, a group administrator may be granted permission to control the servers of group members, but not to create, modify or delete group members themselves.
Within the RBAC framework, this is achieved by assigning a role to the administrator that covers exactly those privileges.
JupyterHub provides four roles that are available by default:
```{admonition} **Default roles**
- `user` role provides a {ref}`default user scope <default-user-scope-target>` `self` that grants access to the user's own resources.
- `admin` role contains all available scopes and grants full rights to all actions. This role **cannot be edited**.
- `token` role provides a {ref}`default token scope <default-token-scope-target>` `all` that resolves to the same permissions as the owner of the token has.
- `server` role allows for posting activity of "itself" only.
**These roles cannot be deleted.**
```
These default roles have a default collection of scopes,
but you can define the scopes associated with each role (excluding admin) to suit your needs,
as seen [below](overriding-default-roles).
The `user`, `admin`, and `token` roles by default all preserve the permissions prior to RBAC.
Only the `server` role is changed from pre-2.0, to reduce its permissions to activity-only
instead of the default of a full access token.
Additional custom roles can also be defined (see {ref}`define-role-target`).
Roles can be assigned to the following entities:
- Users
- Services
- Groups
- Tokens
An entity can have zero, one, or multiple roles, and there are no restrictions on which roles can be assigned to which entity. Roles can be added to or removed from entities at any time.
**Users** \
When a new user gets created, they are assigned their default role `user`. Additionally, if the user is created with admin privileges (via `c.Authenticator.admin_users` in `jupyterhub_config.py` or `admin: true` via API), they will also be granted the `admin` role. If an existing user's admin status changes via API or `jupyterhub_config.py`, their default role will be updated accordingly (after the next startup for the latter).
**Services** \
Services do not have a default role. Services without roles have no access to the guarded API end-points, so most services will require assignment of a role in order to function.
**Groups** \
A group does not require any role, and has no roles by default. If a user is a member of a group, they automatically inherit any of the group's permissions (see {ref}`resolving-roles-scopes-target` for more details). This is useful for assigning a set of common permissions to several users.
**Tokens** \
A token’s permissions are evaluated based on their owning entity. Since a token is always issued for a user or service, it can never have more permissions than its owner. If no specific role is requested for a new token, the token is assigned the `token` role.
(define-role-target)=
## Defining Roles
Roles can be defined or modified in the configuration file as a list of dictionaries. An example:
% TODO: think about loading users into roles if membership has been changed via API.
% What should be the result?
```python
# in jupyterhub_config.py
c.JupyterHub.load_roles = [
    {
        'name': 'server-rights',
        'description': 'Allows parties to start and stop user servers',
        'scopes': ['servers'],
        'users': ['alice', 'bob'],
        'services': ['idle-culler'],
        'groups': ['admin-group'],
    }
]
```
The role `server-rights` now allows the starting and stopping of servers by any of the following:
- users `alice` and `bob`
- the service `idle-culler`
- any member of the `admin-group`.
```{attention}
Tokens cannot be assigned roles through role definition but may be assigned specific roles when requested via API (see {ref}`requesting-api-token-target`).
```
Another example:
```python
# in jupyterhub_config.py
c.JupyterHub.load_roles = [
    {
        'description': 'Read-only user models',
        'name': 'reader',
        'scopes': ['read:users'],
        'services': ['external'],
        'users': ['maria', 'joe'],
    }
]
```
The role `reader` allows users `maria` and `joe` and service `external` to read (but not modify) any user’s model.
```{admonition} Requirements
:class: warning
In a role definition, the `name` field is required, while all other fields are optional.\
**Role names must:**
- be 3 - 255 characters
- use ascii lowercase, numbers, 'unreserved' URL punctuation `-_.~`
- start with a letter
- end with letter or number.
`users`, `services`, and `groups` only accept objects that already exist in the database or are defined previously in the file.
It is not possible to implicitly add a new user to the database by defining a new role.
```
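The role name requirements listed above can be expressed as a single regular expression. This is a sketch derived from those rules, not necessarily the check JupyterHub performs internally; the names `ROLE_NAME_PATTERN` and `is_valid_role_name` are illustrative.

```python
import re

# One leading letter, 1 to 253 middle characters, one trailing letter or digit:
# 3 to 255 characters in total.
ROLE_NAME_PATTERN = re.compile(r"[a-z][a-z0-9\-_~.]{1,253}[a-z0-9]")


def is_valid_role_name(name):
    """Check a role name against the requirements listed above."""
    return ROLE_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["server-rights", "ab", "1reader", "reader-"]:
    print(candidate, is_valid_role_name(candidate))
```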
If no scopes are defined for a new role, JupyterHub will raise a warning. Providing non-existing scopes will result in an error.
If a role with a certain name already exists in the database, its definition and scopes will be overwritten. This holds true for all roles except the `admin` role, which cannot be overwritten; an error will be raised if trying to do so. The permissions of all role bearers present in the definition will change accordingly.
(overriding-default-roles)=
### Overriding default roles
Role definitions can include those of the "default" roles listed above (admin excluded),
if the default scopes associated with those roles do not suit your deployment.
For example, to specify what permissions the $JUPYTERHUB_API_TOKEN issued to all single-user servers
has,
define the `server` role.
To restore the JupyterHub 1.x behavior of servers being able to do anything their owners can do,
use the scope `all`:
```python
c.JupyterHub.load_roles = [
    {
        'name': 'server',
        'scopes': ['all'],
    }
]
```
or, better yet, identify the specific [scopes][] you want server environments to have access to.
[scopes]: available-scopes-target
If you don't want to get too detailed,
one option is the `self` scope,
which will have no effect on non-admin users,
but will restrict the token issued to admin user servers to only have access to their own resources,
instead of being able to take actions on behalf of all other users.
```python
c.JupyterHub.load_roles = [
    {
        'name': 'server',
        'scopes': ['self'],
    }
]
```
(removing-roles-target)=
## Removing roles
Only the entities present in the role definition in the `jupyterhub_config.py` remain the role bearers. If a user, service or group is removed from the role definition, they will lose the role on the next startup.
Once a role is loaded, it remains in the database until it is removed from `jupyterhub_config.py` and the Hub is restarted. All previously defined role bearers will then lose the role and associated permissions. Default roles, even if previously redefined through the config file and removed, will not be deleted from the database.
A scope has a syntax-based design that reveals which resources it provides access to. Resources are objects with a type, associated data, relationships to other resources, and a set of methods that operate on them (see [RESTful API](https://restful-api-design.readthedocs.io/en/latest/resources.html) documentation for more information).
`<resource>` in the RBAC scope design refers to the resource name in the [JupyterHub's API](../reference/rest-api.rst) endpoints in most cases. For instance, `<resource>` equal to `users` corresponds to JupyterHub's API endpoints beginning with _/users_.
(scope-conventions-target)=
## Scope conventions
- `<resource>` \
  The top-level `<resource>` scopes, such as `users` or `groups`, grant read, write, and list permissions to the resource itself as well as its sub-resources. For example, the scope `users:activity` is included in the scope `users`.
- `read:<resource>` \
  Limits permissions to read-only operations on single resources.
- `list:<resource>` \
  Read-only access to listing endpoints.
  Use `read:<resource>:<subresource>` to control what fields are returned.
- `admin:<resource>` \
  Grants additional permissions such as create/delete on the corresponding resource in addition to read and write permissions.
- `access:<resource>` \
  Grants access permissions to the `<resource>` via API or browser.
- `<resource>:<subresource>` \
  The {ref}`vertically filtered <vertical-filtering-target>` scopes provide access to a subset of the information granted by the `<resource>` scope. E.g., the scope `users:activity` only provides permission to post user activity.
- `<resource>!<object>=<objectname>` \
  {ref}`horizontal-filtering-target` is implemented by the `!<object>=<objectname>` scope structure. A resource (or sub-resource) can be filtered based on `user`, `server`, `group` or `service` name. For instance, `<resource>!user=charlie` limits access to only return resources of user `charlie`. \
  Only one filter per scope is allowed, but filters for the same scope have an additive effect; a larger filter can be used by supplying the scope multiple times with different filters.
By adding a scope to an existing role, all role bearers will gain the associated permissions.
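To make the scope syntax concrete, the sketch below splits a scope string into its name and filter, then merges several filtered scopes to show the additive effect. The function names are hypothetical and this is not JupyterHub's own parser.

```python
def parse_scope(scope):
    """Split 'name!filter=value' into (name, filter_type, filter_value)."""
    name, _, filter_part = scope.partition("!")
    if not filter_part:
        return name, None, None
    filter_type, _, filter_value = filter_part.partition("=")
    return name, filter_type, filter_value or None


def merge_filters(scopes):
    """Combine filters per scope name; None means unfiltered access."""
    merged = {}
    for scope in scopes:
        name, filter_type, filter_value = parse_scope(scope)
        if filter_type is None:
            merged[name] = None  # unfiltered access supersedes any filter
        elif merged.get(name, {}) is not None:
            merged.setdefault(name, {}).setdefault(filter_type, []).append(filter_value)
    return merged


print(merge_filters(["read:users!user=hannah", "read:users!user=ivan"]))
```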
## Metascopes
Metascopes do not follow the general scope syntax. Instead, a metascope resolves to a set of scopes, which can refer to different resources, based on their owning entity. In JupyterHub, there are currently two metascopes:
1. default user scope `self`, and
2. default token scope `all`.
(default-user-scope-target)=
### Default user scope
Access to the user's own resources and subresources is covered by metascope `self`. This metascope includes the user's model, activity, servers and tokens. For example, `self` for a user named "gerard" includes:
- `users!user=gerard` where the `users` scope provides access to the full user model and activity. The filter restricts this access to the user's own resources.
- `servers!user=gerard` which grants the user access to their own servers without being able to create/delete any.
- `tokens!user=gerard` which allows the user to access, request and delete their own tokens.
- `access:servers!user=gerard` which allows the user to access their own servers via API or browser.
The `self` scope is only valid for user entities. In other cases (e.g., for services) it resolves to an empty set of scopes.
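The expansion of `self` listed above can be sketched as a small function. The `owner_kind` parameter is a hypothetical way of distinguishing users from other entities; JupyterHub's internal expansion may differ in detail.

```python
def expand_self(owner_name, owner_kind="user"):
    """Expand the `self` metascope into the filtered scopes listed above."""
    if owner_kind != "user":
        # `self` is only meaningful for users; other entities get nothing.
        return []
    return [
        f"users!user={owner_name}",
        f"servers!user={owner_name}",
        f"tokens!user={owner_name}",
        f"access:servers!user={owner_name}",
    ]


print(expand_self("gerard"))
```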
(default-token-scope-target)=
### Default token scope
The token metascope `all` covers the same scopes as the token owner's scopes during requests. For example, if a token owner has roles containing the scopes `read:groups` and `read:users`, the `all` scope resolves to the set of scopes `{read:groups, read:users}`.
If the token owner has default `user` role, the `all` scope resolves to `self`, which will subsequently be expanded to include all the user-specific scopes (or empty set in the case of services).
If the token owner is a member of any group with roles, the group scopes will also be included in resolving the `all` scope.
(horizontal-filtering-target)=
## Horizontal filtering
Horizontal filtering, also called _resource filtering_, is the concept of reducing the payload of an API call to cover only the subset of the _resources_ that the scopes of the client provide them access to.
Requested resources are filtered based on the filter of the corresponding scope. For instance, if a service requests a user list (guarded with scope `read:users`) with a role that only contains scopes `read:users!user=hannah` and `read:users!user=ivan`, the returned list of user models will be the intersection of all users and the collection `{hannah, ivan}`. If this intersection is empty, the API call returns an HTTP 404 error, regardless of whether any users exist outside of the client's scope filter collection.
In case a user resource is being accessed, any scopes with _group_ filters will be expanded to filters for each _user_ in those groups.
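A minimal sketch of horizontal filtering for the user-list example, assuming user models are plain dictionaries with a `name` key and that only `!user=` filters are present. The function name is hypothetical, and group filter expansion is left out.

```python
def filter_users(all_users, scopes, required="read:users"):
    """Return only the user models that the client's scopes give access to."""
    allowed = set()
    for scope in scopes:
        name, _, filter_part = scope.partition("!")
        if name != required:
            continue
        if not filter_part:
            # Unfiltered scope: every user is visible.
            return list(all_users)
        kind, _, value = filter_part.partition("=")
        if kind == "user":
            allowed.add(value)
    return [user for user in all_users if user["name"] in allowed]


users = [{"name": "hannah"}, {"name": "ivan"}, {"name": "zoe"}]
print(filter_users(users, ["read:users!user=hannah", "read:users!user=ivan"]))
```

An empty result here corresponds to the case where the text above says the API call returns a 404.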
### `!user` filter
The `!user` filter is a special horizontal filter that strictly refers to the **"owner only"** scopes, where _owner_ is a user entity. The filter resolves internally into `!user=<ownerusername>` ensuring that only the owner's resources may be accessed through the associated scopes.
For example, the `server` role assigned by default to server tokens contains `access:servers!user` and `users:activity!user` scopes. This allows the token to access and post activity of only the servers owned by the token owner.
The filter can be applied to any scope.
(vertical-filtering-target)=
## Vertical filtering
Vertical filtering, also called _attribute filtering_, is the concept of reducing the payload of an API call to cover only the _attributes_ of the resources that the scopes of the client provide them access to. This occurs when the client scopes are subscopes of the API endpoint that is called.
For instance, if a client requests a user list with the only scope being `read:users:groups`, the returned list of user models will contain only a list of groups per user.
In case the client has multiple subscopes, the call returns the union of the data the client has access to.
The payload of an API call can be filtered both horizontally and vertically simultaneously. For instance, performing an API call to the endpoint `/users/` with the scope `users:name!user=juliette` returns a payload of `[{name: 'juliette'}]` (provided that this name is present in the database).
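Vertical filtering can be sketched as selecting fields from a model according to the subscopes held. The mapping from scopes to field names below is a hypothetical excerpt for illustration; the real user model and scope-to-field mapping are defined by JupyterHub.

```python
# Hypothetical mapping from read subscopes to the user-model fields they expose.
FIELDS_BY_SCOPE = {
    "read:users:name": {"name"},
    "read:users:groups": {"groups"},
    "read:users:activity": {"last_activity"},
}


def filter_attributes(user_model, scopes):
    """Keep only the fields covered by the union of the client's subscopes."""
    visible = set()
    for scope in scopes:
        visible |= FIELDS_BY_SCOPE.get(scope, set())
    return {key: value for key, value in user_model.items() if key in visible}


model = {"name": "juliette", "groups": ["science"], "last_activity": None, "admin": False}
print(filter_attributes(model, ["read:users:groups"]))
```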
(available-scopes-target)=
## Available scopes
The table below lists all available scopes and illustrates their hierarchy. Indented scopes indicate subscopes of the scope(s) above them.
There are four exceptions to the general {ref}`scope conventions <scope-conventions-target>`:
- `read:users:name` is a subscope of both `read:users` and `read:servers`. \
  The `read:servers` scope requires access to the user name (server owner) due to named servers distinguished internally in the form `!server=username/servername`.
- `read:users:activity` is a subscope of both `read:users` and `users:activity`. \
  Posting activity via the `users:activity`, which is not included in `users` scope, needs to check the last valid activity of the user.
- `read:roles:users` is a subscope of both `read:roles` and `admin:users`. \
  Admin privileges to the _users_ resource include the information about user roles.
- `read:roles:groups` is a subscope of both `read:roles` and `admin:groups`. \
  Similar to the `read:roles:users` above.
```{include} scope-table.md
```
```{Caution}
Note that only the {ref}`horizontal filtering <horizontal-filtering-target>` can be added to scopes to customize them. \
Metascopes `self` and `all`, `<resource>`, `<resource>:<subresource>`, `read:<resource>`, `admin:<resource>`, and `access:<resource>` scopes are predefined and cannot be changed otherwise.
```
### Scopes and APIs
The scopes are also listed in the [](../reference/rest-api.rst) documentation. Each API endpoint has a list of scopes which can be used to access the API; if no scopes are listed, the API is not authenticated and can be accessed without any permissions (i.e., no scopes).
The scopes listed for each API endpoint reflect the "lowest" permissions required to gain any access to the corresponding API. For example, posting a user's activity (_POST /users/:name/activity_) needs the `users:activity` scope. If the scope `users` is passed during the request, access will be granted because the required scope is a subscope of the `users` scope. If, on the other hand, the `read:users:activity` scope is passed, access will be denied.
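The access decision in this example can be sketched with a small parent-to-subscope table. The table is a hypothetical excerpt that mirrors only the example in this paragraph, not the full scope hierarchy, and the function names are illustrative.

```python
# Hypothetical excerpt of the hierarchy: scope -> direct subscopes.
SUBSCOPES = {
    "users": {"read:users", "users:activity"},
    "read:users": {"read:users:name", "read:users:activity"},
    "users:activity": {"read:users:activity"},
}


def expand(scope):
    """Return the scope together with all of its (transitive) subscopes."""
    expanded = {scope}
    for child in SUBSCOPES.get(scope, ()):
        expanded |= expand(child)
    return expanded


def has_access(required, held_scopes):
    """Access is granted if any held scope expands to include the required one."""
    return any(required in expand(held) for held in held_scopes)


print(has_access("users:activity", {"users"}))
print(has_access("users:activity", {"read:users:activity"}))
```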
Roles are stored in the database, where they are associated with users, services, etc., and can be added or modified as explained in {ref}`define-role-target` section. Users, services, groups, and tokens can gain, change, and lose roles. This is currently achieved via `jupyterhub_config.py` (see {ref}`define-role-target`) and will be made available via API in future. The latter will allow for changing a token's role, and thereby its permissions, without the need to issue a new token.
Roles and scopes utilities can be found in the `roles.py` and `scopes.py` modules. Scope variables take on five different formats, which are reflected throughout the utilities via specific nomenclature:
```{admonition} **Scope variable nomenclature**
:class: tip
- _scopes_ \
List of scopes with abbreviations (used in role definitions). E.g., `["users:activity!user"]`.
- _expanded scopes_ \
Set of expanded scopes without abbreviations (i.e., resolved metascopes, filters and subscopes). E.g., `{"users:activity!user=charlie", "read:users:activity!user=charlie"}`.
- _parsed scopes_ \
Dictionary (JSON-like) format of expanded scopes. E.g., `{"users:activity": {"user": ["charlie"]}, "read:users:activity": {"user": ["charlie"]}}`.
- _intersection_ \
Set of expanded scopes as intersection of 2 expanded scope sets.
- _identify scopes_ \
Set of expanded scopes needed for identify (whoami) endpoints.
```
(resolving-roles-scopes-target)=
## Resolving roles and scopes
**Resolving roles** refers to determining which roles a user, service, token, or group has, extracting the list of scopes from each role and combining them into a single set of scopes.
**Resolving scopes** involves expanding scopes into all their possible subscopes (_expanded scopes_), parsing them into format used for access evaluation (_parsed scopes_) and, if applicable, comparing two sets of scopes (_intersection_). All procedures take into account the scope hierarchy, {ref}`vertical <vertical-filtering-target>` and {ref}`horizontal filtering <horizontal-filtering-target>`, limiting or elevated permissions (`read:<resource>` or `admin:<resource>`, respectively), and metascopes.
Roles and scopes are resolved on several occasions, for example when requesting an API token with specific roles or making an API request. The following sections provide more details.
(requesting-api-token-target)=
### Requesting API token with specific roles
API tokens grant access to JupyterHub's APIs. The RBAC framework allows for requesting tokens with specific existing roles. To date, it is only possible to add roles to a token through the _POST /users/:name/tokens_ API where the roles can be specified in the token parameters body (see [](../reference/rest-api.rst)).
RBAC adds several steps into the token issue flow.
If no roles are requested, the token is issued with the default `token` role (provided the requester is allowed to create the token).
If the token is requested with any roles, the permissions of the requesting entity are checked against the requested permissions to ensure the token would not grant its owner additional privileges.
If, due to modifications of roles or entities, at API request time a token has any scopes that its owner does not, those scopes are removed. The API request is resolved without additional errors using the scopes _intersection_, but the Hub logs a warning (see {ref}`Figure 2 <api-request-chart>`).
Resolving a token's roles (yellow box in {ref}`Figure 1 <token-request-chart>`) corresponds to resolving all the token's owner roles (including the roles associated with their groups) and the token's requested roles into a set of scopes. The two sets are compared (Resolve the scopes box in orange in {ref}`Figure 1 <token-request-chart>`), taking into account the scope hierarchy but, solely for role assignment, omitting any {ref}`horizontal filter <horizontal-filtering-target>` comparison. If the token's scopes are a subset of the token owner's scopes, the token is issued with the requested roles; if not, JupyterHub will raise an error.
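
The comparison described above can be sketched as a subset check on expanded scopes. The `SUBSCOPES` table below is a hypothetical, heavily simplified hierarchy and `can_issue_token` is an illustrative name, not a JupyterHub function. Horizontal filters are dropped before comparing, mirroring the rule for role assignment.

```python
# Hypothetical, simplified scope hierarchy (not JupyterHub's full table)
SUBSCOPES = {
    "users": ["read:users", "users:activity"],
    "read:users": ["read:users:name", "read:users:activity"],
    "users:activity": ["read:users:activity"],
}


def expand(scopes):
    """Expand scopes into themselves plus all subscopes, dropping filters."""
    expanded = set()
    stack = [scope.partition("!")[0] for scope in scopes]
    while stack:
        scope = stack.pop()
        if scope not in expanded:
            expanded.add(scope)
            stack.extend(SUBSCOPES.get(scope, []))
    return expanded


def can_issue_token(owner_scopes, requested_scopes):
    """True if the requested scopes grant nothing beyond the owner's scopes."""
    return expand(requested_scopes) <= expand(owner_scopes)


print(can_issue_token({"users"}, {"read:users:name"}))
print(can_issue_token({"read:users:name"}, {"users"}))
```

In this sketch a `False` result corresponds to the case where JupyterHub raises an error instead of issuing the token.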
{ref}`Figure 1 <token-request-chart>` below illustrates the steps involved. The orange rectangles highlight where in the process the roles and scopes are resolved.
Figure 1. Resolving roles and scopes during API token request
### Making an API request
With the RBAC framework, each authenticated JupyterHub API request is guarded by a scope decorator that specifies which scopes are required to access the API.
When an API request is performed, the requesting API token's roles are again resolved (yellow box in {ref}`Figure 2 <api-request-chart>`) to ensure the token does not grant more permissions than its owner has at the request time (e.g., due to changing/losing roles).
If the owner's roles do not include some scopes of the token's scopes, only the _intersection_ of the token's and owner's scopes will be used. For example, using a token with scope `users` whose owner's role scope is `read:users:name` will result in only the `read:users:name` scope being passed on. In the case of no _intersection_, an empty set of scopes will be used.
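
The `users` versus `read:users:name` example can be reproduced with a small sketch. The hierarchy table is a hypothetical, simplified stand-in for the real one, `usable_scopes` is an illustrative name, and the sketch covers unfiltered scopes only.

```python
# Hypothetical, simplified scope hierarchy (not JupyterHub's full table)
SUBSCOPES = {
    "users": ["read:users", "users:activity"],
    "read:users": ["read:users:name", "read:users:activity"],
    "users:activity": ["read:users:activity"],
}


def expand(scopes):
    """Return the given scopes together with all of their subscopes."""
    expanded = set()
    stack = list(scopes)
    while stack:
        scope = stack.pop()
        if scope not in expanded:
            expanded.add(scope)
            stack.extend(SUBSCOPES.get(scope, []))
    return expanded


def usable_scopes(token_scopes, owner_scopes):
    """Scopes passed on with a request: what both token and owner hold."""
    return expand(token_scopes) & expand(owner_scopes)


print(usable_scopes({"users"}, {"read:users:name"}))
print(usable_scopes({"users:activity"}, {"read:users:name"}))
```

The first call yields only `read:users:name`, and the second yields an empty set because the two expanded sets share no scope.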
The passed scopes are compared to the scopes required to access the API as follows:
- if the API scopes are present within the set of passed scopes, access is granted and the API returns its "full" response
- if that is not the case, another check determines whether subscopes of the required API scopes can be found in the passed scope set:
  - if found, the RBAC framework employs the {ref}`filtering <vertical-filtering-target>` procedures to refine the API response to include only the resource attributes corresponding to the passed scopes. For example, providing a scope `read:users:activity!group=class-C` for the _GET /users_ API will return a list of user models from group `class-C` containing only the `last_activity` attribute for each user model
  - if not found, access to the API is denied
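
The decision described in the list above can be sketched as a three-way check. This is an illustration only: `check_access` is a hypothetical name, the `SUBSCOPES` table is simplified, and treating `read:users` as the scope required by _GET /users_ is an assumption. Horizontal filters are ignored when matching names, although they still limit which resources appear in the response.

```python
# Hypothetical, simplified hierarchy for one required scope
SUBSCOPES = {
    "read:users": ["read:users:name", "read:users:activity", "read:users:groups"],
}


def all_subscopes(scope):
    """Collect every subscope below the given scope."""
    found = set()
    stack = list(SUBSCOPES.get(scope, []))
    while stack:
        sub = stack.pop()
        if sub not in found:
            found.add(sub)
            stack.extend(SUBSCOPES.get(sub, []))
    return found


def check_access(required_scope, passed_scopes):
    """Return 'full', 'filtered' or 'denied' for the passed scopes."""
    names = {scope.partition("!")[0] for scope in passed_scopes}
    if required_scope in names:
        return "full"
    if all_subscopes(required_scope) & names:
        return "filtered"  # only attributes covered by the subscopes are returned
    return "denied"


print(check_access("read:users", {"read:users:activity!group=class-C"}))
```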
{ref}`Figure 2 <api-request-chart>` illustrates this process highlighting the steps where the role and scope resolutions as well as filtering occur in orange.
```{figure} ../images/rbac-api-request-chart.png
:align: center
:name: api-request-chart
Figure 2. Resolving roles and scopes when an API request is made
```
The RBAC framework requires a different database setup than previous JupyterHub versions because it eliminates the distinction between OAuth and API tokens (see {ref}`oauth-vs-api-tokens-target` for more details). This requires merging the two previously separate database tables into one. As a result, all tokens created before the upgrade no longer comply with the new database version and must be replaced.
This is achieved by the Hub deleting all existing tokens during the database upgrade and recreating the tokens loaded via the `jupyterhub_config.py` file with the updated structure. However, any manually issued or stored tokens are not recreated automatically and must be manually re-issued after the upgrade.
No other database records are affected.
(rbac-upgrade-steps-target)=
## Upgrade steps
1. All running **servers must be stopped** before proceeding with the upgrade.
2. To upgrade the Hub, follow the [Upgrading JupyterHub](../admin/upgrading.rst) instructions.
```{attention}
We advise against defining any new roles in the `jupyterhub_config.py` file right after the upgrade is completed and JupyterHub restarted for the first time. This preserves the 'current' state of the Hub. You can define and assign new roles on any subsequent startup.
```
3. After restarting the Hub **re-issue all tokens that were previously issued manually** (i.e., not through the `jupyterhub_config.py` file).
When JupyterHub is restarted for the first time after the upgrade, all users, services and tokens stored in the database or re-loaded through the configuration file will be assigned their default role. Any entities added after that will be assigned their default role only if no other specific role is requested for them.
## Changing the permissions after the upgrade
Once all the {ref}`upgrade steps <rbac-upgrade-steps-target>` above are completed, the RBAC framework is ready to use. You can define new roles, modify default roles (apart from `admin`) and assign them to entities as described in the {ref}`define-role-target` section.
We recommend the following procedure to start with RBAC:
1. Identify which admin users and services you would like to grant only the permissions they need through the new roles.
2. Strip these users and services of their admin status via API or UI. This will change their roles from `admin` to `user`.
```{note}
Stripping entities of their roles is currently available only via `jupyterhub_config.py` (see {ref}`removing-roles-target`).
```
3. Define new roles that you would like to start using with appropriate scopes and assign them to these entities in `jupyterhub_config.py`.
4. Restart JupyterHub for the new roles to take effect.
(oauth-vs-api-tokens-target)=
## OAuth vs API tokens
### Before RBAC
Previous JupyterHub versions use two types of tokens: OAuth tokens and API tokens.
An OAuth token is issued by the Hub to a single-user server when the user logs in. The token is stored in the browser cookie and is used to identify the user who owns the server during the OAuth flow. This token by default expires when the cookie reaches its expiry time of 2 weeks (or after 1 hour in JupyterHub versions < 1.3.0).
An API token is issued by the Hub to a single-user server when it is launched and is used to communicate with the Hub's APIs, such as posting activity or completing the OAuth flow. This token has no expiry by default.
API tokens can also be issued to users via API ([_/hub/token_](../reference/urls.md) or [_POST /users/:username/tokens_](../reference/rest-api.rst)) and services via `jupyterhub_config.py` to perform API requests.
### With RBAC
The RBAC framework allows for granting tokens different levels of permissions via scopes attached to roles. The 'only identify' purpose of the separate OAuth tokens is no longer required. API tokens can be used for every action, including login and authentication, for which an API token with no role (i.e., no scope in {ref}`available-scopes-target`) is used.
OAuth tokens are therefore dropped from the Hub upgraded with the RBAC framework.
Note that in the RBAC system the `admin` field in the `idle-culler` service definition is omitted. Instead, the `idle-culler` role provides the service with only the permissions it needs.
If the optional actions of deleting the idle servers and/or removing inactive users are desired, **change the following scopes** in the `idle-culler` role definition:
- `servers` to `admin:servers` for deleting servers
- `read:users:name`, `read:users:activity` to `admin:users` for deleting users.
3. Restart JupyterHub to complete the process.
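
As a minimal sketch of the scope changes listed above, the helper below rewrites an `idle-culler` role definition. The helper name `with_deletion_rights`, the role description, and the `services` entry are assumptions made for illustration; only the scope names come from the text.

```python
# Baseline role using the scopes named in the text above
idle_culler_role = {
    "name": "idle-culler",
    "description": "Culls idle servers",  # hypothetical description
    "scopes": ["read:users:name", "read:users:activity", "servers"],
    "services": ["idle-culler"],  # assumed service name
}


def with_deletion_rights(role, delete_servers=False, delete_users=False):
    """Return a copy of the role with scopes elevated for the optional actions."""
    scopes = list(role["scopes"])
    if delete_servers:
        # `servers` becomes `admin:servers`
        scopes = ["admin:servers" if s == "servers" else s for s in scopes]
    if delete_users:
        # the two read scopes are replaced by `admin:users`
        read_scopes = ("read:users:name", "read:users:activity")
        scopes = [s for s in scopes if s not in read_scopes]
        scopes.append("admin:users")
    return {**role, "scopes": scopes}


roles = [with_deletion_rights(idle_culler_role, delete_servers=True, delete_users=True)]
# In jupyterhub_config.py this list would be assigned to c.JupyterHub.load_roles
print(roles[0]["scopes"])
```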
## API launcher
A service capable of creating/removing users and launching multiple servers should have access to:
1. _POST_ and _DELETE /users_
2. _POST_ and _DELETE /users/:name/server_ or _/users/:name/servers/:server_name_
3. Creating/deleting servers
The scopes required to access the API endpoints:
1. `admin:users`
2. `servers`
3. `admin:servers`
From the above, the role definition is:
```python
# in jupyterhub_config.py
c.JupyterHub.load_roles = [
    {
        "name": "api-launcher",
        "description": "Manages servers",
        "scopes": ["admin:users", "admin:servers"],
        "services": ["<service_name>"],  # replace with the name of your service
    }
]
```
If needed, the scopes can be modified to limit the permissions to, e.g., a particular group with the `!group=groupname` filter.
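
A sketch of how such a filter could be appended to the role's scopes follows. The helper `limit_to_group` is a hypothetical name, and `groupname` and `<service_name>` are placeholders.

```python
def limit_to_group(scopes, group):
    """Append a horizontal `!group=` filter to each unfiltered scope."""
    return [f"{scope}!group={group}" for scope in scopes]


api_launcher_role = {
    "name": "api-launcher",
    "description": "Manages servers",
    "scopes": limit_to_group(["admin:users", "admin:servers"], "groupname"),
    "services": ["<service_name>"],  # placeholder
}
# In jupyterhub_config.py: c.JupyterHub.load_roles = [api_launcher_role]
print(api_launcher_role["scopes"])
```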
## Group admin roles
Roles can be used to specify different group member privileges.
For example, a group of students `class-A` may have a role allowing all group members to access information about their group. Teacher `johan`, who is a student of `class-A` but a teacher of another group of students `class-B`, can have an additional role permitting him to access information about `class-B` students as well as start/stop their servers.
The roles can then be defined as follows:
```python
# in jupyterhub_config.py
c.JupyterHub.load_groups = {
    'class-A': ['johan', 'student1', 'student2'],
    'class-B': ['student3', 'student4'],
}
c.JupyterHub.load_roles = [
    {
        'name': 'class-A-student',
        'description': 'Grants access to information about the group',
        'scopes': ['read:groups!group=class-A'],
        'groups': ['class-A'],
    },
    {
        'name': 'class-B-student',
        'description': 'Grants access to information about the group',
        'scopes': ['read:groups!group=class-B'],
        'groups': ['class-B'],
    },
    {
        'name': 'teacher',
        'description': 'Allows for accessing information about teacher group members and starting/stopping their servers',
        # ...
    },
]
```
In the above example, `johan` has the privileges inherited from the `class-A-student` role, with those of the `teacher` role on top.
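
One way the `teacher` role could be completed is sketched below. The scope names and the direct assignment to `johan` through a `users` key are assumptions chosen to match the role's description (reading information about `class-B` members and starting/stopping their servers); they are not taken from the configuration above.

```python
teacher_role = {
    "name": "teacher",
    "description": (
        "Allows for accessing information about teacher group members "
        "and starting/stopping their servers"
    ),
    # Assumed scopes: read user models and manage servers, both limited to class-B
    "scopes": ["read:users!group=class-B", "servers!group=class-B"],
    # Assumed assignment: the role is given directly to the user `johan`
    "users": ["johan"],
}
# In jupyterhub_config.py this dict would be one entry of c.JupyterHub.load_roles
print(teacher_role["scopes"])
```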
```{note}
The scope filters (`!group=`) limit the privileges only to the particular groups. `johan` can access the servers and information of `class-B` group members only.
```
There are two broad categories of user environments that depend on what
When JupyterHub uses **container-based** Spawners (e.g. KubeSpawner or
DockerSpawner), the 'system-wide' environment is really the container image
which you are using for users.
In both cases, you want to _avoid putting configuration in user home
directories_ because users can change those configuration settings. Also,
home directories typically persist once they are created, so they are
difficult for admins to update later.
## Named servers
By default, in a JupyterHub deployment each user has exactly one server.
JupyterHub can, however, have multiple servers per user.
This is most useful in deployments where users can configure the environment
in which their server will start (e.g. resource requests on an HPC cluster),
so that a given user can have multiple configurations running at the same time,
without having to stop and restart their one server.
To allow named servers:
```python
c.JupyterHub.allow_named_servers = True
```
Named servers were implemented in the REST API in JupyterHub 0.8,
and JupyterHub 1.0 introduces a UI for managing named servers via the user home page
as well as the admin page.
Named servers can be accessed, created, started, stopped, and deleted
from these pages. Activity tracking is now per-server as well.
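
Outside the UI, a named server can also be started or stopped through the REST API. The sketch below only builds the request and does not send it. The Hub API URL, the user and server names, the token placeholder, and the `token` authorization scheme are assumptions for illustration, and `named_server_request` is a hypothetical helper.

```python
from urllib.parse import quote
from urllib.request import Request


def named_server_request(hub_api_url, user, server_name, token, method="POST"):
    """Build a request for /users/:name/servers/:server_name.

    POST starts the named server and DELETE stops it.
    """
    url = (
        f"{hub_api_url}/users/{quote(user, safe='')}"
        f"/servers/{quote(server_name, safe='')}"
    )
    return Request(url, method=method, headers={"Authorization": f"token {token}"})


# Assumed values; replace them with those of your deployment
req = named_server_request(
    "http://127.0.0.1:8081/hub/api", "charlie", "gpu-job", "<api-token>"
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to the Hub.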
The number of named servers per user can be limited by setting
```python
c.JupyterHub.named_server_limit_per_user = 5
```
## Switching to Jupyter Server
[Jupyter Server](https://jupyter-server.readthedocs.io/en/latest/) is a new Tornado Server backend for Jupyter web applications (e.g. JupyterLab 3.0 uses this package as its default backend).
By default, the single-user notebook server uses the (old) `NotebookApp` from the [notebook](https://github.com/jupyter/notebook) package. You can switch to using Jupyter Server's `ServerApp` backend (this will likely become the default in future releases) by setting the `JUPYTERHUB_SINGLEUSER_APP` environment variable to:
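
The import path of `ServerApp` is `jupyter_server.serverapp.ServerApp`. As a minimal sketch, the variable can be passed to single-user servers from `jupyterhub_config.py`. Setting it through `c.Spawner.environment` is one possible approach and an assumption here; exporting it in the environment that launches the single-user server is another.

```python
# Environment entry selecting Jupyter Server's ServerApp as the single-user backend
singleuser_env = {
    "JUPYTERHUB_SINGLEUSER_APP": "jupyter_server.serverapp.ServerApp",
}

# In jupyterhub_config.py, one option (an assumption, adapt to your Spawner):
# c.Spawner.environment = singleuser_env
print(singleuser_env)
```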
This section covers more of the details of the JupyterHub architecture, as well as
what happens under-the-hood when you deploy and configure your JupyterHub.
.. toctree::
   :maxdepth: 2

   technical-overview
   urls
   websecurity
   authenticators
   spawners
   services
   proxy
   separate-proxy
   rest
   server-api
   monitoring
   database
   upgrading
   templates
   ../events/index
   config-user-env
   config-examples
   config-ghoauth
   config-proxy
   config-sudo
   config-reference
   oauth