forward-port 4.1.0

This commit is contained in:
Min RK
2024-03-19 12:55:10 +01:00
parent 970693ef46
commit 83ce6d3f6b
20 changed files with 1122 additions and 136 deletions

View File

@@ -103,6 +103,9 @@ jobs:
subset: singleuser
- python: "3.11"
browser: browser
- python: "3.11"
subdomain: subdomain
browser: browser
- python: "3.12"
main_dependencies: main_dependencies

View File

@@ -16,7 +16,8 @@ works.
JupyterHub is designed to be a _simple multi-user server for modestly sized
groups_ of **semi-trusted** users. While the design reflects serving
semi-trusted users, JupyterHub can also be suitable for serving **untrusted** users.
semi-trusted users, JupyterHub can also be suitable for serving **untrusted** users,
but **is not suitable for untrusted users** in its default configuration.
As a result, using JupyterHub with **untrusted** users means more work by the
administrator, since much care is required to secure a Hub, with extra caution on
@@ -56,30 +57,63 @@ ensure that:
If any additional services are run on the same domain as the Hub, the services
**must never** display user-authored HTML that is neither _sanitized_ nor _sandboxed_
(e.g. IFramed) to any user that lacks authentication as the author of a file.
to any user that lacks authentication as the author of a file.
### Sharing access to servers
Because sharing access to servers (via `access:servers` scopes or the sharing feature in JupyterHub 5) by definition means users can serve each other files, enabling sharing is not suitable for untrusted users without also enabling per-user domains.
JupyterHub does not enable any sharing by default.
## Mitigate security issues
Several approaches to mitigating security issues with configuration
options provided by JupyterHub include:
### Enable subdomains
### Enable user subdomains
JupyterHub provides the ability to run single-user servers on their own
subdomains. This means the cross-origin protections between servers has the
desired effect, and user servers and the Hub are protected from each other. A
user's single-user server will be at `username.jupyter.mydomain.com`. This also
requires all user subdomains to point to the same address, which is most easily
accomplished with wildcard DNS. Since this spreads the service across multiple
domains, you will need wildcard SSL as well. Unfortunately, for many
institutional domains, wildcard DNS and SSL are not available. **If you do plan
to serve untrusted users, enabling subdomains is highly encouraged**, as it
resolves the cross-site issues.
domains. This means the cross-origin protections between servers have the
desired effect, and user servers and the Hub are protected from each other.
**Subdomains are the only way to reliably isolate user servers from each other.**
To enable subdomains, set:
```python
c.JupyterHub.subdomain_host = "https://jupyter.example.org"
```
When subdomains are enabled, each user's single-user server will be at e.g. `https://username.jupyter.example.org`.
This also requires all user subdomains to point to the same address,
which is most easily accomplished with wildcard DNS, where a single A record points to your server and a wildcard CNAME record points to your A record:
```
A jupyter.example.org 192.168.1.123
CNAME *.jupyter.example.org jupyter.example.org
```
Since this spreads the service across multiple domains, you will likely need wildcard SSL as well,
matching `*.jupyter.example.org`.
Unfortunately, for many institutional domains, wildcard DNS and SSL may not be available.
We also **strongly encourage** serving JupyterHub and user content on a domain that is _not_ a subdomain of any sensitive content.
For reasoning, see [GitHub's discussion of moving user content to github.io from \*.github.com](https://github.blog/2013-04-09-yummy-cookies-across-domains/).
**If you do plan to serve untrusted users, enabling subdomains is highly encouraged**,
as it resolves many security issues that are difficult or impossible to avoid when JupyterHub is on a single domain.
:::{important}
JupyterHub makes no guarantees about protecting users from each other unless subdomains are enabled.
If you want to protect users from each other, you **_must_** enable per-user domains.
:::
### Disable user config
If subdomains are unavailable or undesirable, JupyterHub provides a
configuration option `Spawner.disable_user_config`, which can be set to prevent
configuration option `Spawner.disable_user_config = True`, which can be set to prevent
the user-owned configuration files from being loaded. After implementing this
option, `PATH`s and package installation are the other things that the
admin must enforce.
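For example, a minimal `jupyterhub_config.py` sketch of this option:
```python
# prevent single-user servers from loading user-owned configuration files
c.Spawner.disable_user_config = True
```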
@@ -89,23 +123,24 @@ admin must enforce.
For most Spawners, `PATH` is not something users can influence, but it's important that
the Spawner should _not_ evaluate shell configuration files prior to launching the server.
### Isolate packages using virtualenv
### Isolate packages in a read-only environment
Package isolation is most easily handled by running the single-user server in
a virtualenv with disabled system-site-packages. The user should not have
permission to install packages into this environment.
The user must not have permission to install packages into the environment where the single-user server runs.
On a shared system, package isolation is most easily handled by running the single-user server in
a root-owned virtualenv with disabled system-site-packages.
The user must not have permission to install packages into this environment.
The same principle extends to the images used by container-based deployments.
If users can select the images in which their servers run, they can disable all security.
If users can select the images in which their servers run, they can disable all security for their own servers.
It is important to note that the control over the environment only affects the
single-user server, and not the environment(s) in which the user's kernel(s)
It is important to note that the control over the environment is only required for the
single-user server, and not the environment(s) in which the users' kernel(s)
may run. Installing additional packages in the kernel environment does not
pose additional risk to the web application's security.
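As a sketch of enforcing an admin-controlled environment (the `/opt/jupyterhub/env` path below is an assumed example, not a JupyterHub default), the spawner can be pointed at a root-owned installation in `jupyterhub_config.py`:
```python
# launch single-user servers from a root-owned environment users cannot write to
# (the path below is hypothetical; use wherever your admin-managed env lives)
c.Spawner.cmd = ["/opt/jupyterhub/env/bin/jupyterhub-singleuser"]
c.Spawner.environment = {"PATH": "/opt/jupyterhub/env/bin:/usr/bin:/bin"}
```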
### Encrypt internal connections with SSL/TLS
By default, all communications on the server, between the proxy, hub, and single
-user notebooks are performed unencrypted. Setting the `internal_ssl` flag in
By default, all communications within JupyterHub—between the proxy, hub, and single
-user notebooks—are performed unencrypted. Setting the `internal_ssl` flag in
`jupyterhub_config.py` secures the aforementioned routes. Turning this
feature on does require that the enabled `Spawner` can use the certificates
generated by the `Hub` (the default `LocalProcessSpawner` can, for instance).
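For example, in `jupyterhub_config.py`:
```python
# encrypt hub <-> proxy <-> single-user traffic
c.JupyterHub.internal_ssl = True
```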
@@ -119,6 +154,104 @@ Unix permissions to the communication sockets thereby restricting
communication to the socket owner. The `internal_ssl` option will eventually
extend to securing the `tcp` sockets as well.
### Mitigating same-origin deployments
While per-user domains are **required** for robust protection of users from each other,
you can mitigate many (but not all) cross-user issues.
First, it is critical that users cannot modify their server environments, as described above.
Second, it is important that users do not have `access:servers` permission to any server other than their own.
If users can access each other's servers, additional security measures must be enabled, some of which come with distinct user-experience costs.
Without the [Same-Origin Policy] (SOP) protecting user servers from each other,
each user server is considered a trusted origin for requests to each other user server (and the Hub itself).
Servers _cannot_ meaningfully distinguish requests originating from other user servers,
because SOP implies a great deal of trust, losing many restrictions applied to cross-origin requests.
That means pages served from each user server can:
1. arbitrarily modify the path in the Referer
2. make fully authorized requests with cookies
3. access full page contents served from the hub or other servers via popups
JupyterHub uses distinct xsrf tokens stored in cookies on each server path to attempt to limit requests across user servers.
This has limitations because not all requests are protected by these XSRF tokens,
and unless additional measures are taken, the XSRF tokens from other user prefixes may be retrieved.
[Same-Origin Policy]: https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy
For example:
- `Content-Security-Policy` header must prohibit popups and iframes from the same origin.
The following Content-Security-Policy rules are _insecure_ and readily enable users to access each other's servers:
- `frame-ancestors 'self'`
- `frame-ancestors '*'`
- `sandbox allow-popups`
- Ideally, pages should use the strictest `Content-Security-Policy: sandbox` available,
but this is not feasible in general for JupyterLab pages, which need at least `sandbox allow-same-origin allow-scripts` to work.
The default Content-Security-Policy for single-user servers is
```
frame-ancestors 'none'
```
which prohibits iframe embedding, but not pop-ups.
A more secure Content-Security-Policy that has some costs to user experience is:
```
frame-ancestors 'none'; sandbox allow-same-origin allow-scripts
```
`allow-popups` is not disabled by default because disabling it breaks legitimate functionality, like "Open this in a new tab", and the "JupyterHub Control Panel" menu item.
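One way to apply the stricter policy is via the single-user server's tornado settings, sketched here under the assumption that servers run Jupyter Server and load a `jupyter_server_config.py`:
```python
# stricter CSP for single-user servers; trades away iframe embedding and some popup behavior
c.ServerApp.tornado_settings = {
    "headers": {
        "Content-Security-Policy": "frame-ancestors 'none'; sandbox allow-same-origin allow-scripts",
    }
}
```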
To reiterate, the right way to avoid these issues is to enable per-user domains, where none of these concerns come up.
Note: even this level of protection requires that administrators maintain full control over the user server environment.
If users can modify their server environment, these methods are ineffective, as users can readily disable them.
### Cookie tossing
Cookie tossing is a technique where another server on a subdomain or peer subdomain can set a cookie
which will be read on another domain.
This is not relevant unless there are other user-controlled servers on a peer domain.
"Domain-locked" cookies avoid this issue, but have their own restrictions:
- JupyterHub must be served over HTTPS
- All secure cookies must be set on `/`, not on sub-paths, which means they are shared by all JupyterHub components in a single-domain deployment.
As a result, this option is only recommended when per-user subdomains are enabled,
to prevent sending all jupyterhub cookies to all user servers.
To enable domain-locked cookies, set:
```python
c.JupyterHub.cookie_host_prefix_enabled = True
```
```{versionadded} 4.1
```
### Forced-login
Jupyter servers can share links with `?token=...`.
JupyterHub prior to 5.0 will accept this request and persist the token for future requests.
This is useful for enabling admins to create 'fully authenticated' links bypassing login.
However, it also means users can share their own links that will log other users into their own servers,
enabling them to serve each other notebooks and other arbitrary HTML, depending on server configuration.
```{versionadded} 4.1
Setting environment variable `JUPYTERHUB_ALLOW_TOKEN_IN_URL=0` in the single-user environment can opt out of accepting token auth in URL parameters.
```
```{versionadded} 5.0
Accepting tokens in URLs is disabled by default, and the `JUPYTERHUB_ALLOW_TOKEN_IN_URL=1` environment variable must be set to _allow_ token auth in URL parameters.
```
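For JupyterHub 4.1, one way to opt out is to pass the environment variable to spawned servers in `jupyterhub_config.py` (a sketch; merge with any environment you already set for spawners):
```python
# disable token-in-URL auth for single-user servers
c.Spawner.environment = {"JUPYTERHUB_ALLOW_TOKEN_IN_URL": "0"}
```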
## Security audits
We recommend that you do periodic reviews of your deployment's security. It's

View File

@@ -8,6 +8,57 @@ command line for details.
## [Unreleased]
## 4.1
### 4.1.0 - 2024-03
JupyterHub 4.1 is a security release, fixing [CVE-2024-28233].
All JupyterHub deployments are encouraged to upgrade,
especially those with other user content on peer domains of JupyterHub.
As always, JupyterHub deployments are especially encouraged to enable per-user domains if protecting users from each other is a concern.
For more information on securely deploying JupyterHub, see the [web security documentation](web-security).
[CVE-2024-28233]: https://github.com/jupyterhub/jupyterhub/security/advisories/GHSA-7r3h-4ph8-w38g
([full changelog](https://github.com/jupyterhub/jupyterhub/compare/4.0.2...4.1.0))
#### Enhancements made
- Backport PR #4628 on branch 4.x (Include LDAP groups in local spawner gids) [#4735](https://github.com/jupyterhub/jupyterhub/pull/4735) ([@minrk](https://github.com/minrk))
- Backport PR #4561 on branch 4.x (Improve debugging when waiting for servers) [#4714](https://github.com/jupyterhub/jupyterhub/pull/4714) ([@minrk](https://github.com/minrk))
- Backport PR #4563 on branch 4.x (only set 'domain' field on session-id cookie) [#4707](https://github.com/jupyterhub/jupyterhub/pull/4707) ([@minrk](https://github.com/minrk))
#### Bugs fixed
- Backport PR #4733 on branch 4.x (Catch ValueError while waiting for server to be reachable) [#4734](https://github.com/jupyterhub/jupyterhub/pull/4734) ([@minrk](https://github.com/minrk))
- Backport PR #4679 on branch 4.x (Unescape jinja username) [#4705](https://github.com/jupyterhub/jupyterhub/pull/4705) ([@minrk](https://github.com/minrk))
- Backport PR #4630: avoid setting unused oauth state cookies on API requests [#4697](https://github.com/jupyterhub/jupyterhub/pull/4697) ([@minrk](https://github.com/minrk))
- Backport PR #4632: simplify, avoid errors in parsing accept headers [#4696](https://github.com/jupyterhub/jupyterhub/pull/4696) ([@minrk](https://github.com/minrk))
- Backport PR #4677 on branch 4.x (Improve validation, docs for token.expires_in) [#4692](https://github.com/jupyterhub/jupyterhub/pull/4692) ([@minrk](https://github.com/minrk))
- Backport PR #4570 on branch 4.x (fix mutation of frozenset in scope intersection) [#4691](https://github.com/jupyterhub/jupyterhub/pull/4691) ([@minrk](https://github.com/minrk))
- Backport PR #4562 on branch 4.x (Use `user.stop` to cleanup spawners that stopped while Hub was down) [#4690](https://github.com/jupyterhub/jupyterhub/pull/4690) ([@minrk](https://github.com/minrk))
- Backport PR #4542 on branch 4.x (Fix include_stopped_servers in paginated next_url) [#4689](https://github.com/jupyterhub/jupyterhub/pull/4689) ([@minrk](https://github.com/minrk))
- Backport PR #4651 on branch 4.x (avoid attempting to patch removed IPythonHandler with notebook v7) [#4688](https://github.com/jupyterhub/jupyterhub/pull/4688) ([@minrk](https://github.com/minrk))
- Backport PR #4560 on branch 4.x (singleuser extension: persist token from ?token=... url in cookie) [#4687](https://github.com/jupyterhub/jupyterhub/pull/4687) ([@minrk](https://github.com/minrk))
#### Maintenance and upkeep improvements
- Backport quay.io publishing [#4698](https://github.com/jupyterhub/jupyterhub/pull/4698) ([@minrk](https://github.com/minrk))
- Backport PR #4617: try to improve reliability of test_external_proxy [#4695](https://github.com/jupyterhub/jupyterhub/pull/4695) ([@minrk](https://github.com/minrk))
- Backport PR #4618 on branch 4.x (browser test: wait for token request to finish before reloading) [#4694](https://github.com/jupyterhub/jupyterhub/pull/4694) ([@minrk](https://github.com/minrk))
- preparing 4.x branch [#4685](https://github.com/jupyterhub/jupyterhub/pull/4685) ([@minrk](https://github.com/minrk), [@consideRatio](https://github.com/consideRatio))
#### Contributors to this release
The following people contributed discussions, new ideas, code and documentation contributions, and review.
See [our definition of contributors](https://github-activity.readthedocs.io/en/latest/#how-does-this-tool-define-contributions-in-the-reports).
([GitHub contributors page for this release](https://github.com/jupyterhub/jupyterhub/graphs/contributors?from=2023-08-10&to=2024-03-19&type=c))
@Achele ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3AAchele+updated%3A2023-08-10..2024-03-19&type=Issues)) | @akashthedeveloper ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Aakashthedeveloper+updated%3A2023-08-10..2024-03-19&type=Issues)) | @balajialg ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Abalajialg+updated%3A2023-08-10..2024-03-19&type=Issues)) | @BhavyaT-135 ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3ABhavyaT-135+updated%3A2023-08-10..2024-03-19&type=Issues)) | @blink1073 ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Ablink1073+updated%3A2023-08-10..2024-03-19&type=Issues)) | @consideRatio ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3AconsideRatio+updated%3A2023-08-10..2024-03-19&type=Issues)) | @fcollonval ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Afcollonval+updated%3A2023-08-10..2024-03-19&type=Issues)) | @I-Am-D-B ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3AI-Am-D-B+updated%3A2023-08-10..2024-03-19&type=Issues)) | @jakirkham ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Ajakirkham+updated%3A2023-08-10..2024-03-19&type=Issues)) | @ktaletsk ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Aktaletsk+updated%3A2023-08-10..2024-03-19&type=Issues)) | @kzgrzendek ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Akzgrzendek+updated%3A2023-08-10..2024-03-19&type=Issues)) | @lumberbot-app ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Alumberbot-app+updated%3A2023-08-10..2024-03-19&type=Issues)) | @manics ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Amanics+updated%3A2023-08-10..2024-03-19&type=Issues)) | @mbiette ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Ambiette+updated%3A2023-08-10..2024-03-19&type=Issues)) | @minrk ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Aminrk+updated%3A2023-08-10..2024-03-19&type=Issues)) | @rcthomas ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Arcthomas+updated%3A2023-08-10..2024-03-19&type=Issues)) | @ryanlovett ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Aryanlovett+updated%3A2023-08-10..2024-03-19&type=Issues)) | @sgaist ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Asgaist+updated%3A2023-08-10..2024-03-19&type=Issues)) | @shubham0473 ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Ashubham0473+updated%3A2023-08-10..2024-03-19&type=Issues)) | @Temidayo32 ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3ATemidayo32+updated%3A2023-08-10..2024-03-19&type=Issues)) | @willingc ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Awillingc+updated%3A2023-08-10..2024-03-19&type=Issues)) | @yuvipanda ([activity](https://github.com/search?q=repo%3Ajupyterhub%2Fjupyterhub+involves%3Ayuvipanda+updated%3A2023-08-10..2024-03-19&type=Issues))
## 4.0
### 4.0.2 - 2023-08-10

jupyterhub/_xsrf_utils.py (new file)
View File

@@ -0,0 +1,210 @@
"""utilities for XSRF
Extends tornado's xsrf token checks with the following:
- only set xsrf cookie on navigation requests (cannot be fetched)
This utility file enables the consistent reuse of these functions
in both Hub and single-user code
"""
import base64
import hashlib
from datetime import datetime, timedelta, timezone
from http.cookies import SimpleCookie
from tornado import web
from tornado.httputil import format_timestamp
from tornado.log import app_log
def _get_signed_value_urlsafe(handler, name, b64_value):
"""Like get_signed_value (used in get_secure_cookie), but for urlsafe values
Decodes urlsafe_base64-encoded signed values
Returns None if any decoding failed
"""
if b64_value is None:
return None
if isinstance(b64_value, str):
try:
b64_value = b64_value.encode("ascii")
except UnicodeEncodeError:
app_log.warning("Invalid value %r", b64_value)
return None
# re-pad, since we stripped padding in _create_signed_value
remainder = len(b64_value) % 4
if remainder:
b64_value = b64_value + (b'=' * (4 - remainder))
try:
value = base64.urlsafe_b64decode(b64_value)
except ValueError:
app_log.warning("Invalid base64 value %r", b64_value)
return None
return web.decode_signed_value(
handler.application.settings["cookie_secret"],
name,
value,
max_age_days=31,
min_version=2,
)
def _create_signed_value_urlsafe(handler, name, value):
"""Like tornado's create_signed_value (used in set_secure_cookie), but returns urlsafe bytes"""
signed_value = handler.create_signed_value(name, value)
return base64.urlsafe_b64encode(signed_value).rstrip(b"=")
def _clear_invalid_xsrf_cookie(handler, cookie_path):
"""
Clear invalid XSRF cookie
This may be an old XSRF token, or one set on / by another application.
Because we cannot trust browsers or tornado to give us the more specific cookie,
try to clear _both_ on / and on our prefix,
then reload the page.
"""
expired = format_timestamp(datetime.now(timezone.utc) - timedelta(days=366))
cookie = SimpleCookie()
cookie["_xsrf"] = ""
morsel = cookie["_xsrf"]
morsel["expires"] = expired
morsel["path"] = "/"
# use Set-Cookie directly,
# because tornado's set_cookie and clear_cookie use a single _dict_,
# so we can't clear a cookie on multiple paths and then set it
handler.add_header("Set-Cookie", morsel.OutputString(None))
if cookie_path != "/":
# clear it multiple times!
morsel["path"] = cookie_path
handler.add_header("Set-Cookie", morsel.OutputString(None))
if (
handler.request.method.lower() == "get"
and handler.request.headers.get("Sec-Fetch-Mode", "navigate") == "navigate"
):
# reload current page because any subsequent set_cookie
# will cancel the clearing of the cookie
# this only makes sense on GET requests
handler.redirect(handler.request.uri)
# halt any other processing of the request
raise web.Finish()
def get_xsrf_token(handler, cookie_path=""):
"""Override tornado's xsrf token to add further restrictions
- only set cookie for regular pages (not API requests)
- include login info in xsrf token
- verify signature
"""
# original: https://github.com/tornadoweb/tornado/blob/v6.4.0/tornado/web.py#L1455
if hasattr(handler, "_xsrf_token"):
return handler._xsrf_token
_set_cookie = False
# the raw cookie is the token
xsrf_token = xsrf_cookie = handler.get_cookie("_xsrf")
if xsrf_token:
try:
xsrf_token = xsrf_token.encode("ascii")
except UnicodeEncodeError:
xsrf_token = None
xsrf_id_cookie = _get_signed_value_urlsafe(handler, "_xsrf", xsrf_token)
if xsrf_cookie and not xsrf_id_cookie:
# we have a cookie, but it's invalid!
# handle possibility of _xsrf being set multiple times,
# e.g. on / and on /hub/
# this will reload the page if it's a GET request
app_log.warning(
"Attempting to clear invalid _xsrf cookie %r", xsrf_cookie[:4] + "..."
)
_clear_invalid_xsrf_cookie(handler, cookie_path)
# check the decoded, signed value for validity
xsrf_id = handler._xsrf_token_id
if xsrf_id_cookie != xsrf_id:
# this will usually happen on the first page request after login,
# which changes the inputs to the token id
if xsrf_id_cookie:
app_log.debug("xsrf id mismatch %r != %r", xsrf_id_cookie, xsrf_id)
# generate new value
xsrf_token = _create_signed_value_urlsafe(handler, "_xsrf", xsrf_id)
# only set cookie on regular navigation pages
# i.e. not API requests, etc.
# insecure URLs (public hostname/ip, no https)
# don't set Sec-Fetch-Mode.
# consequence of assuming 'navigate': setting a cookie unnecessarily
# consequence of assuming not 'navigate': xsrf never set, nothing works
_set_cookie = (
handler.request.headers.get("Sec-Fetch-Mode", "navigate") == "navigate"
)
if _set_cookie:
xsrf_cookie_kwargs = {}
xsrf_cookie_kwargs.update(handler.settings.get('xsrf_cookie_kwargs', {}))
xsrf_cookie_kwargs.setdefault("path", cookie_path)
if not handler.current_user:
# limit anonymous xsrf cookies to one hour
xsrf_cookie_kwargs.pop("expires", None)
xsrf_cookie_kwargs.pop("expires_days", None)
xsrf_cookie_kwargs["max_age"] = 3600
app_log.info(
"Setting new xsrf cookie for %r %r",
xsrf_id,
xsrf_cookie_kwargs,
)
handler.set_cookie("_xsrf", xsrf_token, **xsrf_cookie_kwargs)
handler._xsrf_token = xsrf_token
return xsrf_token
def check_xsrf_cookie(handler):
"""Check that xsrf cookie matches xsrf token in request"""
# overrides tornado's implementation
# because we changed what a correct value should be in xsrf_token
token = (
handler.get_argument("_xsrf", None)
or handler.request.headers.get("X-Xsrftoken")
or handler.request.headers.get("X-Csrftoken")
)
if not token:
raise web.HTTPError(
403, f"'_xsrf' argument missing from {handler.request.method}"
)
try:
token = token.encode("utf8")
except UnicodeEncodeError:
raise web.HTTPError(403, "'_xsrf' argument invalid")
if token != handler.xsrf_token:
raise web.HTTPError(
403, f"XSRF cookie does not match {handler.request.method.upper()} argument"
)
def _anonymous_xsrf_id(handler):
"""Generate an appropriate xsrf token id for an anonymous request
Currently uses hash of request ip and user-agent
These are typically used only for the initial login page,
so only need to be valid for a few seconds to a few minutes
(enough to submit a login form with MFA).
"""
hasher = hashlib.sha256()
hasher.update(handler.request.remote_ip.encode("ascii"))
hasher.update(
handler.request.headers.get("User-Agent", "").encode("utf8", "replace")
)
return base64.urlsafe_b64encode(hasher.digest()).decode("ascii")

View File

@@ -76,15 +76,8 @@ class APIHandler(BaseHandler):
return True
async def prepare(self):
await super().prepare()
# tornado only checks xsrf on non-GET
# we also check xsrf on GETs to API endpoints
# make sure this runs after auth, which happens in super().prepare()
if self.request.method not in {"HEAD", "OPTIONS"} and self.settings.get(
"xsrf_cookies"
):
self.check_xsrf_cookie()
_xsrf_safe_methods = {"HEAD", "OPTIONS"}
def check_xsrf_cookie(self):
if not hasattr(self, '_jupyterhub_user'):

View File

@@ -402,6 +402,25 @@ class JupyterHub(Application):
Useful for daemonizing JupyterHub.
""",
).tag(config=True)
cookie_host_prefix_enabled = Bool(
False,
help="""Enable `__Host-` prefix on authentication cookies.
The `__Host-` prefix on JupyterHub cookies provides further
protection against cookie tossing when untrusted servers
may control subdomains of your jupyterhub deployment.
_However_, it also requires that cookies be set on the path `/`,
which means they are shared by all JupyterHub components,
so a compromised server component will have access to _all_ JupyterHub-related
cookies of the visiting browser.
It is recommended to only combine `__Host-` cookies with per-user domains.
.. versionadded:: 4.1
""",
).tag(config=True)
cookie_max_age_days = Float(
14,
help="""Number of days for a login cookie to be valid.
@@ -2034,6 +2053,8 @@ class JupyterHub(Application):
hub_args['port'] = self.hub_port
self.hub = Hub(**hub_args)
if self.cookie_host_prefix_enabled:
self.hub.cookie_name = "__Host-" + self.hub.cookie_name
if not self.subdomain_host:
api_prefix = url_path_join(self.hub.base_url, "api/")
@@ -3051,6 +3072,7 @@ class JupyterHub(Application):
default_url=self.default_url,
public_url=urlparse(self.public_url) if self.public_url else "",
cookie_secret=self.cookie_secret,
cookie_host_prefix_enabled=self.cookie_host_prefix_enabled,
cookie_max_age_days=self.cookie_max_age_days,
redirect_to_server=self.redirect_to_server,
login_url=login_url,

View File

@@ -24,6 +24,7 @@ from tornado.log import app_log
from tornado.web import RequestHandler, addslash
from .. import __version__, orm, roles, scopes
from .._xsrf_utils import _anonymous_xsrf_id, check_xsrf_cookie, get_xsrf_token
from ..metrics import (
PROXY_ADD_DURATION_SECONDS,
PROXY_DELETE_DURATION_SECONDS,
@@ -100,7 +101,14 @@ class BaseHandler(RequestHandler):
self.log.error("Rolling back session due to database error")
self.db.rollback()
self._resolve_roles_and_scopes()
return await maybe_future(super().prepare())
await maybe_future(super().prepare())
# run xsrf check after prepare
# because our version takes auth info into account
if (
self.request.method not in self._xsrf_safe_methods
and self.application.settings.get("xsrf_cookies")
):
self.check_xsrf_cookie()
@property
def log(self):
@@ -205,9 +213,13 @@ class BaseHandler(RequestHandler):
"""The default Content-Security-Policy header
Can be overridden by defining Content-Security-Policy in settings['headers']
.. versionchanged:: 4.1
Change default frame-ancestors from 'self' to 'none'
"""
return '; '.join(
["frame-ancestors 'self'", "report-uri " + self.csp_report_uri]
["frame-ancestors 'none'", "report-uri " + self.csp_report_uri]
)
def get_content_type(self):
@@ -217,7 +229,6 @@ class BaseHandler(RequestHandler):
"""
Set any headers passed as tornado_settings['headers'].
By default sets Content-Security-Policy of frame-ancestors 'self'.
Also responsible for setting content-type header
"""
# wrap in HTTPHeaders for case-insensitivity
@@ -239,17 +250,63 @@ class BaseHandler(RequestHandler):
# Login and cookie-related
# ---------------------------------------------------------------
_xsrf_safe_methods = {"GET", "HEAD", "OPTIONS"}
@property
def _xsrf_token_id(self):
"""Value to be signed/encrypted for xsrf token
include login info in xsrf token
this means xsrf tokens are tied to logged-in users,
and change after a user logs in.
While the user is not yet logged in,
an anonymous value is used, to prevent portability.
These anonymous values are short-lived.
"""
# cases:
# 1. logged in, session id (session_id:user_id)
# 2. logged in, no session id (anonymous_id:user_id)
# 3. not logged in, session id (session_id:anonymous_id)
# 4. no cookies at all, use single anonymous value (:anonymous_id)
session_id = self.get_session_cookie()
if self.current_user:
if isinstance(self.current_user, User):
user_id = self.current_user.cookie_id
else:
# this shouldn't happen, but may if e.g. a Service attempts to fetch a page,
# which usually won't work, but this method should not be what raises
user_id = ""
if not session_id:
# no session id, use non-portable anonymous id
session_id = _anonymous_xsrf_id(self)
else:
# not logged in yet, use non-portable anonymous id
user_id = _anonymous_xsrf_id(self)
xsrf_id = f"{session_id}:{user_id}".encode("utf8", "replace")
return xsrf_id
@property
def xsrf_token(self):
"""Override tornado's xsrf token with further restrictions
- only set cookie for regular pages
- include login info in xsrf token
- verify signature
"""
return get_xsrf_token(self, cookie_path=self.hub.base_url)
def check_xsrf_cookie(self):
try:
return super().check_xsrf_cookie()
except web.HTTPError as e:
# ensure _jupyterhub_user is defined on rejected requests
"""Check that xsrf cookie matches xsrf token in request"""
# overrides tornado's implementation
# because we changed what a correct value should be in xsrf_token
if not hasattr(self, "_jupyterhub_user"):
self._jupyterhub_user = None
self._resolve_roles_and_scopes()
# rewrite message because we use this on methods other than POST
e.log_message = e.log_message.replace("POST", self.request.method)
raise
# run too early to check the value
# tornado runs this before 'prepare',
# but we run it again after so auth info is available, which happens in 'prepare'
return None
return check_xsrf_cookie(self)
@property
def admin_users(self):
@@ -526,6 +583,16 @@ class BaseHandler(RequestHandler):
user = self._user_from_orm(u)
return user
def clear_cookie(self, cookie_name, **kwargs):
"""Clear a cookie
overrides RequestHandler to always handle __Host- prefix correctly
"""
if cookie_name.startswith("__Host-"):
kwargs["path"] = "/"
kwargs["secure"] = True
return super().clear_cookie(cookie_name, **kwargs)
def clear_login_cookie(self, name=None):
kwargs = {}
user = self.get_current_user_cookie()
@@ -597,6 +664,11 @@ class BaseHandler(RequestHandler):
kwargs.update(self.settings.get('cookie_options', {}))
kwargs.update(overrides)
if key.startswith("__Host-"):
# __Host- cookies must be secure and on /
kwargs["path"] = "/"
kwargs["secure"] = True
if encrypted:
set_cookie = self.set_secure_cookie
else:
@@ -626,7 +698,9 @@ class BaseHandler(RequestHandler):
Session id cookie is *not* encrypted,
so other services on this domain can read it.
"""
session_id = uuid.uuid4().hex
if not hasattr(self, "_session_id"):
self._session_id = uuid.uuid4().hex
session_id = self._session_id
# if using subdomains, set session cookie on the domain,
# which allows it to be shared by subdomains.
# if domain is unspecified, it is _more_ restricted to only the setting domain

View File

@@ -35,6 +35,7 @@ import socket
import string
import time
import warnings
from functools import partial
from http import HTTPStatus
from unittest import mock
from urllib.parse import urlencode, urlparse
@@ -45,6 +46,7 @@ from tornado.log import app_log
from tornado.web import HTTPError, RequestHandler
from traitlets import (
Any,
Bool,
Dict,
Instance,
Integer,
@@ -56,8 +58,9 @@ from traitlets import (
)
from traitlets.config import SingletonConfigurable
from .._xsrf_utils import _anonymous_xsrf_id, check_xsrf_cookie, get_xsrf_token
from ..scopes import _intersect_expanded_scopes
from ..utils import get_browser_protocol, url_path_join
from ..utils import _bool_env, get_browser_protocol, url_path_join
def check_scopes(required_scopes, scopes):
@@ -356,6 +359,46 @@ class HubAuth(SingletonConfigurable):
""",
).tag(config=True)
allow_token_in_url = Bool(
_bool_env("JUPYTERHUB_ALLOW_TOKEN_IN_URL", default=True),
help="""Allow requests to pages with ?token=... in the URL
This allows starting a user session by sharing a URL with credentials,
bypassing authentication with the Hub.
If False, tokens in URLs will be ignored by the server,
except on websocket requests.
Has no effect on websocket requests,
which can only reliably authenticate via token in the URL,
as recommended by browser Websocket implementations.
This will default to False in JupyterHub 5.
.. versionadded:: 4.1
.. versionchanged:: 5.0
default changed to False
""",
).tag(config=True)
allow_websocket_cookie_auth = Bool(
_bool_env("JUPYTERHUB_ALLOW_WEBSOCKET_COOKIE_AUTH", default=True),
help="""Allow websocket requests with only cookie for authentication
Cookie-authenticated websockets cannot be protected from other user servers unless per-user domains are used.
Disabling cookie auth on websockets protects user servers from each other,
but may break some user applications.
Per-user domains eliminate the need to lock this down.
JupyterLab 4.1.2 and Notebook 6.5.6, 7.1.0 will not work
because they rely on cookie authentication without
API or XSRF tokens.
.. versionadded:: 4.1
""",
).tag(config=True)
cookie_options = Dict(
help="""Additional options to pass when setting cookies.
@@ -374,6 +417,40 @@ class HubAuth(SingletonConfigurable):
else:
return {}
cookie_host_prefix_enabled = Bool(
False,
help="""Enable `__Host-` prefix on authentication cookies.
The `__Host-` prefix on JupyterHub cookies provides further
protection against cookie tossing when untrusted servers
may control subdomains of your jupyterhub deployment.
_However_, it also requires that cookies be set on the path `/`,
which means they are shared by all JupyterHub components,
so a compromised server component will have access to _all_ JupyterHub-related
cookies of the visiting browser.
It is recommended to only combine `__Host-` cookies with per-user domains.
Set via $JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED
""",
).tag(config=True)
@default("cookie_host_prefix_enabled")
def _default_cookie_host_prefix_enabled(self):
return _bool_env("JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED")
@property
def cookie_path(self):
"""
Path prefix on which to set cookies
self.base_url, but '/' when cookie_host_prefix_enabled is True
"""
if self.cookie_host_prefix_enabled:
return "/"
else:
return self.base_url
cookie_cache_max_age = Integer(help="DEPRECATED. Use cache_max_age")
@observe('cookie_cache_max_age')
@@ -636,6 +713,17 @@ class HubAuth(SingletonConfigurable):
auth_header_name = 'Authorization'
auth_header_pat = re.compile(r'(?:token|bearer)\s+(.+)', re.IGNORECASE)
def _get_token_url(self, handler):
"""Get the token from the URL
Always run for websockets,
otherwise run only if self.allow_token_in_url
"""
fetch_mode = handler.request.headers.get("Sec-Fetch-Mode", "unspecified")
if self.allow_token_in_url or fetch_mode == "websocket":
return handler.get_argument("token", "")
return ""
def get_token(self, handler, in_cookie=True):
"""Get the token authenticating a request
@@ -651,8 +739,7 @@ class HubAuth(SingletonConfigurable):
Args:
handler (tornado.web.RequestHandler): the current request handler
"""
user_token = handler.get_argument('token', '')
user_token = self._get_token_url(handler)
if not user_token:
# get it from Authorization header
m = self.auth_header_pat.match(
@@ -702,6 +789,14 @@ class HubAuth(SingletonConfigurable):
"""
return self._call_coroutine(sync, self._get_user, handler)
def _patch_xsrf(self, handler):
"""Overridden in HubOAuth
HubAuth base class doesn't handle xsrf,
which is only relevant for cookie-based auth
"""
return
async def _get_user(self, handler):
# only allow this to be called once per handler
# avoids issues if an error is raised,
@@ -709,6 +804,9 @@ class HubAuth(SingletonConfigurable):
if hasattr(handler, '_cached_hub_user'):
return handler._cached_hub_user
# patch XSRF checks, which will apply after user check
self._patch_xsrf(handler)
handler._cached_hub_user = user_model = None
session_id = self.get_session_id(handler)
@@ -794,7 +892,10 @@ class HubOAuth(HubAuth):
because we don't want to use the same cookie name
across OAuth clients.
"""
return self.oauth_client_id
cookie_name = self.oauth_client_id
if self.cookie_host_prefix_enabled:
cookie_name = "__Host-" + cookie_name
return cookie_name
@property
def state_cookie_name(self):
@@ -806,22 +907,103 @@ class HubOAuth(HubAuth):
def _get_token_cookie(self, handler):
"""Base class doesn't store tokens in cookies"""
fetch_mode = handler.request.headers.get("Sec-Fetch-Mode", "unset")
if fetch_mode == "websocket" and not self.allow_websocket_cookie_auth:
# disallow cookie auth on websockets
return None
token = handler.get_secure_cookie(self.cookie_name)
if token:
# decode cookie bytes
token = token.decode('ascii', 'replace')
return token
async def _get_user_cookie(self, handler):
def _get_xsrf_token_id(self, handler):
"""Get contents for xsrf token for a given Handler
This is the value to be encrypted & signed in the xsrf token
"""
token = self._get_token_cookie(handler)
session_id = self.get_session_id(handler)
if token:
token_hash = hashlib.sha256(token.encode("ascii", "replace")).hexdigest()
if not session_id:
session_id = _anonymous_xsrf_id(handler)
else:
token_hash = _anonymous_xsrf_id(handler)
return f"{session_id}:{token_hash}".encode("ascii", "replace")
def _patch_xsrf(self, handler):
"""Patch handler to inject JuptyerHub xsrf token behavior"""
handler._xsrf_token_id = self._get_xsrf_token_id(handler)
# override xsrf_token property on class,
# so it's still a getter, not invoked immediately
handler.__class__.xsrf_token = property(
partial(get_xsrf_token, cookie_path=self.base_url)
)
handler.check_xsrf_cookie = partial(self.check_xsrf_cookie, handler)
def check_xsrf_cookie(self, handler):
"""check_xsrf_cookie patch
Applies JupyterHub check_xsrf_cookie if not token authenticated
"""
if getattr(handler, '_token_authenticated', False):
return
check_xsrf_cookie(handler)
def _clear_cookie(self, handler, cookie_name, **kwargs):
"""Clear a cookie, handling __Host- prefix"""
# Set-Cookie is rejected without 'secure',
# this includes clearing cookies!
if cookie_name.startswith("__Host-"):
kwargs["path"] = "/"
kwargs["secure"] = True
return handler.clear_cookie(cookie_name, **kwargs)
def _needs_check_xsrf(self, handler):
"""Does the given cookie-authenticated request need to check xsrf?"""
if getattr(handler, "_token_authenticated", False):
return False
fetch_mode = handler.request.headers.get("Sec-Fetch-Mode", "unspecified")
if fetch_mode in {"websocket", "no-cors"} or (
fetch_mode in {"navigate", "unspecified"}
and handler.request.method.lower() in {"get", "head", "options"}
):
# no xsrf check needed for regular page views or no-cors
# or websockets after allow_websocket_cookie_auth passes
if fetch_mode == "unspecified":
self.log.warning(
f"Skipping XSRF check for insecure request {handler.request.method} {handler.request.path}"
)
return False
else:
return True
async def _get_user_cookie(self, handler):
# check xsrf if needed
token = self._get_token_cookie(handler)
session_id = self.get_session_id(handler)
if token and self._needs_check_xsrf(handler):
try:
self.check_xsrf_cookie(handler)
except HTTPError as e:
self.log.error(
f"Not accepting cookie auth on {handler.request.method} {handler.request.path}: {e}"
)
# don't proceed with cookie auth unless xsrf is okay
# don't raise either, because that makes a mess
return None
if token:
user_model = await self.user_for_token(
token, session_id=session_id, sync=False
)
if user_model is None:
app_log.warning("Token stored in cookie may have expired")
handler.clear_cookie(self.cookie_name)
self._clear_cookie(handler, self.cookie_name, path=self.cookie_path)
return user_model
# HubOAuth API
@@ -962,7 +1144,7 @@ class HubOAuth(HubAuth):
cookie_name = self.state_cookie_name
state_id = self.generate_state(next_url, **extra_state)
kwargs = {
'path': self.base_url,
'path': self.cookie_path,
'httponly': True,
# Expire oauth state cookie in ten minutes.
# Usually this will be cleared by completed login
@@ -1020,9 +1202,9 @@ class HubOAuth(HubAuth):
"""Clear persisted oauth state"""
for cookie_name, cookie in handler.request.cookies.items():
if cookie_name.startswith(self.state_cookie_name):
handler.clear_cookie(
self._clear_cookie(
cookie_name,
path=self.base_url,
path=self.cookie_path,
)
def _decode_state(self, state_id, /):
@@ -1044,8 +1226,11 @@ class HubOAuth(HubAuth):
def set_cookie(self, handler, access_token):
"""Set a cookie recording OAuth result"""
kwargs = {'path': self.base_url, 'httponly': True}
if get_browser_protocol(handler.request) == 'https':
kwargs = {'path': self.cookie_path, 'httponly': True}
if (
get_browser_protocol(handler.request) == 'https'
or self.cookie_host_prefix_enabled
):
kwargs['secure'] = True
# load user cookie overrides
kwargs.update(self.cookie_options)
@@ -1063,7 +1248,7 @@ class HubOAuth(HubAuth):
Args:
handler (tornado.web.RequestHandler): the current request handler
"""
handler.clear_cookie(self.cookie_name, path=self.base_url)
self._clear_cookie(handler, self.cookie_name, path=self.cookie_path)
class UserNotAllowed(Exception):
@@ -1275,7 +1460,7 @@ class HubAuthenticated:
return
try:
self._hub_auth_user_cache = self.check_hub_user(user_model)
except UserNotAllowed as e:
except UserNotAllowed:
# cache None, in case get_user is called again while processing the error
self._hub_auth_user_cache = None
@@ -1297,6 +1482,25 @@ class HubAuthenticated:
self.hub_auth._persist_url_token_if_set(self)
return self._hub_auth_user_cache
@property
def _xsrf_token_id(self):
if hasattr(self, "__xsrf_token_id"):
return self.__xsrf_token_id
if not isinstance(self.hub_auth, HubOAuth):
return ""
return self.hub_auth._get_xsrf_token_id(self)
@_xsrf_token_id.setter
def _xsrf_token_id(self, value):
self.__xsrf_token_id = value
@property
def xsrf_token(self):
return get_xsrf_token(self, cookie_path=self.hub_auth.base_url)
def check_xsrf_cookie(self):
return self.hub_auth.check_xsrf_cookie(self)
class HubOAuthenticated(HubAuthenticated):
"""Simple subclass of HubAuthenticated using OAuth instead of old shared cookies"""
@@ -1332,7 +1536,7 @@ class HubOAuthCallbackHandler(HubOAuthenticated, RequestHandler):
cookie_state = self.get_secure_cookie(cookie_name)
# clear cookie state now that we've consumed it
if cookie_state:
self.clear_cookie(cookie_name, path=self.hub_auth.base_url)
self.hub_auth.clear_oauth_state_cookies(self)
else:
# completing oauth with stale state, but already logged in.
# stop here and redirect to default URL
@@ -1349,8 +1553,13 @@ class HubOAuthCallbackHandler(HubOAuthenticated, RequestHandler):
# check that state matches
if arg_state != cookie_state:
app_log.warning("oauth state %r != %r", arg_state, cookie_state)
raise HTTPError(403, "OAuth state does not match. Try logging in again.")
app_log.warning(
"oauth state argument %r != cookie %s=%r",
arg_state,
cookie_name,
cookie_state,
)
raise HTTPError(403, "oauth state does not match. Try logging in again.")
next_url = self.hub_auth.get_next_url(cookie_state)
# clear consumed state from _oauth_states cache now that we're done with it
self.hub_auth.clear_oauth_state(cookie_state)

View File

@@ -44,6 +44,7 @@ from jupyterhub._version import __version__, _check_version
from jupyterhub.log import log_request
from jupyterhub.services.auth import HubOAuth, HubOAuthCallbackHandler
from jupyterhub.utils import (
_bool_env,
exponential_backoff,
isoformat,
make_ssl_context,
@@ -55,17 +56,6 @@ from ._disable_user_config import _disable_user_config
SINGLEUSER_TEMPLATES_DIR = str(Path(__file__).parent.resolve().joinpath("templates"))
def _bool_env(key):
"""Cast an environment variable to bool
0, empty, or unset is False; All other values are True.
"""
if os.environ.get(key, "") in {"", "0"}:
return False
else:
return True
def _exclude_home(path_list):
"""Filter out any entries in a path list that are in my home directory.
@@ -127,6 +117,9 @@ class JupyterHubIdentityProvider(IdentityProvider):
# HubAuth gets most of its config from the environment
return HubOAuth(parent=self)
def _patch_xsrf(self, handler):
self.hub_auth._patch_xsrf(handler)
def _patch_get_login_url(self, handler):
original_get_login_url = handler.get_login_url
@@ -161,6 +154,7 @@ class JupyterHubIdentityProvider(IdentityProvider):
if hasattr(handler, "_jupyterhub_user"):
return handler._jupyterhub_user
self._patch_get_login_url(handler)
self._patch_xsrf(handler)
user = await self.hub_auth.get_user(handler, sync=False)
if user is None:
handler._jupyterhub_user = None
@@ -632,6 +626,9 @@ class JupyterHubSingleUser(ExtensionApp):
app.web_app.settings["page_config_hook"] = (
app.identity_provider.page_config_hook
)
# disable xsrf_cookie checks by Tornado, which run too early
# checks in Jupyter Server are unconditional
app.web_app.settings["xsrf_cookies"] = False
# if the user has configured a log function in the tornado settings, do not override it
if not 'log_function' in app.config.ServerApp.get('tornado_settings', {}):
app.web_app.settings["log_function"] = log_request
@@ -642,6 +639,9 @@ class JupyterHubSingleUser(ExtensionApp):
# check jupyterhub version
app.io_loop.run_sync(self.check_hub_version)
# set default CSP to prevent iframe embedding across jupyterhub components
headers.setdefault("Content-Security-Policy", "frame-ancestors 'none'")
async def _start_activity():
self._activity_task = asyncio.ensure_future(self.keep_activity_updated())

View File

@@ -45,21 +45,15 @@ from traitlets.config import Configurable
from .._version import __version__, _check_version
from ..log import log_request
from ..services.auth import HubOAuth, HubOAuthCallbackHandler, HubOAuthenticated
from ..utils import exponential_backoff, isoformat, make_ssl_context, url_path_join
from ..utils import (
_bool_env,
exponential_backoff,
isoformat,
make_ssl_context,
url_path_join,
)
from ._disable_user_config import _disable_user_config, _exclude_home
def _bool_env(key):
"""Cast an environment variable to bool
0, empty, or unset is False; All other values are True.
"""
if os.environ.get(key, "") in {"", "0"}:
return False
else:
return True
# Authenticate requests with the Hub
@@ -683,10 +677,10 @@ class SingleUserNotebookAppMixin(Configurable):
)
headers = s.setdefault('headers', {})
headers['X-JupyterHub-Version'] = __version__
# set CSP header directly to workaround bugs in jupyter/notebook 5.0
# set default CSP to prevent iframe embedding across jupyterhub components
headers.setdefault(
'Content-Security-Policy',
';'.join(["frame-ancestors 'self'", "report-uri " + csp_report_uri]),
';'.join(["frame-ancestors 'none'", "report-uri " + csp_report_uri]),
)
super().init_webapp()

View File

@@ -163,6 +163,7 @@ class Spawner(LoggingConfigurable):
hub = Any()
orm_spawner = Any()
cookie_options = Dict()
cookie_host_prefix_enabled = Bool()
public_url = Unicode(help="Public URL of this spawner's server")
public_hub_url = Unicode(help="Public URL of the Hub itself")
@@ -1006,6 +1007,10 @@ class Spawner(LoggingConfigurable):
env['JUPYTERHUB_CLIENT_ID'] = self.oauth_client_id
if self.cookie_options:
env['JUPYTERHUB_COOKIE_OPTIONS'] = json.dumps(self.cookie_options)
env["JUPYTERHUB_COOKIE_HOST_PREFIX_ENABLED"] = str(
int(self.cookie_host_prefix_enabled)
)
env['JUPYTERHUB_HOST'] = self.hub.public_host
env['JUPYTERHUB_OAUTH_CALLBACK_URL'] = url_path_join(
self.user.url, url_escape_path(self.name), 'oauth_callback'

View File

@@ -1,6 +1,8 @@
"""Tests for the Playwright Python"""
import asyncio
import json
import pprint
import re
from unittest import mock
from urllib.parse import parse_qs, urlparse
@@ -11,7 +13,7 @@ from tornado.escape import url_escape
from tornado.httputil import url_concat
from jupyterhub import orm, roles, scopes
from jupyterhub.tests.utils import public_host, public_url, ujoin
from jupyterhub.tests.utils import async_requests, public_host, public_url, ujoin
from jupyterhub.utils import url_escape_path, url_path_join
pytestmark = pytest.mark.browser
@@ -44,7 +46,7 @@ async def test_submit_login_form(app, browser, user_special_chars):
login_url = url_path_join(public_host(app), app.hub.base_url, "login")
await browser.goto(login_url)
await login(browser, user.name, password=user.name)
expected_url = ujoin(public_url(app), f"/user/{user_special_chars.urlname}/")
expected_url = public_url(app, user)
await expect(browser).to_have_url(expected_url)
@@ -56,7 +58,7 @@ async def test_submit_login_form(app, browser, user_special_chars):
# will encode given parameters for an unauthenticated URL in the next url
# the next parameter will contain the app base URL (replaces BASE_URL in tests)
'spawn',
[('param', 'value')],
{'param': 'value'},
'/hub/login?next={{BASE_URL}}hub%2Fspawn%3Fparam%3Dvalue',
'/hub/login?next={{BASE_URL}}hub%2Fspawn%3Fparam%3Dvalue',
),
@@ -64,15 +66,15 @@ async def test_submit_login_form(app, browser, user_special_chars):
# login?param=fromlogin&next=encoded(/hub/spawn?param=value)
# will drop parameters given to the login page, passing only the next url
'login',
[('param', 'fromlogin'), ('next', '/hub/spawn?param=value')],
'/hub/login?param=fromlogin&next=%2Fhub%2Fspawn%3Fparam%3Dvalue',
'/hub/login?next=%2Fhub%2Fspawn%3Fparam%3Dvalue',
{'param': 'fromlogin', 'next': '/hub/spawn?param=value'},
'/hub/login?param=fromlogin&next={{BASE_URL}}hub%2Fspawn%3Fparam%3Dvalue',
'/hub/login?next={{BASE_URL}}hub%2Fspawn%3Fparam%3Dvalue',
),
(
# login?param=value&anotherparam=anothervalue
# will drop parameters given to the login page, and use an empty next url
'login',
[('param', 'value'), ('anotherparam', 'anothervalue')],
{'param': 'value', 'anotherparam': 'anothervalue'},
'/hub/login?param=value&anotherparam=anothervalue',
'/hub/login?next=',
),
@@ -80,7 +82,7 @@ async def test_submit_login_form(app, browser, user_special_chars):
# login
# simplest case, accessing the login URL, gives an empty next url
'login',
[],
{},
'/hub/login',
'/hub/login?next=',
),
@@ -98,6 +100,8 @@ async def test_open_url_login(
user = user_special_chars.user
login_url = url_path_join(public_host(app), app.hub.base_url, url)
await browser.goto(login_url)
if params.get("next"):
params["next"] = url_path_join(app.base_url, params["next"])
url_new = url_path_join(public_host(app), app.hub.base_url, url_concat(url, params))
print(url_new)
await browser.goto(url_new)
@@ -853,12 +857,15 @@ async def test_oauth_page(
oauth_client.allowed_scopes = sorted(roles.roles_to_scopes([service_role]))
app.db.commit()
# open the service url in the browser
service_url = url_path_join(public_url(app, service) + 'owhoami/?arg=x')
service_url = url_path_join(public_url(app, service), 'owhoami/?arg=x')
await browser.goto(service_url)
if app.subdomain_host:
expected_redirect_url = url_path_join(
app.base_url + f"services/{service.name}/oauth_callback"
public_url(app, service), "oauth_callback"
)
else:
expected_redirect_url = url_path_join(service.prefix, "oauth_callback")
expected_client_id = f"service-{service.name}"
# decode the URL
@@ -1236,3 +1243,225 @@ async def test_start_stop_server_on_admin_page(
await expect(browser.get_by_role("button", name="Spawn Page")).to_have_count(
len(users_list)
)
@pytest.mark.parametrize(
"case",
[
"fresh",
"invalid",
"valid-prefix-invalid-root",
],
)
async def test_login_xsrf_initial_cookies(app, browser, case, username):
"""Test that login works with various initial states for xsrf tokens
Page will be reloaded with correct values
"""
hub_root = public_host(app)
hub_url = url_path_join(public_host(app), app.hub.base_url)
login_url = url_path_join(
hub_url, url_concat("login", {"next": url_path_join(app.base_url, "/hub/home")})
)
# start with all cookies cleared
await browser.context.clear_cookies()
if case == "invalid":
await browser.context.add_cookies(
[{"name": "_xsrf", "value": "invalid-hub-prefix", "url": hub_url}]
)
elif case == "valid-prefix-invalid-root":
await browser.goto(login_url)
# first visit sets valid xsrf cookie
cookies = await browser.context.cookies()
assert len(cookies) == 1
# second visit is also made with invalid xsrf on `/`
# handling of this behavior is undefined in HTTP itself!
# _either_ the invalid cookie on / is ignored
# _or_ both will be cleared
# currently, this test assumes the observed behavior,
# which is that the invalid cookie on `/` has _higher_ priority
await browser.context.add_cookies(
[{"name": "_xsrf", "value": "invalid-root", "url": hub_root}]
)
cookies = await browser.context.cookies()
assert len(cookies) == 2
# after visiting page, cookies get re-established
await browser.goto(login_url)
cookies = await browser.context.cookies()
print(cookies)
cookie = cookies[0]
assert cookie['name'] == '_xsrf'
assert cookie["path"] == app.hub.base_url
# next page visit, cookies don't change
await browser.goto(login_url)
cookies_2 = await browser.context.cookies()
assert cookies == cookies_2
# login is successful
await login(browser, username, username)
def _cookie_dict(cookie_list):
"""Convert list of cookies to dict of the form
{ 'path': {'key': {cookie} } }
"""
cookie_dict = {}
for cookie in cookie_list:
path_cookies = cookie_dict.setdefault(cookie['path'], {})
path_cookies[cookie['name']] = cookie
return cookie_dict
async def test_singleuser_xsrf(app, browser, user, create_user_with_scopes, full_spawn):
# full login process, checking XSRF handling
# start two servers
target_user = user
target_start = asyncio.ensure_future(target_user.spawn())
browser_user = create_user_with_scopes("self", "access:servers")
# login browser_user
login_url = url_path_join(public_host(app), app.hub.base_url, "login")
await browser.goto(login_url)
await login(browser, browser_user.name, browser_user.name)
# end up at single-user
await expect(browser).to_have_url(re.compile(rf".*/user/{browser_user.name}/.*"))
# wait for target user to start, too
await target_start
await app.proxy.add_user(target_user)
# visit target user, sets credentials for second server
await browser.goto(public_url(app, target_user))
await expect(browser).to_have_url(re.compile(r".*/oauth2/authorize"))
auth_button = browser.locator('//input[@type="submit"]')
await expect(auth_button).to_be_enabled()
await auth_button.click()
await expect(browser).to_have_url(re.compile(rf".*/user/{target_user.name}/.*"))
# at this point, we are on a page served by target_user,
# logged in as browser_user
# basic check that xsrf isolation works
cookies = await browser.context.cookies()
cookie_dict = _cookie_dict(cookies)
pprint.pprint(cookie_dict)
# we should have xsrf tokens for both singleuser servers and the hub
target_prefix = target_user.prefix
user_prefix = browser_user.prefix
hub_prefix = app.hub.base_url
assert target_prefix in cookie_dict
assert user_prefix in cookie_dict
assert hub_prefix in cookie_dict
target_xsrf = cookie_dict[target_prefix].get("_xsrf", {}).get("value")
assert target_xsrf
user_xsrf = cookie_dict[user_prefix].get("_xsrf", {}).get("value")
assert user_xsrf
hub_xsrf = cookie_dict[hub_prefix].get("_xsrf", {}).get("value")
assert hub_xsrf
assert hub_xsrf != target_xsrf
assert hub_xsrf != user_xsrf
assert target_xsrf != user_xsrf
# we are on a page served by target_user
# check that we can't access
async def fetch_user_page(path, params=None):
url = url_path_join(public_url(app, browser_user), path)
if params:
url = url_concat(url, params)
status = await browser.evaluate(
"""
async (user_url) => {
try {
response = await fetch(user_url);
} catch (e) {
return 'error';
}
return response.status;
}
""",
url,
)
return status
if app.subdomain_host:
expected_status = 'error'
else:
expected_status = 403
status = await fetch_user_page("/api/contents")
assert status == expected_status
status = await fetch_user_page("/api/contents", params={"_xsrf": target_xsrf})
assert status == expected_status
if not app.subdomain_host:
expected_status = 200
status = await fetch_user_page("/api/contents", params={"_xsrf": user_xsrf})
assert status == expected_status
# check that we can't iframe the other user's page
async def iframe(src):
return await browser.evaluate(
"""
async (src) => {
const frame = document.createElement("iframe");
frame.src = src;
return new Promise((resolve, reject) => {
frame.addEventListener("load", (event) => {
if (frame.contentDocument) {
resolve("got document!");
} else {
resolve("blocked")
}
});
setTimeout(() => {
// some browsers (firefox) never fire the load event,
// despite the spec apparently stating it must always fire,
// even for rejected frames
resolve("timeout")
}, 3000)
document.body.appendChild(frame);
});
}
""",
src,
)
hub_iframe = await iframe(url_path_join(public_url(app), "hub/admin"))
assert hub_iframe in {"timeout", "blocked"}
user_iframe = await iframe(public_url(app, browser_user))
assert user_iframe in {"timeout", "blocked"}
# check that server page can still connect to its own kernels
token = target_user.new_api_token(scopes=["access:servers!user"])
url = url_path_join(public_url(app, target_user), "/api/kernels")
headers = {"Authorization": f"Bearer {token}"}
r = await async_requests.post(url, headers=headers)
r.raise_for_status()
kernel = r.json()
kernel_id = kernel["id"]
kernel_url = url_path_join(url, kernel_id)
kernel_ws_url = "ws" + url_path_join(kernel_url, "channels")[4:]
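# descriptive note (added): slicing off the leading "http" and prefixing "ws"
# turns "http://..." into "ws://..." and "https://..." into "wss://..."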
try:
result = await browser.evaluate(
"""
async (ws_url) => {
ws = new WebSocket(ws_url);
finished = await new Promise((resolve, reject) => {
ws.onerror = (err) => {
reject(err);
};
ws.onopen = () => {
resolve("ok");
};
});
return finished;
}
""",
kernel_ws_url,
)
finally:
r = await async_requests.delete(kernel_url, headers=headers)
r.raise_for_status()
assert result == "ok"

View File

@@ -44,8 +44,8 @@ from .. import metrics, orm, roles
from ..app import JupyterHub
from ..auth import PAMAuthenticator
from ..spawner import SimpleLocalProcessSpawner
from ..utils import random_port, utcnow
from .utils import async_requests, public_url, ssl_setup
from ..utils import random_port, url_path_join, utcnow
from .utils import AsyncSession, public_url, ssl_setup
def mock_authenticate(username, password, service, encoding):
@@ -372,29 +372,32 @@ class MockHub(JupyterHub):
async def login_user(self, name):
"""Login a user by name, returning her cookies."""
base_url = public_url(self)
external_ca = None
s = AsyncSession()
if self.internal_ssl:
external_ca = self.external_certs['files']['ca']
s.verify = self.external_certs['files']['ca']
login_url = base_url + 'hub/login'
r = await async_requests.get(login_url)
r = await s.get(login_url)
r.raise_for_status()
xsrf = r.cookies['_xsrf']
r = await async_requests.post(
r = await s.post(
url_concat(login_url, {"_xsrf": xsrf}),
cookies=r.cookies,
data={'username': name, 'password': name},
allow_redirects=False,
verify=external_ca,
)
r.raise_for_status()
r.cookies["_xsrf"] = xsrf
assert sorted(r.cookies.keys()) == [
# make second request to get updated xsrf cookie
r2 = await s.get(
url_path_join(base_url, "hub/home"),
allow_redirects=False,
)
assert r2.status_code == 200
assert sorted(s.cookies.keys()) == [
'_xsrf',
'jupyterhub-hub-login',
'jupyterhub-session-id',
]
return r.cookies
return s.cookies
class InstrumentedSpawner(MockSpawner):

View File

@@ -99,7 +99,7 @@ async def test_post_content_type(app, content_type, status):
assert r.status_code == status
@mark.parametrize("xsrf_in_url", [True, False])
@mark.parametrize("xsrf_in_url", [True, False, "invalid"])
@mark.parametrize(
"method, path",
[
@@ -110,6 +110,13 @@ async def test_post_content_type(app, content_type, status):
async def test_xsrf_check(app, username, method, path, xsrf_in_url):
cookies = await app.login_user(username)
xsrf = cookies['_xsrf']
if xsrf_in_url == "invalid":
cookies.pop("_xsrf")
# a well-formed old-style (tornado-native) xsrf token is no longer accepted
xsrf = cookies['_xsrf'] = (
"2|7329b149|d837ced983e8aac7468bc7a61ce3d51a|1708610065"
)
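# descriptive note (added): the literal above follows tornado's versioned
# token layout, "2|<mask hex>|<masked token hex>|<timestamp>"; it is
# well-formed for tornado's default xsrf check, but JupyterHub now issues
# its own xsrf tokens, so this one should be rejected (403 expected below)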
url = path.format(username=username)
if xsrf_in_url:
url = f"{url}?_xsrf={xsrf}"
@@ -120,7 +127,7 @@ async def test_xsrf_check(app, username, method, path, xsrf_in_url):
noauth=True,
cookies=cookies,
)
if xsrf_in_url:
if xsrf_in_url is True:
assert r.status_code == 200
else:
assert r.status_code == 403

View File

@@ -685,11 +685,10 @@ async def test_other_user_url(app, username, user, group, create_temp_role, has_
],
)
async def test_page_with_token(app, user, url, token_in):
cookies = await app.login_user(user.name)
token = user.new_api_token()
if token_in == "url":
url = url_concat(url, {"token": token})
headers = None
headers = {}
elif token_in == "header":
headers = {
"Authorization": f"token {token}",
@@ -734,14 +733,13 @@ async def test_login_strip(app, form_user, auth_user, form_password):
"""Test that login form strips space form usernames, but not passwords"""
form_data = {"username": form_user, "password": form_password}
expected_auth = {"username": auth_user, "password": form_password}
base_url = public_url(app)
called_with = []
async def mock_authenticate(handler, data):
called_with.append(data)
with mock.patch.object(app.authenticator, 'authenticate', mock_authenticate):
r = await async_requests.get(base_url + 'hub/login')
r = await get_page('login', app)
r.raise_for_status()
cookies = r.cookies
xsrf = cookies['_xsrf']
@@ -922,17 +920,19 @@ async def test_auto_login(app, request):
async def test_auto_login_logout(app):
name = 'burnham'
cookies = await app.login_user(name)
s = AsyncSession()
s.cookies = cookies
with mock.patch.dict(
app.tornado_settings, {'authenticator': Authenticator(auto_login=True)}
):
r = await async_requests.get(
r = await s.get(
public_host(app) + app.tornado_settings['logout_url'], cookies=cookies
)
r.raise_for_status()
logout_url = public_host(app) + app.tornado_settings['logout_url']
assert r.url == logout_url
assert r.cookies == {}
assert list(s.cookies.keys()) == ["_xsrf"]
# don't include logged-out user in page:
try:
idx = r.text.index(name)
@@ -946,19 +946,23 @@ async def test_auto_login_logout(app):
async def test_logout(app):
name = 'wash'
cookies = await app.login_user(name)
r = await async_requests.get(
public_host(app) + app.tornado_settings['logout_url'], cookies=cookies
s = AsyncSession()
s.cookies = cookies
r = await s.get(
public_host(app) + app.tornado_settings['logout_url'],
)
r.raise_for_status()
login_url = public_host(app) + app.tornado_settings['login_url']
assert r.url == login_url
assert r.cookies == {}
assert list(s.cookies.keys()) == ["_xsrf"]
@pytest.mark.parametrize('shutdown_on_logout', [True, False])
async def test_shutdown_on_logout(app, shutdown_on_logout):
name = 'shutitdown'
cookies = await app.login_user(name)
s = AsyncSession()
s.cookies = cookies
user = app.users[name]
# start the user's server
@@ -978,14 +982,14 @@ async def test_shutdown_on_logout(app, shutdown_on_logout):
with mock.patch.dict(
app.tornado_settings, {'shutdown_on_logout': shutdown_on_logout}
):
r = await async_requests.get(
r = await s.get(
public_host(app) + app.tornado_settings['logout_url'], cookies=cookies
)
r.raise_for_status()
login_url = public_host(app) + app.tornado_settings['login_url']
assert r.url == login_url
assert r.cookies == {}
assert list(s.cookies.keys()) == ["_xsrf"]
# wait for any pending state to resolve
for i in range(50):

View File

@@ -386,7 +386,7 @@ async def test_oauth_service_roles(
# token-authenticated request to HubOAuth
token = app.users[name].new_api_token()
# token in ?token parameter
r = await async_requests.get(url_concat(url, {'token': token}))
r = await async_requests.get(url_concat(url, {'token': token}), headers=s.headers)
r.raise_for_status()
reply = r.json()
assert reply['name'] == name
@@ -394,7 +394,9 @@ async def test_oauth_service_roles(
# verify that ?token= requests set a cookie
assert len(r.cookies) != 0
# ensure cookie works in future requests
r = await async_requests.get(url, cookies=r.cookies, allow_redirects=False)
r = await async_requests.get(
url, cookies=r.cookies, allow_redirects=False, headers=s.headers
)
r.raise_for_status()
assert r.url == url
reply = r.json()

View File

@@ -75,18 +75,20 @@ async def test_singleuser_auth(
spawner = user.spawners[server_name]
url = url_path_join(public_url(app, user), server_name)
s = AsyncSession()
# no cookies, redirects to login page
r = await async_requests.get(url)
r = await s.get(url)
r.raise_for_status()
assert '/hub/login' in r.url
# unauthenticated /api/ should 403, not redirect
api_url = url_path_join(url, "api/status")
r = await async_requests.get(api_url, allow_redirects=False)
r = await s.get(api_url, allow_redirects=False)
assert r.status_code == 403
# with cookies, login successful
r = await async_requests.get(url, cookies=cookies)
r = await s.get(url, cookies=cookies)
r.raise_for_status()
assert (
urlparse(r.url)
@@ -100,7 +102,7 @@ async def test_singleuser_auth(
assert r.status_code == 200
# logout
r = await async_requests.get(url_path_join(url, 'logout'), cookies=cookies)
r = await s.get(url_path_join(url, 'logout'))
assert len(r.cookies) == 0
# accessing another user's server hits the oauth confirmation page
@@ -149,6 +151,8 @@ async def test_singleuser_auth(
async def test_disable_user_config(request, app, tmp_path, full_spawn):
# login, start the server
cookies = await app.login_user('nandy')
s = AsyncSession()
s.cookies = cookies
user = app.users['nandy']
# stop spawner, if running:
if user.running:
@@ -180,10 +184,11 @@ async def test_disable_user_config(request, app, tmp_path, full_spawn):
url = public_url(app, user)
# with cookies, login successful
r = await async_requests.get(url, cookies=cookies)
r = await s.get(url)
r.raise_for_status()
assert r.url.endswith('/user/nandy/jupyterhub-test-info')
assert r.status_code == 200
info = r.json()
pprint(info)
assert info['disable_user_config']
@@ -385,20 +390,31 @@ async def test_nbclassic_control_panel(app, user, full_spawn):
@pytest.mark.skipif(
IS_JUPYVERSE, reason="jupyverse doesn't implement token authentication"
)
async def test_token_url_cookie(app, user, full_spawn):
@pytest.mark.parametrize("accept_token_in_url", ["1", "0", ""])
async def test_token_url_cookie(app, user, full_spawn, accept_token_in_url):
if accept_token_in_url:
user.spawner.environment["JUPYTERHUB_ALLOW_TOKEN_IN_URL"] = accept_token_in_url
should_accept = accept_token_in_url != "0"
await user.spawn()
await app.proxy.add_user(user)
token = user.new_api_token(scopes=["access:servers!user"])
url = url_path_join(public_url(app, user), user.spawner.default_url or "/tree/")
# first request: auth with token in URL
r = await async_requests.get(url + f"?token={token}", allow_redirects=False)
s = AsyncSession()
r = await s.get(url + f"?token={token}", allow_redirects=False)
print(r.url, r.status_code)
if not should_accept:
assert r.status_code == 302
return
assert r.status_code == 200
assert r.cookies
assert s.cookies
# second request, use cookies set by first response,
# no token in URL
r = await async_requests.get(url, cookies=r.cookies, allow_redirects=False)
r = await s.get(url, allow_redirects=False)
assert r.status_code == 200
await user.stop()
@@ -409,7 +425,8 @@ async def test_api_403_no_cookie(app, user, full_spawn):
await user.spawn()
await app.proxy.add_user(user)
url = url_path_join(public_url(app, user), "/api/contents/")
r = await async_requests.get(url, allow_redirects=False)
s = AsyncSession()
r = await s.get(url, allow_redirects=False)
# 403, not redirect
assert r.status_code == 403
# no state cookie set

View File

@@ -42,6 +42,13 @@ async_requests = _AsyncRequests()
class AsyncSession(requests.Session):
"""requests.Session object that runs in the background thread"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
# session requests are for cookie authentication
# and should look like regular page views,
# so set Sec-Fetch-Mode: navigate
self.headers.setdefault("Sec-Fetch-Mode", "navigate")
def request(self, *args, **kwargs):
return async_requests.executor.submit(super().request, *args, **kwargs)
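# Hedged usage sketch (illustrative, not part of this diff): AsyncSession is
# awaited like the module-level async_requests helpers, but carries cookies
# and the navigate header across calls, e.g.
#
#     s = AsyncSession()
#     r = await s.get(public_url(app) + "hub/login")
#     r = await s.post(login_url, data=form_data)  # reuses cookies from the jar
#
# the Sec-Fetch-Mode: navigate default matters because the Hub may treat
# non-navigation (cors) requests differently when deciding to set _xsrf cookies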
@@ -157,6 +164,7 @@ async def api_request(
else:
base_url = public_url(app, path='hub')
headers = kwargs.setdefault('headers', {})
headers.setdefault("Sec-Fetch-Mode", "cors")
if 'Authorization' not in headers and not noauth and 'cookies' not in kwargs:
# make a copy to avoid modifying arg in-place
kwargs['headers'] = h = {}
@@ -176,7 +184,7 @@ async def api_request(
kwargs['cert'] = (app.internal_ssl_cert, app.internal_ssl_key)
kwargs["verify"] = app.internal_ssl_ca
resp = await f(url, **kwargs)
assert "frame-ancestors 'self'" in resp.headers['Content-Security-Policy']
assert "frame-ancestors 'none'" in resp.headers['Content-Security-Policy']
assert (
ujoin(app.hub.base_url, "security/csp-report")
in resp.headers['Content-Security-Policy']
@@ -197,6 +205,9 @@ def get_page(path, app, hub=True, **kw):
else:
prefix = app.base_url
base_url = ujoin(public_host(app), prefix)
# Sec-Fetch-Mode=navigate to look like a regular page view
headers = kw.setdefault("headers", {})
headers.setdefault("Sec-Fetch-Mode", "navigate")
return async_requests.get(ujoin(base_url, path), **kw)

View File

@@ -426,6 +426,9 @@ class User:
_deprecated_db_session=self.db,
oauth_client_id=client_id,
cookie_options=self.settings.get('cookie_options', {}),
cookie_host_prefix_enabled=self.settings.get(
"cookie_host_prefix_enabled", False
),
trusted_alt_names=trusted_alt_names,
user_options=orm_spawner.user_options or {},
)

View File

@@ -8,6 +8,7 @@ import errno
import functools
import hashlib
import inspect
import os
import random
import re
import secrets
@@ -34,6 +35,21 @@ from tornado.httpclient import AsyncHTTPClient, HTTPError
from tornado.log import app_log
def _bool_env(key, default=False):
"""Cast an environment variable to bool
If unset or empty, return `default`.
`0` and `false` (case-insensitive) are False; all other values are True.
"""
value = os.environ.get(key, "")
if value == "":
return default
if value.lower() in {"0", "false"}:
return False
else:
return True
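# Hedged usage sketch (illustrative values only):
#
#     _bool_env("JUPYTERHUB_ALLOW_TOKEN_IN_URL")                # unset -> False (default)
#     _bool_env("JUPYTERHUB_ALLOW_TOKEN_IN_URL", default=True)  # unset -> True
#     # with JUPYTERHUB_ALLOW_TOKEN_IN_URL="0" or "false"       -> False
#     # with any other non-empty value, e.g. "1" or "yes"       -> True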
# Deprecated aliases: no longer needed now that we require 3.7
def asyncio_all_tasks(loop=None):
warnings.warn(