fleshing out institutional faq

2025-10-16 14:33:00 +00:00 · 2019-11-12 17:02:51 -08:00
parent 42d9c31db7
commit 35ba762c9c
1 changed files with 97 additions and 7 deletions
--- a/docs/source/getting-started/institutional-faq.md
+++ b/docs/source/getting-started/institutional-faq.md
@@ -47,10 +47,13 @@ scalable infrastructure, large datasets, and high-performance computing.

 ## Who else uses JupyterHub?

-JupyterHub has been used at a variety of institutions in academia,
-industry, and governmental research labs. These include:
+JupyterHub is used at a variety of institutions in academia,
+industry, and governmental research labs. It is most commonly used by two kinds of groups:

-* <list of orgs>
+* Small teams (e.g., data science teams, research labs, or collaborative projects) to provide a
+  shared resource for interactive computing, collaboration, and analytics.
+* Large teams (e.g., a department, a large class, or a large group of remote users) to provide
+  access to organizational hardware, data, and analytics environments at scale.

 ## How does JupyterHub compare with hosted products, like Google Colaboratory, RStudio.cloud, or Anaconda Enterprise?

@@ -125,32 +128,119 @@ JupyterHub has no internal concept of "data", but is designed to be able to comm

 ## How do I manage users?

+JupyterHub offers a few options for managing your users. Upon setting up a JupyterHub, you can choose what
+kind of **authentication** you'd like to use. For example, you can have users sign up with an institutional
+email address, or choose a username / password when they first log in, or offload authentication onto
+another service such as an organization's OAuth provider.
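As a rough illustration of the "institutional email address" option above, the sketch below checks an address against an allow-list of domains. The helper name `is_allowed_email` and the domain `example.edu` are hypothetical and are not part of JupyterHub's API. In a real deployment, logic like this would live inside an Authenticator, which JupyterHub selects through the `c.JupyterHub.authenticator_class` setting.

```python
# Hypothetical helper: the names and the example domain are illustrative only,
# not part of JupyterHub's API.
ALLOWED_DOMAINS = {"example.edu"}


def is_allowed_email(email: str, allowed_domains=ALLOWED_DOMAINS) -> bool:
    """Return True if `email` has a non-empty local part and an allowed domain."""
    # rpartition splits on the last "@", so the domain is everything after it.
    local, sep, domain = email.strip().rpartition("@")
    return bool(local) and sep == "@" and domain.lower() in allowed_domains
```

A check like this only decides who may sign up. Verifying that the user actually controls the address is still the job of the authentication service.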

+The users of a JupyterHub are stored locally, and can be modified manually by an administrator of the JupyterHub.
+Moreover, the *active* users on a JupyterHub can be found on the administrator's page. This page
+gives you the ability to stop or restart kernels, inspect user filesystems, and even take over user
+sessions to assist them with debugging.

 ## How do I manage software environments?

+A key benefit of JupyterHub is the ability for an administrator to define the environment(s) that users
+have access to. There are many ways to do this, depending on what kind of infrastructure you're using for
+your JupyterHub.
+
+For example, **The Littlest JupyterHub** runs on a single VM. In this case, the administrator defines
+an environment by installing packages to a shared folder that exists on the path of all users. The
+**JupyterHub for Kubernetes** deployment uses Docker images to define environments. You can create your
+own list of Docker images that users can select from, and can also control things like the amount of
+RAM available to users, or the types of machines that their sessions will use in the cloud.
+
 ## How does JupyterHub manage computational resources?

+For interactive computing sessions, JupyterHub controls computational resources via a **spawner**.
+Spawners define how a new user session is created, and are customized for particular kinds of
+infrastructure. For example, the KubeSpawner knows how to control a Kubernetes deployment
+to create new pods when users log in.
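To make the spawner idea concrete, here is a toy sketch of the lifecycle a spawner manages. It is not the real base class: actual spawners subclass `jupyterhub.spawner.Spawner` and implement `start`, `stop`, and `poll`. The class `ToySpawner`, the username, and the URL below are assumptions made purely for illustration.

```python
import asyncio


class ToySpawner:
    """Toy stand-in that mirrors the start/stop/poll shape of a spawner."""

    def __init__(self, username: str):
        self.username = username
        self.running = False

    async def start(self) -> str:
        # A real spawner would launch a process, container, or pod here.
        self.running = True
        return f"http://127.0.0.1:8888/user/{self.username}/"  # illustrative URL

    async def poll(self):
        # Convention: None means "still running"; an integer is an exit status.
        return None if self.running else 0

    async def stop(self) -> None:
        # A real spawner would tear down the process, container, or pod here.
        self.running = False


async def demo() -> list:
    """Walk one user session through start, poll, stop, poll."""
    spawner = ToySpawner("ada")
    url = await spawner.start()
    status_while_running = await spawner.poll()
    await spawner.stop()
    status_after_stop = await spawner.poll()
    return [url, status_while_running, status_after_stop]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Infrastructure-specific spawners differ mainly in what happens inside `start` and `stop`. The Hub only needs those three operations to manage a session.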
+
+For more sophisticated computational resources (like distributed computing), JupyterHub can
+connect with other infrastructure tools (like Dask or Spark). This allows users to control
+scalable or high-performance resources from within their JupyterHub sessions. The logic of
+how those resources are controlled is taken care of by the non-JupyterHub application.
+
+
 ## Can JupyterHub be used with my high-performance computing resources?

+Yes - JupyterHub can provide access to many kinds of computing infrastructure.
+Especially when combined with other open-source schedulers such as Dask, you can manage fairly
+complex computing infrastructure from the interactive sessions of a JupyterHub. For example,
+see the [Dask HPC page](https://docs.dask.org/en/latest/setup/hpc.html).
+
 ## How much resources do user sessions take?

-## Can I customize the look and feel of a JupyterHub?
-* Branding notebook server / jupyter lab. Custom error pages / support and help pages
+This is highly configurable by the administrator. If you wish for your users to have simple
+data analytics environments for prototyping and light data exploring, you can restrict their
+memory and CPU based on the resources that you have available. If you'd like your JupyterHub
+to serve as a gateway to high-performance compute or data resources, you may increase the
+resources available on user machines, or connect them with computing infrastructure elsewhere.
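For a back-of-the-envelope feel for this trade-off, the sketch below estimates how many sessions fit on one machine if every user reaches their limits. Spawners expose per-user settings such as `mem_limit` and `cpu_limit`, though whether they are enforced depends on the spawner in use. The function name, the reserved-memory default, and the example numbers are assumptions, not JupyterHub behavior.

```python
def max_concurrent_users(node_mem_gb: float, node_cpus: float,
                         mem_limit_gb: float, cpu_limit: float,
                         reserved_mem_gb: float = 1.0) -> int:
    """Estimate how many sessions fit on one node at full per-user usage.

    reserved_mem_gb is memory held back for the operating system and the Hub
    itself (an assumed figure; measure your own deployment).
    """
    if mem_limit_gb <= 0 or cpu_limit <= 0:
        raise ValueError("per-user limits must be positive")
    usable_mem = max(node_mem_gb - reserved_mem_gb, 0)
    by_memory = int(usable_mem // mem_limit_gb)
    by_cpu = int(node_cpus // cpu_limit)
    # The scarcer resource determines capacity.
    return min(by_memory, by_cpu)
```

For example, a 64 GB, 16-core machine with 2 GB and half a core per user is memory-bound in this model. In practice, many users sit well below their limits, so real capacity is often higher than this worst-case estimate.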

+## Can I customize the look and feel of a JupyterHub?
+
+JupyterHub provides some customization of the graphics displayed to users. The most common
+modification is to add custom branding to the JupyterHub login page, loading pages, and
+various elements that persist across all pages (such as headers).

 # For Technical Leads

 ## Will JupyterHub “just work” with our team's interactive computing setup?

+Depending on the complexity of your setup, you'll have different experiences with "out of the box"
+distributions of JupyterHub. If all of the resources you need will fit on a single VM, then
+[The Littlest JupyterHub](https://tljh.jupyter.org) should get you up-and-running within
+a half day or so. More complex setups, such as scalable Kubernetes clusters or access
+to high-performance computing and data, will require more time and expertise with
+the technologies your JupyterHub will use (e.g., dev-ops knowledge with cloud computing).
+
+In general, the base JupyterHub deployment is not the bottleneck for setup; the bottleneck is connecting
+your JupyterHub with the various services and tools that you wish to provide to your users.
+
+
 ## How well does JupyterHub scale? What are JupyterHub's limitations?

-## Will our team have to re-write their code when they want to scale to high-performance compute?
+JupyterHub works well at both a small scale (e.g., a single VM or machine) and a
+large scale (e.g., a scalable Kubernetes cluster). It can be used for teams as small as 2, and
+for user bases as large as 10,000. The scalability of JupyterHub largely depends on the
+infrastructure on which it is deployed. JupyterHub has been designed to be lightweight and
+flexible, so you can tailor your JupyterHub deployment to your needs.
+

 ## Is JupyterHub resilient? What happens when a machine goes down?

+For JupyterHubs that are deployed in a containerized environment (e.g., Kubernetes), it is
+possible to configure the JupyterHub to be fairly resistant to failures in the system.
+For example, if JupyterHub fails, then user sessions will not be affected (though new
+users will not be able to log in). When a JupyterHub process is restarted, it should
+seamlessly connect with the user database and the system will return to normal.
+Again, the details of your JupyterHub deployment (e.g., whether it's deployed on a scalable cluster)
+will affect the resiliency of the deployment.
+
 ## What interfaces does JupyterHub support?

+Out of the box, JupyterHub supports a variety of popular data science interfaces for user sessions,
+such as JupyterLab, Jupyter Notebooks, and RStudio. Any interface that can be served
+via a web address can be served with a JupyterHub (with the right setup).
+
 ## Does JupyterHub make it easier for our team to collaborate?

-## Can I use JupyterHub with R/RStudio or other languages and environments?
+JupyterHub provides a standardized environment and access to shared resources for your teams.
+This greatly reduces the cost associated with sharing analyses and content with other team
+members, and makes it easier to collaborate and build off of one another's ideas. Combined with
+access to high-performance computing and data, JupyterHub provides a common resource to
+amplify your team's ability to prototype their analyses, scale them to larger data, and then
+share their results with one another.
+
+JupyterHub also provides a framework for sharing computational narratives between
+different levels of an organization. For example, data scientists can share Jupyter Notebooks
+rendered as [Voilà dashboards](https://voila.readthedocs.io/en/stable/) with those who are not
+familiar with programming, or create publicly available interactive analyses that allow others to
+interact with their work.
+
+## Can I use JupyterHub with R/RStudio or other languages and environments?
+
+Yes, Jupyter is a polyglot project, and there are over 40 community-provided kernels for a variety
+of languages (the most common being Python, Julia, and R). You can also use a JupyterHub to provide
+access to other interfaces, such as RStudio, that provide their own access to a language kernel.