Getting Started
If you have questions about using or deploying your own Breeder Genomics Hub, create a GitHub issue with the “support” label!
Quick Start
Clone the repository:
git clone https://github.com/maize-genetics/breeder-genomics-hub
cd breeder-genomics-hub
Follow the ORCID API Tutorial to create an application via the Developer Tools submenu after clicking on your name in the top right of the page. This will allow you to utilize ORCID’s OAuth provider, enabling users to sign in with their ORCID iD.
Create an env file named prod.env containing the OAuth client ID and secret generated for your ORCID application.
Additionally, add the HUB_DOMAIN environment variable with the domain that you’ll be using to access the Breeder Genomics Hub. This is used by the reverse proxy Caddy to acquire a TLS certificate automatically via Let’s Encrypt. If you wish to force HTTP and not use a certificate, prefix this value with http:// (e.g. http://0.0.0.0:80).
OAUTH_CLIENT_ID=<APP-123ABC>
OAUTH_CLIENT_SECRET=<ORCID Secret>
HUB_DOMAIN=myhub.example.com
UID=1000
The UID value above is interpolated within the hub.yml Docker Compose config to utilize the Docker socket associated with your user. You can append this line easily by running:
echo "UID=$UID" >> prod.env
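Before starting the Hub, it can help to confirm that prod.env defines every variable listed above. The following is a minimal sketch and not part of the Breeder Genomics Hub itself; the names parse_env and missing_keys are hypothetical, and the parser assumes plain KEY=VALUE lines with no quoting or export prefixes.

```python
# Hypothetical helper: check that an env file defines the variables
# described in this guide. Assumes simple KEY=VALUE lines only.
REQUIRED_KEYS = ("OAUTH_CLIENT_ID", "OAUTH_CLIENT_SECRET", "HUB_DOMAIN", "UID")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

To check a real file, pass its contents, for instance missing_keys(open("prod.env").read()); an empty list means all four variables are set.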
Next, make sure you have the breeder-notebook Jupyter image. This is the environment launched for each user, so it must be present on the host; otherwise, starting a user’s server will time out. Get it via:
docker pull maizegenetics/breeder-notebook:latest
Then use hub.yml to start your Breeder Genomics Hub:
docker compose --env-file prod.env -f hub.yml up -d
Make sure to include the --env-file prod.env option so that the UID value is recognized by Docker Compose.
Customization and Configuration
Permanent Storage
The Breeder Genomics Hub uses DockerSpawner to start a container for each user. The files within a container are only available during the lifecycle of that container (i.e. they are deleted when it is stopped). To give users a way to store persistent data, we must configure DockerSpawner to mount a volume from the host into the spawned container. This volume persists on the host filesystem between container restarts, so users can save any data they don’t want to lose into the ~/work directory. Add the following to your jupyterhub_config.py:
notebook_dir = "/home/jovyan/work"
c.DockerSpawner.notebook_dir = notebook_dir
c.DockerSpawner.volumes = { "breeder-{username}": notebook_dir }
Choose Where Data Is Stored
The above config snippet will create a Docker volume with a default mount point. If you are on Linux, it will likely be stored at ~/.local/share/docker/volumes. Using bind mounts via an absolute path is currently broken (#453), so if an administrator wishes to store persistent data elsewhere, they will need to employ a symbolic link:
ln -s /tmp/hub_userdata /home/your_user/.local/share/docker/volumes
You can list all current volumes via docker volume ls, and view information about any of them using docker inspect. For example, for a volume named breeder-bob:
your_user@your_server:~$ docker inspect breeder-bob
[
    {
        "CreatedAt": "2023-01-01T00:00:00Z",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/home/your_user/.local/share/docker/volumes/breeder-bob/_data",
        "Name": "breeder-bob",
        "Options": null,
        "Scope": "local"
    }
]
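If you need the host path of a user's data in a script, the JSON printed by docker inspect can be parsed directly. The sketch below is illustrative only: volume_mountpoints is a hypothetical helper name, and it assumes the output is a JSON array of objects with the Name and Mountpoint fields shown above.

```python
import json


def volume_mountpoints(inspect_output):
    """Map each volume name to its host mount point.

    `inspect_output` is the JSON text printed by `docker inspect <volume>`,
    assumed to be an array of objects with "Name" and "Mountpoint" keys.
    """
    return {
        entry["Name"]: entry["Mountpoint"]
        for entry in json.loads(inspect_output)
    }


# Trimmed copy of the example output for the breeder-bob volume.
sample = """
[
    {
        "Driver": "local",
        "Mountpoint": "/home/your_user/.local/share/docker/volumes/breeder-bob/_data",
        "Name": "breeder-bob"
    }
]
"""
print(volume_mountpoints(sample)["breeder-bob"])
```

On a live server, the same text could be captured with subprocess.run(["docker", "inspect", "breeder-bob"], capture_output=True, text=True, check=True).stdout and passed to the function.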
For further context, see this GitHub comment.
User-installed Packages (Python, R, etc)
Please see the Installing Additional Software section.
User Authentication and Authorization
In JupyterHub, Authenticators are responsible for managing both authentication (verifying that a user is who they say they are) and authorization (verifying whether a given user is allowed to perform some action). By default, the Breeder Genomics Hub uses ORCID’s OAuth functionality to enable individuals to log in to a Hub with their existing ORCID iD, removing the need for them to create an account specific to the Hub. For more information on using ORCID for logins, see the About ORCID iD & OAuth subsection below.
The general topic of authentication and authorization has security implications, and is therefore outside the scope of this documentation. A good starting point for JupyterHub specifically is their Authentication and User Basics tutorial page.
For example, if you’d like to limit access to an instance of the Breeder Genomics Hub, add the following to your jupyterhub_config.py:
c.GenericOAuthenticator.allowed_users = { "0000-0002-9079-593X", "0000-0002-3100-371X" }
The above would limit access to Stephen Hawking and Ed Buckler.
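A mistyped ORCID iD in allowed_users would silently lock the intended user out, so it may be worth checking each entry before deploying. ORCID iDs end in an ISO 7064 MOD 11-2 check character, which the sketch below recomputes. This is an optional helper, not something the Hub or GenericOAuthenticator does for you, and the function names are hypothetical.

```python
import re

# Format: four hyphen-separated groups of four; the last character may be X.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_char(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if `orcid` is well formed and its check character matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_char(compact[:15]) == compact[15]


allowed = {"0000-0002-9079-593X", "0000-0002-3100-371X"}
invalid = sorted(o for o in allowed if not is_valid_orcid(o))
print(invalid)
```

A passing check only shows that an iD is well formed; it does not confirm that the iD belongs to the person you intend to admit.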
About ORCID iD & OAuth
The configuration for GenericOAuthenticator, as seen in the code, follows the procedure in the Setup for ORCID iD section of the GenericOAuthenticator docs.
There are a variety of additional config options available; consult the GenericOAuthenticator API Reference for more information.
For example, this config allows any authenticated ORCID iD holder to log in.
Redirect URI
By default, the redirect URI used by GenericOAuthenticator is based on the HUB_DOMAIN environment variable specified in the prod.env file.
If you wish to use a different redirect URI, provide a REDIRECT_URI value in your prod.env file:
REDIRECT_URI=https://thirdparty.com/hub/oauth_callback
If you use a custom redirect URI, ensure that it ends with the /oauth_callback endpoint; otherwise, authentication will succeed but you will then encounter a 404 error.
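The /oauth_callback requirement is easy to check before restarting the Hub. The following is a minimal sketch using only the Python standard library; uses_oauth_callback is a hypothetical name, and it assumes the URI must be absolute (scheme and host present) with a path that ends in exactly /oauth_callback.

```python
from urllib.parse import urlparse


def uses_oauth_callback(redirect_uri):
    """Return True if the URI is absolute and its path ends in /oauth_callback."""
    parsed = urlparse(redirect_uri)
    is_absolute = bool(parsed.scheme and parsed.netloc)
    return is_absolute and parsed.path.endswith("/oauth_callback")


print(uses_oauth_callback("https://thirdparty.com/hub/oauth_callback"))
print(uses_oauth_callback("https://thirdparty.com/hub/login"))
```

The first call reports True and the second False. The check says nothing about whether the URI is also registered with your ORCID application, which must be verified separately.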
Please refer to this ORCID FAQ for more information about how redirect URIs work.