Pebbles Group Owner Documentation

This document is intended to be for the group owners in Pebbles.

This document aims to be as brief as possible and link to relevant parts of the more comprehensive documentation and relevant artefacts.

Terminology

group
is a rights management unit: the regular members of a group have access to the same set of blueprints
blueprints
are a set of instructions to provision a resource. A group owner or group admin can create blueprints and make them visible to their group.
group owner
is an individual who has the rights to create groups. each group must have an owner. In academic projects the group owner is typically the professor or the principal investigator. A group owner has accountability for the resources that a group uses. Group owners can also do everything a group manager does.
group manager
is an individual who can create and enable and disable blueprints to regular group members. They are expected to be teaching or research assistans, junior researchers etc.
blueprint templates
are created by the admins. A blueprint is created using a blueprint template.
admins
are people who maintain the Pebbles installation. The much more comprehensive admin documentation covers things admins should know about
drivers
are the integrations Pebbles has with different resource providing systems, such as OpenStack, OpenShift or our custom DockerDriver
ephemeral blueprints
some resources are ephemeral, i.e. all data on disk disappears when the resource expires
persistent blueprints
instances tagged with peristent are the opposite of ephemeral: the disk is persisted in one way or another for at least some time.

Typical End-User Use Case

  1. User logs in to Pebbles using SSO
  2. User is given a join code for a group
  3. User pastes/types the join code into Pebbles
  4. User locates a Blueprint (called a resource in the user UI because they don’t need to know what blueprints are)
  5. User starts a blueprint
  6. User uses the resources provided by the blueprint

Creating Blueprints

First a Pebbles admin has to create a blueprint template. How to do so is not covered here but you can contact your admins to make requests.

  1. Go to Blueprints tab
  2. Select the BlueprintTemplate which you wish to use and click Create Blueprint
  3. select the for which the blueprint will be made available
  4. Fill in other variables
  5. Click Create
  6. Activate the blueprint by clicking Activate

The configurable variables depend on what fields the admins have made available for editing. Environment variables are available to the process started inside the container so they can be used to configure the processes in case the software supports

Bootstrapping environments

Most of the time it is desirable to have some pre-downloaded data or code files present on the system. For a university course you might download the templates.

The builds/scripts directory of notebook-images contains the scripts run at the start of containers. The bootstrap/ directory contains examples of how to use the files. If you have multiple files and/or repositories then it is suggested that you download a shell script and execute it in the background. The script can be downloaded from anywhere as long as it is publicly available on the Internet.

Note:

  • if you have a persistent volume then if you blindly download things you may override any changes a user made on the disk. Write your shell scripts accordingly.
  • the peristent disk is shared by all Blueprints on the same back-end so you should have your data inside a subdirectory
  • most blueprints are containers which run as a pseudorandom UID for the

user - both Jupyter Notebooks and RStudio support running an HTML-based shell so

you can try out your commands before

The suggested workflow is:

  1. boot a clean blueprint with no startup script
  2. using the editing features in Jupyter notebooks / RStudio server edit a shell script file
  3. debug running the shell script in the clean blueprint
  4. when you are satisfied upload the file somewhere and set the system up to download and run it automatically

Environment variables

The containers built from notebook-images all understand common environment variables

AUTODOWNLOAD_URL
A URL to be downloaded as a file to the local system, e.g “http://example.org/example.sh
AUTODOWNLOAD_URL_FILENAME
A target name for the URL, e.g. “data.zip”

The Jupyter notebooks containers additionally have

AUTODOWNLOAD_EXEC
The name of a file to execute before starting Jupyter notebooks. If the script is essential for starting the notebook(s) then it should be run with AUTODOWNLOAD_EXEC.
AUTODOWNLOAD_EXEC_BG
The name of a file to execute in the background just before starting Jupyter notebooks. If your script e.g. loads data over a slow network connection and that data isn’t needed for the first 15 minutes, AUOTODOWNLOAD_EXEC_BG is a good alternative.
AUTODOWNLOAD_URL=https://example.com/~user/scripts/prepare.sh
AUTODOWNLOAD_EXEC=prepare.sh

Or a more complex example that forces the name of the downloaded artefact and runs it in the background.

AUTODOWNLOAD_URL=https://example.com/~user/scripts/nonsensical_name.sh
AUTODOWNLOAD_FILENAME=bootstrap_slow.sh
AUTODOWNLOAD_EXEC_BG=bootstrap_slow.sh

Git

Git wants to always know who the user is. Specifically it needs a username and an email address. The UID in containers is a pseudorandom number and not

available in the local /etc/passwd, which confuses git.

The workaround is to set

GIT_COMMITTER_NAME
A human-readable name.
GIT_COMMITTER_EMAIL
Something that looks like a valid email address.

Using HTTPS urls is preferred over using SSH for GitHub/GitLab. Note that if the user has two-factor authentication enabled in GitHub they will need a token to use the HTTPS interface.

Root Privileges

Most blueprints won’t run the bootup script with root privileges for security reasons. In some rare cases this cannot be avoided. In those cases the user should be able to use sudo to become root.

It is recommended to do all configurations and installation as the user whenever possible.