Open OnDemand¶
We provide Open OnDemand powered web services for both Vera and Alvis at https://vera.c3se.chalmers.se and https://alvis.c3se.chalmers.se.
Open OnDemand is an HPC portal accessible from a web browser and comes with features such as file management, command-line shell access, job management and monitoring. Open OnDemand works in most browsers and is built for scientific workflows using a scheduler. It includes support for applications such as Jupyter, RStudio, MATLAB, and more.
Advantages of Open OnDemand¶
Traditionally, access to HPC resources has largely been terminal-based. Users have been required to install and operate a terminal emulator, access a remote system and carefully compose files for batch-job processing. If you were lucky, the work done for one HPC system could be reused on another, but not always. Open OnDemand simplifies these steps, in particular for users from non-traditional HPC fields with less experience in terminals and command-line operation. To access Open OnDemand you only need a browser, an account and the URL of the Open OnDemand portal. If you work mostly in Jupyter notebooks or in a desktop environment, the setup work is greatly reduced, saving you time for science.
Requirements¶
- Access to the cluster in question.
- You need to connect from a network on SUNET.
- A modern browser such as Mozilla Firefox, Google Chrome or Microsoft Edge. The latest version of each is expected to work best. Safari might work, but users have reported issues with some parts of Open OnDemand.
Note
Open OnDemand uses the Shell App to provide an interactive shell using the browser. In some versions of Google Chrome not all special characters can be typed into the shell.
Login¶
To access Open OnDemand open a web browser and type https://vera.c3se.chalmers.se or https://alvis.c3se.chalmers.se in the address bar. Follow the instructions on the page and you will be taken to https://supr.naiss.se for authentication. When that is done you will be taken back to the Open OnDemand portal page.
Important
Remember to log out. An active, logged-in session in Open OnDemand offers much the same access to the cluster as traditional remote shell access. Always make sure you log out when leaving your computer. Closing the browser tab is not enough, as most browsers keep active sessions for some time.
Interactive Apps¶
The Interactive Apps are the main use of the portal. They allow you to easily configure a job that launches an interactive application, such as a Desktop or a Jupyter notebook server.
The application is started as a Slurm job on the cluster. You will need to enter a project, the duration of the job, the node type to use (i.e. how many GPUs you need), a runtime and, optionally, a working directory. You can customize your own runtime by creating e.g. `~/portal/jupyter/my-env.sh` (you may first need to create the directory with `mkdir -p ~/portal/jupyter`). For examples of how runtimes are constructed, see the existing runtimes in `/apps/portal/*/`; a sketch of a custom runtime is shown below.
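As a hedged illustration only (the module names, versions and paths below are assumptions, so adapt from the provided runtimes under `/apps/portal/*/` rather than copying this verbatim), a custom Jupyter runtime could look something like this:

```bash
# ~/portal/jupyter/my-env.sh -- sketch of a custom Jupyter runtime.
# Module names/versions and the virtual environment path are placeholders;
# check the provided runtimes in /apps/portal/*/ for what is actually expected.

# Load a software stack that provides Python and JupyterLab.
module purge
module load Python/3.11.3-GCCcore-12.3.0 JupyterLab/4.0.5-GCCcore-12.3.0

# Optionally activate a personal virtual environment with your extra packages.
source ~/venvs/my-project/bin/activate
```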
Since the job is submitted to the Slurm queue, you will need to be patient until it starts. Once it starts, you can find and start the session from the My Interactive Sessions tab.
Jupyter¶
The Jupyter app will launch a Jupyter server. The premade runtimes are examples and we expect you to customize them in your `~/portal/jupyter/` directory.
When you are done with your session you can delete it under "My Interactive Sessions".
Desktop¶
There are two desktop apps, "Desktop (Compute)" and "Desktop (Login)". Both give you an interactive desktop session; the difference is whether it runs on a compute node, where you can do actual computations, or on a shared login node, where you need to refrain from heavy usage.
The interactive "Desktop (Compute)" is useful if you want to launch heavy computations through a graphical user interface.
Inside the desktop session you find applications in the upper left corner or by right-clicking on the desktop area. When you are done with your session you can log out in the upper right corner or delete the session under "My Interactive Sessions" in the Open OnDemand dashboard.
Running local LLMs¶
For those who need a locally running interactive LLM, we provide `chatbot_kernel`, which is capable of launching a selection of LLMs published on Hugging Face locally. You can install the package in your own virtual environment if you wish. For convenience, we also provide a container with `chatbot_kernel` installed; launch it by selecting the `Chatbot-x.x.x.sh` option in the runtime drop-down of the OnDemand Jupyter app. This launches a JupyterLab server as usual, but with an additional Chatbot kernel in the Launcher. Selecting the kernel allows you to launch LLMs locally. Typing `%model_list` shows the models we have downloaded. Some models may not be provided due to license issues. If you would like to use models you have downloaded yourself, execute `%hf_home <path/to/your/huggingface/dir>` to let the kernel search for models in your directory. Alternatively, you can customize your own launch script, as for any other Jupyter app runtime, and set a proper `HF_HOME` in it. You can see the example in `/apps/containers/Chatbot`.
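If you take the custom launch script route, a minimal sketch could look like the following; the paths and the activated environment are assumptions, so compare with the example in `/apps/containers/Chatbot` before relying on it:

```bash
# Sketch of a custom Jupyter runtime for the Chatbot kernel. Paths are placeholders.

# Point the kernel at your own Hugging Face cache with pre-downloaded models.
export HF_HOME=/mimer/NOBACKUP/groups/<your-project>/huggingface

# Make JupyterLab and the chatbot_kernel package available, e.g. from a
# personal virtual environment where you have installed them.
source ~/venvs/chatbot/bin/activate
```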
Note
We recommend using A40 or A100 nodes. By default, all LLMs in `chatbot_kernel` use `bfloat16`. To make them work on T4 or V100 nodes, you may need to set `%config dtype float16`.
Files - Manage your files¶
The Files app has been known to cause lockups, leading to 502 errors when trying to use anything in the portal until the server is rebooted. Light use is likely fine, but it should not be used for any significant file transfers.
The main use you will get out of this tab is checking your disk quota through the "Check my quota" option.
Active jobs¶
Click on the leftmost arrow to view job details. At the end of the job details you have the option to open the output location in the File Manager or start a terminal. Note that the terminal starts on the node from which you submitted the job; if the job is running on a compute node, the terminal will not open on that compute node.
Important
Please note that jobs that fail (exit with a non-zero return code) will still show as status completed. You need to click on the leftmost arrow to view the job details and see the actual job state as reported by Slurm.
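If you prefer to verify from a terminal, Slurm's accounting tool shows the real state and exit code directly, for example:

```bash
# Show the state and exit code of a finished job (replace 123456 with your job ID).
sacct -j 123456 --format=JobID,JobName,State,ExitCode
```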
My interactive session¶
This tab is where you can see details about your ongoing and previous sessions launched through the portal. You can also go here to cancel a job that you no longer need. Idle jobs that have not been cancelled still count towards your usage.
Clusters - Quickly launch a login shell¶
Here you can open a shell on a login node. Login nodes are a shared resource, so they are not meant for heavy use, but a shell is handy if you, for example, want to set up a custom runtime for one of your interactive apps, as in the sketch below.
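As a small example (the exact file names under `/apps/portal/` are assumptions, so list the directory first), setting up a custom Jupyter runtime from such a shell might look like:

```bash
# Create the directory for personal Jupyter runtimes and start from a provided example.
mkdir -p ~/portal/jupyter
ls /apps/portal/jupyter/            # see which example runtimes exist (path assumed)
cp /apps/portal/jupyter/<example>.sh ~/portal/jupyter/my-env.sh
```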