Using Jupyter notebooks

We support running your own Jupyter Notebook server either on the login nodes, for post-processing, or on the compute nodes, for compute-intensive workloads.

1. Customization

The following customization is now the default on all our clusters. The configuration is present on all nodes under /usr/local/etc/jupyter/ and is also bind-mounted into containers by default. You may customize this script if you wish, but most users can skip this step.

To make the Jupyter Notebook server behave well on our systems, it is configured via jupyter_notebook_config.py with the following content:

import socket, random

def get_available_port(port_retries=10, ip='127.0.0.1'):
    """Return a random free port in 8888-8987; exit after port_retries failed attempts."""
    for port in (random.randrange(8888, 8988) for _ in range(port_retries)):
        try:
            # Bind briefly to test the port; the socket is closed when the block ends.
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.bind((ip, port))
        except OSError:
            print('Port %i not available, trying another port.' % port)
        else:
            return port
    print('ERROR: No available port could be found.')
    raise SystemExit(1)

port, hostname = get_available_port(), socket.gethostname()

c.NotebookApp.ip = hostname
c.NotebookApp.port = port
c.NotebookApp.base_url = '/{0}/'.format(hostname)
c.NotebookApp.custom_display_url = 'https://proxy.c3se.chalmers.se:{0}/{1}/'.format(port, hostname)
c.NotebookApp.allow_origin = '*'
c.NotebookApp.port_retries = 0
c.NotebookApp.open_browser = False

Place this file either in ~/.jupyter/ in your home directory or in the directory from which you will launch Jupyter.
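To see what the base_url and custom_display_url settings produce, the following sketch repeats the string formatting from the configuration for a made-up node name and port. The helper display_urls and the values alvis1 and 8890 are illustrative assumptions, not part of Jupyter or of our setup.

```python
def display_urls(hostname, port):
    """Mirror the base_url / custom_display_url formatting used in the config."""
    base_url = '/{0}/'.format(hostname)
    display_url = 'https://proxy.c3se.chalmers.se:{0}/{1}/'.format(port, hostname)
    return base_url, display_url

# Hypothetical hostname and port, for illustration only.
base_url, display_url = display_urls('alvis1', 8890)
print(base_url)
print(display_url)
```

The port in the proxy URL is the one picked by get_available_port, and the path component is the hostname of the node where the server runs.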

2. Environment setup

Apart from loading the software modules that you want to use in your notebook, you also need to load the IPython module, which contains the Jupyter Notebook server.

module load IPython
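A quick way to confirm that the module was loaded is to check whether a jupyter executable is on your PATH. The sketch below does this with the Python standard library only; the helper name jupyter_available is an assumption made for this example.

```python
import shutil

def jupyter_available():
    """Return True if a `jupyter` executable is found on PATH."""
    return shutil.which('jupyter') is not None

if jupyter_available():
    print('jupyter found at', shutil.which('jupyter'))
else:
    print('jupyter not found; try "module load IPython" first')
```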

3. Running the notebook server

Simple notebooks that do not contain computationally intensive work may be tested on the login nodes by running

[hugin@alvis ~]$ jupyter notebook

If step 1 was done correctly, this prints the following message:

    To access the notebook, open this file in a browser:
        file:///cephyr/users/yourCID/machine/.local/share/jupyter/runtime/nbserver-xxx-open.html
    Or copy and paste one of these URLs:
        https://proxy.c3se.chalmers.se:8888/xxx/?token=xxx

Copy and paste the last URL into a web browser to connect to the notebook server running on the login node.
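If you are unsure which part of the URL is which, the sketch below splits the example URL from the message above into its pieces using the Python standard library. The helper parse_notebook_url is a hypothetical name, and the xxx placeholders are kept exactly as they appear in the message.

```python
from urllib.parse import urlsplit, parse_qs

def parse_notebook_url(url):
    """Split a notebook URL into proxy host, port, base path and token."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; take the first token if present.
    token = parse_qs(parts.query).get('token', [None])[0]
    return {
        'host': parts.hostname,
        'port': parts.port,
        'base_url': parts.path,
        'token': token,
    }

info = parse_notebook_url('https://proxy.c3se.chalmers.se:8888/xxx/?token=xxx')
print(info)
```

The port and base path correspond to c.NotebookApp.port and c.NotebookApp.base_url in the configuration from step 1.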

For computationally intensive workflows, you should use the back-end nodes by submitting an interactive job:

srun -A yourAccount -p targetPartition -t 00:30:00 --pty jupyter notebook

In the command above, set the -A and -p values as you normally do in your regular job submission scripts. On the new system Alvis, you must also specify the number of GPUs, for example by adding --gpus-per-node=1 to the command. Note that the waiting time for an interactive job depends on the workload of the machine, so at peak times you may have to wait a long time for an interactive session.
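As an illustration of how the pieces of the submission command fit together, the sketch below assembles it from its parts and adds the GPU flag only when requested. The function srun_jupyter_command and its parameter names are assumptions for this example; the flags themselves are the ones shown above.

```python
import shlex

def srun_jupyter_command(account, partition, time_limit='00:30:00', gpus_per_node=None):
    """Build the srun command line for an interactive notebook job."""
    cmd = ['srun', '-A', account, '-p', partition, '-t', time_limit]
    if gpus_per_node is not None:
        # Required on Alvis; omitted when no GPU count is given.
        cmd.append('--gpus-per-node={0}'.format(gpus_per_node))
    cmd += ['--pty', 'jupyter', 'notebook']
    return cmd

# Placeholder account and partition names, as in the text above.
cmd = srun_jupyter_command('yourAccount', 'targetPartition', gpus_per_node=1)
command_line = ' '.join(shlex.quote(arg) for arg in cmd)
print(command_line)
```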

Jupyter notebooks and TensorFlow

Note that if you want to use an MPI-based program such as TensorFlow in your notebook, launching the notebook directly through srun as described above will fail at runtime. This is because the program must be properly linked to the process management interface (PMI), much like when running an MPI application from within a container image (see https://github.com/c3se/containers/blob/master/README.md#running-singularity-with-mpi-across-multiple-nodes). In those cases, the correct way to start a Jupyter Notebook server is to have mpirun launch it:

mpirun -n 1 jupyter notebook

Here the -n value is the number of cores your notebook needs to run the job. For anything above -n 1, the use of the login nodes is not permitted, and you should launch the notebook through an interactive session:

  • first ask for an interactive shell in your submission command: srun -A yourAccount -p targetPartition -t 00:30:00 --pty bash

  • then launch the Jupyter Notebook server using mpirun -n XX jupyter notebook
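The rule above can be summarized in a small sketch that returns the commands to type, in order, for a given core count. The function launch_commands is hypothetical and only formats strings; it does not submit anything, and the account and partition names are the same placeholders used earlier.

```python
def launch_commands(n_cores, account='yourAccount', partition='targetPartition'):
    """Return the commands to type, in order, for an MPI-enabled notebook."""
    if n_cores < 1:
        raise ValueError('n_cores must be at least 1')
    mpirun = 'mpirun -n {0} jupyter notebook'.format(n_cores)
    if n_cores == 1:
        # A single core does not require an interactive session first.
        return [mpirun]
    # More than one core: request an interactive shell first, then start the server.
    interactive = 'srun -A {0} -p {1} -t 00:30:00 --pty bash'.format(account, partition)
    return [interactive, mpirun]

for line in launch_commands(4):
    print(line)
```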