Python

We install many versions of Python and all the common packages (many bundled with Python, many more as separate modules).

The software we build for the cluster is optimized for the hardware. Pre-compiled versions are often built only for generic hardware, which gives up a lot of performance and features. Prefer the packages from the module tree.

Avoid running pip install directly. It places user-wide packages in your ~/.local/, where they can use up most, if not all, of your disk quota. They are also not isolated: they leak into all other containers and environments and are likely to break compatibility. Please follow the examples below instead.
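If you have run pip install directly in the past, it can help to check what is already sitting in your per-user site-packages directory. The following is a minimal sketch using only the standard library; the helper name user_site_packages is made up for this illustration.

```python
import site
from pathlib import Path


def user_site_packages():
    """Return the sorted names of entries in the per-user site-packages directory.

    Hypothetical helper for illustration; returns an empty list if the
    directory does not exist.
    """
    user_site = Path(site.getusersitepackages())
    if not user_site.is_dir():
        return []
    return sorted(entry.name for entry in user_site.iterdir())


if __name__ == "__main__":
    # Directory under ~/.local where "pip install --user" style installs end up
    print("User site directory:", site.getusersitepackages())
    # Whether this interpreter currently adds that directory to sys.path
    print("User site enabled:", site.ENABLE_USER_SITE)
    for name in user_site_packages():
        print("  ", name)
```

An empty listing means nothing from ~/.local can shadow the packages provided by the modules or by your virtual environment.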

Virtual environments

The virtualenv command is included in the Python modules. Load your favourite version of Python (and everything else you need, e.g. SciPy-bundle) from the module system first. The first time, we create a new virtual environment (only done once), e.g.

module load SciPy-bundle/2021.05-foss-2021a matplotlib/3.4.2-foss-2021a h5py/3.2.1-foss-2021a
virtualenv --system-site-packages my_python

To use this environment, we must load the same modules and then activate it (every time you log in)

module load SciPy-bundle/2021.05-foss-2021a matplotlib/3.4.2-foss-2021a h5py/3.2.1-foss-2021a
source my_python/bin/activate

and then we can install Python packages into the environment

pip install --no-cache-dir --no-build-isolation some_module

The --no-cache-dir option prevents pip from reusing packages it built earlier for the same user in a different environment. The --no-build-isolation option makes sure that pip uses the modules loaded from the module system when building any Cython libraries.
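Before installing anything, it can be worth confirming that the environment is really active, so that pip does not fall back to ~/.local. A minimal sketch, assuming a standard CPython interpreter; the function name in_virtual_environment is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Prefix:", sys.prefix)
    print("Inside a virtual environment:", in_virtual_environment())
```

After source my_python/bin/activate, the interpreter path printed here should point into the my_python directory.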

For more information see 1

Accessing virtual environments in Jupyter Notebook

If you use a virtual environment with Jupyter Notebook, the kernel used in Jupyter might not pick up the correct site-packages. To resolve this, do the following after completing the steps above

module load SciPy-bundle/2021.05-foss-2021a matplotlib/3.4.2-foss-2021a h5py/3.2.1-foss-2021a
source my_python/bin/activate
python -m ipykernel install --user --name=my_python --display-name="My Python"

and then, when running your notebooks, you should be able to select this as your kernel.

Change the kernel under Kernel > Change kernel > My Python.
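If the kernel still picks up the wrong packages, you can check which interpreter the kernel spec points to. The sketch below assumes that the --user install wrote the spec to ~/.local/share/jupyter/kernels, which is the usual location on Linux but can differ on your system (jupyter kernelspec list shows the actual path). The function kernel_python is a hypothetical helper.

```python
import json
from pathlib import Path


def kernel_python(name="my_python", kernels_dir=None):
    """Return the interpreter path recorded in a kernel spec, or None if not found."""
    if kernels_dir is None:
        # Assumed default location for kernels installed with --user on Linux
        kernels_dir = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    spec_file = Path(kernels_dir) / name / "kernel.json"
    if not spec_file.is_file():
        return None
    spec = json.loads(spec_file.read_text())
    argv = spec.get("argv", [])
    # The first element of argv is the Python executable the kernel starts
    return argv[0] if argv else None


if __name__ == "__main__":
    print("Kernel interpreter:", kernel_python("my_python"))
```

The printed path should be the python inside my_python/bin. If it is not, repeat the ipykernel install step with the environment activated.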

Conda environments

The best way to create conda environments is with a container. There are example recipes at

/apps/containers/Conda/conda-example1.recipe
/apps/containers/Conda/conda-example2.recipe
/apps/containers/Conda/conda-example3.recipe

which after you've built them can be used as e.g.

apptainer exec conda-example.sif python my_script.py

This drastically reduces the installation size and the number of files (down to one), which makes it much better suited to network storage. Many conda packages might also not be compatible with the operating system we use on the cluster; using a container solves this issue as well.

Please see our container page for more details.

There are Anaconda3 modules available that can be used directly (though they might not be as optimized as the other software in the module tree). If you need to set up a conda environment, always use a container instead.

Note: The containers themselves serve as environments. You don't need to set up conda environments inside the containers (see the example recipes), as that only complicates using the container.

Matplotlib

Please note that when running Matplotlib, you might want to run

matplotlib.use('Agg')

after importing matplotlib (and ideally before importing matplotlib.pyplot), to stop Matplotlib from trying to use a backend that needs an X display. Such a backend will fail if you didn't log in with X forwarding, and X forwarding won't work in batch jobs.
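A minimal sketch of a batch-friendly plotting script follows. The function name save_example_plot, the data, and the output file name are made up for illustration; the important part is the order of the first three statements.

```python
import matplotlib

# Select the non-interactive Agg backend before pyplot is imported
matplotlib.use("Agg")

import matplotlib.pyplot as plt


def save_example_plot(path="example.png"):
    """Draw a small line plot and write it to a file without opening a window."""
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
    ax.set_xlabel("x")
    ax.set_ylabel("x squared")
    fig.savefig(path)
    plt.close(fig)  # release the figure's memory
    return path


if __name__ == "__main__":
    print("Wrote", save_example_plot())
```

Because Agg only renders to files, use fig.savefig rather than plt.show in scripts like this.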

NumPy and SciPy

It is impractical to provide each individual extension as its own module, so the optimized builds of numpy and scipy are part of the SciPy-bundle module. In general, you can use module keyword xxx to search for a particular extension that might be part of a module bundle.
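To see which copy of a package Python would actually import, for example after loading SciPy-bundle, you can ask the import system for its location. A minimal sketch using only the standard library; package_location is a hypothetical helper name.

```python
import importlib.util


def package_location(name):
    """Return the file a top-level package would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    for package in ("numpy", "scipy"):
        print(package, "->", package_location(package))
```

If the printed path points into ~/.local or into your virtual environment rather than into the module tree, a separately installed copy is shadowing the optimized build.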