Python

We install many versions of Python and all the common packages (many bundled with Python, many more as separate modules). Prefer using packages from the module tree when possible.

Important

Avoid using pip install without first creating a virtual environment. Without one, pip places user-wide packages directly into your ~/.local/ directory. These packages use up your disk quota and leak into all other containers and environments, likely breaking compatibility. Please only follow the examples below.

Virtual environments

To use virtualenv, we need to load its module first. At the same time, we can load other modules we want that are built with the same toolchain. We use the module tool to search for all the software we need. In this example, we search for matplotlib, numpy, h5py and jupyterlab and find a suitable set of modules within the same toolchain:

virtualenv/20.23.1-GCCcore-12.3.0
matplotlib/3.7.2-gfbf-2023a
SciPy-bundle/2023.07-gfbf-2023a
h5py/3.9.0-foss-2023a
JupyterLab/4.0.5-GCCcore-12.3.0

which we can load:

$ module load virtualenv/20.23.1-GCCcore-12.3.0 matplotlib/3.7.2-gfbf-2023a SciPy-bundle/2023.07-gfbf-2023a
$ module load h5py/3.9.0-foss-2023a JupyterLab/4.0.5-GCCcore-12.3.0

Note

For toolchains older than 2023a, virtualenv is included in the Python modules and you don't have to load an additional virtualenv module.
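Because the same set of modules has to be loaded every time the environment is used, a small check can catch a forgotten module early. The following is a minimal sketch, assuming the module system exports the colon-separated `LOADEDMODULES` environment variable (both Lmod and Environment Modules normally do); the helper name `missing_modules` is hypothetical.

```python
import os

# Module names taken from the example above.
REQUIRED = [
    "virtualenv/20.23.1-GCCcore-12.3.0",
    "matplotlib/3.7.2-gfbf-2023a",
    "SciPy-bundle/2023.07-gfbf-2023a",
    "h5py/3.9.0-foss-2023a",
    "JupyterLab/4.0.5-GCCcore-12.3.0",
]


def missing_modules(required, environ=None):
    """Return the entries of `required` that are not listed in LOADEDMODULES."""
    environ = os.environ if environ is None else environ
    # Assumption: LOADEDMODULES holds "name/version" entries separated by ":".
    loaded = set(filter(None, environ.get("LOADEDMODULES", "").split(":")))
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("all modules loaded" if not missing else f"missing: {missing}")
```

Running this at the top of a job script or notebook reports which of the expected modules are absent from the current session.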

Once we have loaded all the modules we need, we create a new virtual environment (only done once), e.g.

$ virtualenv --system-site-packages my_venv  # pick a name for your environment

You will see a directory with that name appear in your current directory. To use this environment, we must load the same modules that were loaded when the virtual environment was created and then activate it (every time):

$ module load virtualenv/20.23.1-GCCcore-12.3.0 matplotlib/3.7.2-gfbf-2023a SciPy-bundle/2023.07-gfbf-2023a h5py/3.9.0-foss-2023a JupyterLab/4.0.5-GCCcore-12.3.0
$ source my_venv/bin/activate
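To confirm that an environment was created with `--system-site-packages`, and can therefore see the packages provided by the loaded modules, one option is to inspect the `pyvenv.cfg` file that virtualenv writes into the environment directory. This is a sketch only; the helper names are hypothetical and `my_venv` is the example name used above.

```python
from pathlib import Path


def read_pyvenv_cfg(venv_dir):
    """Parse <venv_dir>/pyvenv.cfg into a dict of lower-cased keys to string values."""
    settings = {}
    for line in Path(venv_dir, "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not "key = value"
            settings[key.strip().lower()] = value.strip()
    return settings


def sees_system_packages(venv_dir):
    """True if pyvenv.cfg records include-system-site-packages = true."""
    value = read_pyvenv_cfg(venv_dir).get("include-system-site-packages", "false")
    return value.lower() == "true"


if __name__ == "__main__" and Path("my_venv", "pyvenv.cfg").exists():
    print("system site-packages visible:", sees_system_packages("my_venv"))
```

If this reports `False`, the environment was created without the flag and will not pick up packages from the module tree.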

Once the virtual environment is activated, you should see its name appear at the front of your console prompt. In this example, it will be

(my_venv) ... $

and you can then use pip to install additional packages into the activated environment:

$ module load virtualenv/20.23.1-GCCcore-12.3.0 matplotlib/3.7.2-gfbf-2023a SciPy-bundle/2023.07-gfbf-2023a h5py/3.9.0-foss-2023a JupyterLab/4.0.5-GCCcore-12.3.0
$ source my_venv/bin/activate
(my_venv) ... $ pip install <some_package>
(my_venv) ... $ pip install <some_other_package>

If you find that your packages are installed into your ~/.local directory, you most likely forgot to activate the virtual environment.
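A quick way to tell whether the running interpreter belongs to a virtual environment is to compare `sys.prefix` with `sys.base_prefix`; the two differ inside an environment. The sketch below also prints the user site directory, which is where stray user installations under ~/.local would end up. The function name `in_virtualenv` is hypothetical.

```python
import site
import sys


def in_virtualenv():
    """True if the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
# Directory that pip would use for user installations (normally under ~/.local).
print("user site directory:", site.getusersitepackages())
```

Running this with the `python` you intend to use (for example right before `pip install`) shows whether the environment is really active.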

Note that there is a global pip configuration active on Vera and Alvis (see pip config -v list) that sets the following default options for pip install:

  • --no-user is very important: it prevents accidental user installations, which in turn would break a lot of things.
  • --no-cache-dir prevents pip from reusing earlier installations made by the same user in a different environment.
  • --no-build-isolation makes sure that pip uses the loaded modules from the module system when building any Cython libraries.

For more information see this external guide

Accessing virtual environments in Jupyter Notebook

If you're using virtual environments together with Jupyter Notebook, you might find that the kernel used in Jupyter doesn't pick up the correct site-packages. To resolve this, do the following after completing the steps above:

$ module load virtualenv/20.23.1-GCCcore-12.3.0 matplotlib/3.7.2-gfbf-2023a SciPy-bundle/2023.07-gfbf-2023a h5py/3.9.0-foss-2023a
$ module load JupyterLab/4.0.5-GCCcore-12.3.0
$ source my_venv/bin/activate
$ python -m ipykernel install --user --name=my_venv --display-name="My Python"

and then, when running your notebooks, you should be able to select this as your kernel.

In notebooks, the kernel is changed under Kernel > Change kernel > My Python. In JupyterLab, change the kernel by clicking on the current kernel.
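To check which interpreter a registered kernel will start, the kernel specification file can be read directly. The sketch below assumes the default Linux location for `--user` kernel specs, `~/.local/share/jupyter/kernels/<name>/kernel.json` (it differs if `JUPYTER_DATA_DIR` or `XDG_DATA_HOME` is set); the helper name is hypothetical.

```python
import json
from pathlib import Path

# Assumed default location for the kernel registered above with --name=my_venv.
DEFAULT_SPEC = Path.home() / ".local/share/jupyter/kernels/my_venv/kernel.json"


def kernel_interpreter(spec_path=DEFAULT_SPEC):
    """Return (display_name, python_executable) recorded in a kernel.json file."""
    spec = json.loads(Path(spec_path).read_text())
    # The first element of "argv" is the program the kernel is launched with.
    return spec["display_name"], spec["argv"][0]


if DEFAULT_SPEC.exists():
    name, python = kernel_interpreter()
    print(f"{name!r} starts {python}")
```

The printed path should point into `my_venv/bin/`; if it points elsewhere, the kernel was registered from a different interpreter.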

  • Note: for using your virtual environment on the OnDemand portal, please refer to OpenOndemand portals.

Conda environments

Anaconda licensing

Anaconda and its default channel have license requirements. We recommend that you use miniforge with the conda-forge channel instead.

We strongly recommend creating conda environments inside containers. There are example recipes at

/apps/containers/Conda/conda-example1.def
/apps/containers/Conda/conda-example2.def
/apps/containers/Conda/conda-example3.def

Once you have built them, they can be used as, e.g.,

apptainer exec conda-example.sif python my_script.py

This drastically reduces the installation size and the number of files (down to 1), making it much better for network storage. Please see our container page for more details.

There are Miniforge3 modules available, which can be directly used (but might not be as optimized as the other software in the module tree). If you need to set up a conda environment, you should instead always use a container.

Note

The containers themselves serve as environments. You don't need to set up conda environments inside the containers (see the example recipes), as doing so only complicates using the container. If you install from an environment.yml, prefer installing into "base".

Matplotlib

Using matplotlib in a Python script

Please note that when using matplotlib in a Python script, you might want to run

import matplotlib
matplotlib.use('Agg')

to avoid matplotlib using the X Windows backend, which does not work if X forwarding is not enabled (i.e. does not work in batch jobs).
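As an illustration, the sketch below selects Agg only when no `DISPLAY` variable is set and then writes a plot to a file instead of opening a window. The helper names `pick_backend` and `save_sine_plot` are hypothetical, and testing `DISPLAY` is only a heuristic: a set variable does not guarantee that X forwarding works, so in batch jobs it is simplest to always use Agg as shown above.

```python
import math
import os


def pick_backend(environ=None):
    """Return "Agg" when no X display is advertised, otherwise None (keep default)."""
    environ = os.environ if environ is None else environ
    return None if environ.get("DISPLAY") else "Agg"


def save_sine_plot(path="sine.png"):
    """Plot sin(x) and save it to `path` without needing a display."""
    import matplotlib

    backend = pick_backend()
    if backend is not None:
        matplotlib.use(backend)  # select the backend before importing pyplot
    import matplotlib.pyplot as plt

    xs = [i * 0.1 for i in range(100)]
    ys = [math.sin(x) for x in xs]
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    ax.set_xlabel("x")
    ax.set_ylabel("sin(x)")
    fig.savefig(path)
    plt.close(fig)  # free the figure's memory
    return path
```

Calling `save_sine_plot("result.png")` in a batch job then produces the image file in the working directory.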

Using matplotlib in a Jupyter notebook

To use matplotlib in a notebook, you can enable the inline backend with

%matplotlib inline

at the top of your notebook. [Reference]

NumPy and SciPy

As it is impractical to have each individual extension as a separate module, you can find the optimized builds of numpy and scipy as part of the SciPy-bundle. In general, you can use module keyword xxx to search for a particular extension that might be part of a module bundle.
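After loading a bundle, the versions it provides can be listed from Python with the standard library's `importlib.metadata`. This is a minimal sketch; the helper name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if not found."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["numpy", "scipy"]).items():
    print(f"{name}: {version or 'not found'}")
```

A `not found` entry suggests that the module providing the package is not loaded in the current session.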