We install many versions of Python and all the common packages (many bundled with Python, many more as separate modules).
The software we build for the cluster is optimized for the hardware. Pre-compiled versions are often only built generically, which gives up a lot of performance and features. Prefer using the packages from the module tree.
Do not run pip install directly. It places user-wide packages directly into your
~/.local/, which not only uses up most if not all of your disk quota, but is also not isolated: the packages leak into all other containers and environments, likely breaking compatibility. Please follow the examples below instead.
The virtualenv command is included in the Python modules.
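To check whether earlier pip install runs have already left packages under ~/.local/, a sketch like the following may help. It only uses the standard library site module; the helper name user_site_entries is made up for this example and is not part of any cluster tooling.

```python
import os
import site


def user_site_entries():
    """Return the sorted entries of the per-user site-packages directory.

    An empty list means the directory is missing or empty.
    """
    user_site = site.getusersitepackages()
    if not os.path.isdir(user_site):
        return []
    return sorted(os.listdir(user_site))


if __name__ == "__main__":
    print("User site directory:", site.getusersitepackages())
    # ENABLE_USER_SITE is True when this interpreter adds the directory to sys.path.
    print("Enabled for this interpreter:", site.ENABLE_USER_SITE)
    for entry in user_site_entries():
        print("  ", entry)
```

If the list is not empty, those packages are visible to every Python of the same version that you start, which is the leakage described above.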
Load your favourite version of Python (and everything else you need, e.g.
SciPy-bundle) from the module system first.
The first time, we create a new virtual environment (only done once), e.g.
module load SciPy-bundle/2021.05-foss-2021a matplotlib/3.4.2-foss-2021a h5py/3.2.1-foss-2021a
virtualenv --system-site-packages my_python
To use this environment, we must load the modules and activate it (every time you log in):
module load SciPy-bundle/2021.05-foss-2021a matplotlib/3.4.2-foss-2021a h5py/3.2.1-foss-2021a
source my_python/bin/activate
We can then install packages locally:
pip install --no-cache-dir --no-build-isolation some_module
The --no-cache-dir option is required to prevent pip from reusing earlier
installations from the same user in a different environment. The
--no-build-isolation option makes sure that pip uses the loaded modules from the
module system when building any Cython libraries.
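After installing, it can be useful to confirm where a package is actually imported from: the module tree, the virtual environment, or a stray copy in ~/.local/. The following is a minimal sketch using the standard library importlib; the helper name module_origin is hypothetical, and some_module is the placeholder name from the pip command above.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None.

    None is returned when the module cannot be found on sys.path.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # numpy is expected to come from the loaded SciPy-bundle module,
    # some_module from the activated virtual environment.
    for name in ("numpy", "some_module"):
        print(name, "->", module_origin(name))
```

A path inside my_python/ indicates the virtual environment, while a path under the software installation tree indicates the module system.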
For more information, see [1].
Accessing virtual environments in Jupyter Notebook
If you use a virtual environment together with Jupyter Notebook, the kernel used in Jupyter might not pick up the correct site-packages. To resolve this, do the following after completing the steps above:
module load SciPy-bundle/2021.05-foss-2021a matplotlib/3.4.2-foss-2021a h5py/3.2.1-foss-2021a
source my_python/bin/activate
python -m ipykernel install --user --name=my_python --display-name="My Python"
When running your notebooks, you should then be able to select "My Python" as your kernel.
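To verify from inside a notebook that the selected kernel really runs in the virtual environment, a cell along these lines can be used. This is a sketch based only on the standard library sys module; in_virtualenv is a name chosen for this example.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Old virtualenv releases set sys.real_prefix; newer releases (and venv)
    # make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Prefix:", sys.prefix)
print("In a virtual environment:", in_virtualenv())
```

If the interpreter path does not point into my_python/bin/, the notebook is most likely still using a different kernel.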
The best way to create conda environments is with a container. There are example recipes at
/apps/containers/Conda/conda-example1.recipe
/apps/containers/Conda/conda-example2.recipe
/apps/containers/Conda/conda-example3.recipe
which after you've built them can be used as e.g.
apptainer exec conda-example.sif python my_script.py
This drastically reduces the installation size, and the number of files (down to 1), making it much better for network storage. Many conda packages might not be compatible with the operating system we use on the cluster. Using a container also solves this issue.
Please see our container page for more details.
Anaconda3 modules are also available, which can be used directly (but might not be as optimized as the other software in the module tree).
If you need to set up a conda environment, you should instead always use a container.
Note: The containers themselves serve as environments; you don't need to set up conda environments inside the containers (see example recipes), as that only complicates using the container.
Please note that when running Matplotlib, you might want to switch to a non-interactive backend (such as Agg)
after importing, to avoid Matplotlib trying to use the X Windows backend, which will fail if you didn't log in with X forwarding (which won't work in batch jobs).
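As an illustration of avoiding the X Windows backend, the sketch below selects a backend right after importing Matplotlib and before pyplot is loaded. It assumes the Agg backend, which writes image files without needing a display, and it treats an unset DISPLAY variable as a sign that no X forwarding is available; both are assumptions of this example. The helper names pick_backend and save_example_plot are hypothetical.

```python
import os


def pick_backend(environ):
    """Return "Agg" when no X display is configured, otherwise None."""
    return None if environ.get("DISPLAY") else "Agg"


def save_example_plot(path):
    """Draw a small line plot and write it to the given file path."""
    # Imported here so that pick_backend can be used without Matplotlib.
    import matplotlib

    backend = pick_backend(os.environ)
    if backend is not None:
        matplotlib.use(backend)  # switch before pyplot is imported

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    fig.savefig(path)
    plt.close(fig)
```

In a batch job, calling save_example_plot("example.png") would then write the figure to disk instead of trying to open a window.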
NumPy and SciPy
As it is impractical to have each individual extension as a module, you can find the optimized builds of numpy and scipy as part of the SciPy-bundle. In general, you can use
module keyword xxx to search for a particular extension that might be part of a module bundle.