Using software

To get your work done on the cluster, the software you require must be available there. Much common software is already installed, but anything more specialised has to be installed first. As a user you can install software in your home directory; if you need support with an installation, please send a support request.

To organize all the software installed on the cluster we use the Environment Module System. It allows you to list available software and dynamically load software modules in your command line session. The command used is simply called module, see man module for detailed usage information.

We offer a module system with a lot of pre-installed software. If your software isn't available, we provide several compilers so you can build it yourself. Many languages also have mechanisms for user-local installations, see Python and Perl below. If it is commonly used software, we can do a system-wide installation.
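
As an example of a user-local installation, a Python package can be installed under your home directory with pip (a minimal sketch, assuming the Python module provides pip; numpy is just an example package):

module load intel Python
pip install --user numpy    # installs under ~/.local by default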

Modules

The module system lets you specify which applications/software you want to use on the cluster. The modules package dynamically modifies your environment (PATH etc.), making the program available from the shell. Modules can be loaded and unloaded at any time.

The module command

The module command (or the ml shortcut) controls which software you have available in your session. For the complete manual, please use

man module

To see the available applications/software installed use:

module avail
ml avail

To see what modules are loaded at the moment use:

module list
ml

To load/unload a modulefile, making a certain application/compiler available or unavailable, use:

module load <modulename>
module add <modulename>
module unload <modulename>
module rm <modulename>
module purge    # Unloads all currently loaded modules
ml <modulename>
ml unload <modulename>
ml rm <modulename>
ml purge

To see what a module does, use

module show <modulename>
ml show <modulename>

This will display what environment variables the module sets or modifies.
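
A minimal way to confirm the effect, assuming a GCC module is available (as in the examples further down):

module show GCC      # inspect what loading GCC would change
module load GCC
which gcc            # should now point into the module's installation directory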

Toolchains

A toolchain is a set of modules that makes up the necessary components to build most software.

Most commercial (closed source) tools, e.g. Mathematica and MATLAB, do not require a toolchain. These can be loaded directly:

module load MATLAB

while most other tools require picking a toolchain first, e.g.:

module load intel
module load Python

We install most software under the intel toolchain, and secondarily foss. You can see all the available compilers and toolchains by calling

module avail

Looking for available modules

You can view the modules (including toolchains) that can be loaded immediately in your current environment by running:

module avail

This won't show most of the software. To see more, you first have to pick a toolchain, which adds many more modules, e.g.

module load intel
module avail

will show all modules compiled under the intel toolchain. Some software might only be available in some toolchains, including specific versions of a toolchain, e.g.

module load foss/2016b
module avail

Finding the right toolchain for a module

Searching for a specific version gives you additional information about the module, e.g:

module spider Python/2.7.12

    ------------------------------------------------------------------------------
      Python: Python/2.7.12
    ------------------------------------------------------------------------------
        Description:
          Python is a programming language that lets you work more quickly and
          integrate your systems more effectively. - Homepage: http://python.org/

         Other possible modules matches:
            Biopython

        This module can only be loaded through the following modules:

          GCC/5.4.0-2.26  OpenMPI/1.10.3
          icc/2016.3.210-GCC-5.4.0-2.26  impi/5.1.3.181
          ifort/2016.3.210-GCC-5.4.0-2.26  impi/5.1.3.181

        Help:
          Python is a programming language that lets you work more quickly and integrate your systems
           more effectively. - Homepage: http://python.org/

    ------------------------------------------------------------------------------
      To find other possible module matches do:
          module -r spider '.*Python/2.7.12.*'

Here, it tells us that we can do either

module load GCC/5.4.0-2.26  OpenMPI/1.10.3
module load Python/2.7.12

or

module load icc/2016.3.210-GCC-5.4.0-2.26  impi/5.1.3.181
module load Python/2.7.12

or, as GCC + OpenMPI are part of the foss/2016b, and icc/ifort + impi are part of the intel/2016b toolchain, we could also do

module load foss/2016b
module load Python/2.7.12

or

module load intel/2016b
module load Python/2.7.12
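
Either variant can be verified quickly after loading, e.g. using the foss route from above:

module load foss/2016b Python/2.7.12
python --version     # should report Python 2.7.12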

Modules on Glenn

The module system on Glenn is older, and no longer maintained. The available software can only be seen using

module avail

and some additional modules can be added using

module load extras

However, most software found here is relatively old, and will not be updated.

Singularity

Singularity is a container system for HPC that lets you define your own environment, making your work portable and reproducible on any HPC system that supports it. Singularity has extensive documentation and examples to help you install it and set up containers: http://singularity.lbl.gov/

Note: creating and modifying containers must be done outside the cluster, on your own Linux machine where you have root access.

C3SE specific additions

We currently expose the following paths into the container:

/c3se                (required)
/local               (required if you use $TMPDIR)
/apps                (required if you need to access software)
/usr/share/lmod/lmod (required if you need to access the module system)
/var/hasplm          (required if your software needs to access the HASP license manager)
/var/opt/thinlinc    (required if your application needs to run graphically on Thinlinc)

You must create the corresponding directories in your container, either by modifying the recipe or by creating the paths with mkdir in a writable image. An example recipe is located at /apps/Singularity/c3se_centos7.img
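
A minimal recipe sketch that creates these bind points in the %post step could look like this (the base image is just an example):

Bootstrap: docker
From: centos:7

%post
    mkdir -p /c3se /local /apps /usr/share/lmod/lmod /var/hasplm /var/opt/thinlinc

The image then has to be built on your own machine, e.g. with sudo singularity build my_image.simg my_recipe.def (assuming Singularity 2.4 or newer; the file names are hypothetical).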

GPU

To access the GPU, you can use the --nv option when running your container, e.g:

singularity exec --nv my_image.img  my_gpu_app

When running graphical applications that need 3D acceleration on the GUI machines, you need to combine this with VirtualGL:

singularity exec --nv my_image.img  vglrun my_gui_app

Using containers in jobs

Using the image in a job is straightforward and requires no special steps:

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 0:30:00
#SBATCH -A **your-project** -p hebbe

echo "Outside of singularity, host python version:"
python --version
singularity exec ~/ubuntu.img echo "This is from inside a singularity. Check python version:"
singularity exec ~/ubuntu.img python --version
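
Submit the script as usual (the file name is just an example):

sbatch singularity_job.sh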

Running Hebbe modules inside your container

If you need to import additional environment variables into your container, use the SINGULARITYENV_ prefix. This is particularly useful for PATH and LD_LIBRARY_PATH, which for technical reasons are cleared inside the container environment.

module load MATLAB
export SINGULARITYENV_PATH=$PATH
export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
singularity exec ~/ubuntu.simg matlab -nodesktop -r "disp('hello world');"

However, note that it is very easy to break other software inside your container by importing the host's PATH and LD_LIBRARY_PATH. In addition, any system library that the software depends on needs to be installed in your container; e.g. you cannot start MATLAB if X11 is not installed, which is typically skipped when setting up a small, lean Singularity image. Thus, if possible, strive to call modules from outside your container unless you have a special need, e.g.:

singularity exec ~/ubuntu.simg run_my_program simulation.inp
module load MATLAB
matlab < post_process_results.m

Compilers

Modern compilers and development tools are available through the module system. It is highly recommended to always load a toolchain, even if you are just using GCC, as the system compiler is very dated.

Intel compiler suite

The intel compiler toolchain includes:

  • icpc - C++ compiler
  • icc - C compiler
  • ifort - Fortran compiler
  • imkl - Intel Math Kernel Library (BLAS, LAPACK, FFT, etc.)
  • impi - Intel MPI
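
As a rough sketch, invoking these compilers directly might look like this (the source file names are hypothetical):

module load intel
icc -O2 -o my_c_prog my_c_prog.c              # serial C program
ifort -O2 -o my_f_prog my_f_prog.f90          # serial Fortran program
mpiicc -O2 -o my_mpi_prog my_mpi_prog.c       # MPI C program via the Intel MPI wrapper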

Exactly how to instruct a build system to use these compilers varies from software to software.

In addition some tools are also available:

  • VTune - Visual profiling tool
  • Advisor - Code optimization tool
  • Inspector - Memory and thread error detection tool

all of which you can find in the menu when logging in over remote graphics.

GCC

The foss compiler toolchain includes:

  • g++ - C++ compiler
  • gcc - C compiler
  • gfortran - Fortran compiler
  • OpenBLAS - Efficient open source BLAS and LAPACK library
  • OpenMPI
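
A corresponding sketch with the foss tools (again with hypothetical file names):

module load foss
gcc -O2 -o my_c_prog my_c_prog.c              # serial C program
gfortran -O2 -o my_f_prog my_f_prog.f90       # serial Fortran program
mpicc -O2 -o my_mpi_prog my_mpi_prog.c        # MPI C program via the OpenMPI wrapper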

Using a compiler with a build system

Common build systems (e.g. CMake) use cc, c++, or f95 for locating a compiler by default. These will point towards the very old system compilers.

With CMake, you will want to do

module load foss CMake
CC=gcc CXX=g++ FC=gfortran cmake path/to/src

or

module load intel CMake
CC=icc CXX=icpc FC=ifort cmake path/to/src

The variables CC, CXX and FC are standard names that work with other build systems as well (such as autoconf). However, some software relies on custom-made build tools, which makes things more difficult and may require custom solutions.
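
A sketch of the same approach with an autoconf-based package (the install prefix and package are hypothetical):

module load foss
CC=gcc CXX=g++ FC=gfortran ./configure --prefix=$SNIC_NOBACKUP/my_software
make
make install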

Allinea/ARM Forge

Allinea Forge (recently renamed ARM Forge) is a graphical tool for debugging and profiling parallel codes (MPI, OpenMP, CUDA). You can find this tool in the menu when logging in over remote graphics.

Additional libraries

We install many libraries which can greatly simplify building your own software. Loading their modules sets the CPATH and LIBRARY_PATH environment variables, which are usually picked up by popular build systems. However, many build systems fail to respect these conventions and may require some tweaking to build correctly.

Not every library is installed for every toolchain version. If you are missing a dependency for your software, you can request an installation, or install it locally in your $SNIC_NOBACKUP.
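
As a sketch of how such a library is used, assuming an HDF5 module exists for your toolchain (check module avail for what is actually installed):

module load foss HDF5
gcc -O2 -o my_prog my_prog.c -lhdf5   # headers and libraries are found via CPATH/LIBRARY_PATH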