Vera hardware

The Vera cluster is built on Intel Xeon Gold 6130 (code-named "Skylake") CPUs. The system consists of:

  • In total 245 nodes (7848 cores), with 28 TiB of RAM and 13 GPUs in total. More specifically:
    • 209 compute nodes with 32 cores and 92 GB of RAM
    • 18 compute nodes with 32 cores and 192 GB of RAM
    • 6 compute nodes with 32 cores and 384 GB of RAM
    • 2 compute nodes with 32 cores and 768 GB of RAM
    • 2 compute nodes with 32 cores, 384 GB of RAM and 2 NVIDIA Tesla V100 32 GB SXM2 GPUs each
    • 1 compute node with 40 cores (Intel Xeon Gold 6230), 384 GB of RAM, 4 NVIDIA Tesla V100 32 GB SXM2 GPUs and 13 TB of fast local NVMe storage
    • 5 compute nodes with 32 cores, 92 GB of RAM and 1 NVIDIA Tesla T4 GPU each
    • 2 login nodes with 32 cores, 192 GB of RAM and an NVIDIA P2000 for remote graphics
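
To run on one of the GPU nodes listed above, the GPU has to be requested from Slurm. A minimal jobscript sketch, with placeholder names (the project name is made up, "my_gpu_program" stands in for your own executable, and the gres type names "V100"/"T4" are assumptions that should be checked against the actual Vera configuration):

    #!/bin/bash
    #SBATCH -A C3SE2024-1-2        # hypothetical project name
    #SBATCH -n 8                   # number of tasks
    #SBATCH -t 02:00:00
    #SBATCH --gres=gpu:V100:1      # request 1 GPU (the type name "V100" is an assumption)

    ./my_gpu_program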

There are also 3 system servers used for accessing and managing the cluster.

There is a 25 Gigabit Ethernet network used for logins, a dedicated management network and an InfiniBand high-speed/low-latency network for parallel computations and filesystem access. The nodes are equipped with Mellanox ConnectX-3 FDR InfiniBand 56 Gbps HCAs.

The servers were built by Supermicro, the compute node hardware by Intel, and the system was delivered by Southpole.

Cores, threads and CPUs

One difference from previous systems at C3SE is that Hyper-Threading (HT for short) is enabled on the Vera nodes.

Each Vera node has 2 physical processors with 16 (physical) cores each, giving a total of 32 cores per node. With HT enabled (2 threads per core, for a total of 64 threads per node), the following must be taken into consideration:

  • If your code is heavily optimised for the Vera hardware, you will probably not benefit from HT and should only run 1 task per physical core. To do this, add "-c 2" (or "--cpus-per-task=2") to your jobscript or the command line (see the sketch after this list).
  • You will probably want to benchmark using "-n X", "-n X -c 2" and "-n 2X", where X is the number of MPI processes that will be launched.
  • mpirun automatically picks up the relevant information from Slurm, so you probably only need "mpirun ./my.exe" in your jobscript (i.e. no "-n" or "-np" flags).
  • Slurm will only allocate full physical cores, i.e. you can only get an even number of tasks/threads if you do not use "-c 2".
  • Specifying only "-n 1" will actually give you 2 tasks/threads to use (one full physical core).
  • In $TMPDIR you will find task files in MPICH and LAM format:
    • with all tasks: $TMPDIR/mpichnodes, $TMPDIR/lamnodes
    • with physical cores only: $TMPDIR/mpichnodes.no_HT, $TMPDIR/lamnodes.no_HT
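
As an example, a minimal jobscript sketch for an MPI code run with 1 task per physical core on a single Vera node (the project name is a placeholder, and "my.exe" stands in for your own executable):

    #!/bin/bash
    #SBATCH -A C3SE2024-1-2   # hypothetical project name
    #SBATCH -n 32             # 32 MPI tasks ...
    #SBATCH -c 2              # ... each with 2 threads, i.e. 1 physical core per task
    #SBATCH -t 01:00:00

    # mpirun picks up the task layout from Slurm, so no -n/-np flags are needed:
    mpirun ./my.exe

To benchmark with HT, the same job can be resubmitted with "-n 64" and without "-c 2" (64 tasks on the 64 threads of a node). If your MPI launcher does not integrate with Slurm, the task files above can be passed explicitly, e.g. something like "mpiexec -f $TMPDIR/mpichnodes.no_HT ./my.exe" for an MPICH-style launcher.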

For general information on running jobs, see the running jobs documentation.

Getting support

If you need some kind of support (trouble logging in, how to run your software, etc.), please try the following, in order:

  • Contact the PI of your project and see if they can help
  • Talk with your fellow students/colleagues
  • Contact C3SE support