ᛆᛚᚡᛁᛋ (Alvis)

The Alvis cluster is a national SNIC resource dedicated to Artificial Intelligence and Machine Learning research. The system is built around Graphics Processing Unit (GPU) accelerator cards and consists of several types of compute nodes with multiple NVIDIA GPUs. The system is divided into phases, where Phase I is going into production in the summer of 2020. Project applications will open in mid-August 2020 in SUPR; see Getting Access.

For more information on using Alvis, see the documentation on this site, in particular the parts on Machine Learning, Data sets, Containers, and HPC and AI software.

Alvis is also available from an Open OnDemand web portal at https://portal.c3se.chalmers.se. For more information see the Alvis OnDemand documentation.

Etymology: Alvis is an old Nordic name meaning "all-wise", written as ᛆᛚᚡᛁᛋ in medieval Viking runes.

Queue

The current availability of resources in the queue, as seen from the login node, is shown below (link):

Queue information is only accessible from within SUNET networks (use of VPN is necessary if you are outside).

Hardware

Login node alvis1.c3se.chalmers.se

  • 4 x NVIDIA Tesla T4 GPU with 16GB RAM
  • 2 x 16 core Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz (total 32 cores)
  • 768GB DDR4 RAM

Phase Ia

12 high-performance GPU compute nodes alvis1-01 to alvis1-12 with the node configuration

  • 2 x NVIDIA Tesla V100 SXM2 GPU with 32GB RAM, connected by nvlink
  • 2 x 8 core Intel(R) Xeon(R) Gold 6244 CPU @ 3.60GHz (total 16 cores)
  • 768GB DDR4 RAM
  • 387GB SSD scratch disk

5 high-performance GPU compute nodes alvis1-13 to alvis1-17 with the node configuration

  • 4 x NVIDIA Tesla V100 SXM2 GPU with 32GB RAM, connected by nvlink
  • 2 x 16 core Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz (total 32 cores)
  • 768GB DDR4 RAM
  • 387GB SSD scratch disk

Phase Ib

20 capacity GPU compute nodes alvis2-01 to alvis2-20 with the node configuration

  • 8 x NVIDIA Tesla T4 GPU with 16GB RAM
  • 2 x 16 core Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz (total 32 cores)
  • 576GB DDR4 RAM (1 node with 1536GB)
  • 387GB SSD scratch disk

Phase Ic

1 high-performance GPU compute node alvis2-21 with the node configuration

  • 4 x NVIDIA Tesla A100 GPU with 40GB RAM
  • 2 x 16 core Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz (total 32 cores)
  • 768GB DDR4 RAM
  • 3.4TB SSD scratch disk

Phase II

Phase II will be available from late fall 2021 and consists of:

Data transfer node

  • 2 x 32 core Intel Xeon Gold 6338 CPU @ 2GHz
  • 256GiB RAM

85 nodes optimised for inference and smaller training jobs

  • 4 x NVIDIA Tesla A40 GPU with 48GB RAM
  • 2 x 32 core Intel(R) Xeon(R) Gold 6338 CPU @ 2GHz (total 64 cores)
  • 256GiB DDR4 RAM

56 nodes optimised for training jobs

  • 4 x NVIDIA Tesla A100 HGX GPU with 40GB RAM
  • 2 x 32 core Intel(R) Xeon(R) Gold 6338 CPU @ 2GHz (total 64 cores)
  • 256GiB DDR4 RAM

20 nodes optimised for training jobs with somewhat higher memory requirements

  • 4 x NVIDIA Tesla A100 HGX GPU with 40GB RAM
  • 2 x 32 core Intel(R) Xeon(R) Gold 6338 CPU @ 2GHz (total 64 cores)
  • 512GiB DDR4 RAM

8 nodes optimised for heavy training jobs

  • 4 x NVIDIA Tesla A100 HGX GPU with 80GB RAM
  • 2 x 32 core Intel(R) Xeon(R) Gold 6338 CPU @ 2GHz (total 64 cores)
  • 1024GiB DDR4 RAM

4 nodes without GPUs

  • 2 x 32 core Intel(R) Xeon(R) Gold 8358 CPU @ 2.6GHz (total 64 cores)
  • 512GiB DDR4 RAM
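
As a rough way to see the scale of Phase II, the node counts and GPUs per node listed above can be tallied per GPU type. The following is a minimal sketch; the names `NodeGroup`, `PHASE_II` and `gpus_by_type` are hypothetical helpers introduced only for this illustration, and the data transfer node is left out because it is not a compute node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One Phase II node type (hypothetical helper, not part of any Alvis tooling)."""
    purpose: str
    nodes: int
    gpus_per_node: int
    gpu_type: str  # empty string for nodes without GPUs


# Node counts and GPUs per node are copied from the Phase II list above.
PHASE_II = [
    NodeGroup("inference and smaller training jobs", 85, 4, "A40 48GB"),
    NodeGroup("training jobs", 56, 4, "A100 40GB"),
    NodeGroup("training jobs with more memory", 20, 4, "A100 40GB"),
    NodeGroup("heavy training jobs", 8, 4, "A100 80GB"),
    NodeGroup("nodes without GPUs", 4, 0, ""),
]


def gpus_by_type(groups):
    """Sum the number of GPUs per GPU type, skipping nodes without GPUs."""
    totals = {}
    for group in groups:
        if group.gpus_per_node == 0:
            continue
        count = group.nodes * group.gpus_per_node
        totals[group.gpu_type] = totals.get(group.gpu_type, 0) + count
    return totals


if __name__ == "__main__":
    totals = gpus_by_type(PHASE_II)
    for gpu_type, count in totals.items():
        print(f"{gpu_type}: {count} GPUs")
    print(f"Total: {sum(totals.values())} GPUs "
          f"in {sum(g.nodes for g in PHASE_II)} nodes")
```

Grouping by GPU type merges the two A100 40GB node types, which differ only in system memory.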

Dedicated storage

In addition to the compute nodes listed above, a fast, dedicated all-flash storage solution of ~0.6PB will be installed in Alvis together with Phase II. The solution will be backed by ~7PB of bulk storage.

More details on the storage solution can be found on this page.

GPU cost on Alvis

Depending on which GPU type you choose for your job, an hour on the GPU has a different cost, according to the following table:

GPU type   VRAM   System memory per GPU   CPU cores per GPU   Cost
T4         16GB   72 or 192 GB            4                   0.35
A40*       48GB                                               1
V100       32GB   96 or 192 GB            8                   1.31
A100*      40GB                                               1.84
A100fat*   80GB                                               2.2
  • Example: using 2xT4 GPUs for 10 hours costs 7 "GPU hours" (2 x 0.35 x 10).
  • The cost reflects the actual price of the hardware (normalised against an A40 node/GPU).
  • * Available in the near future. Currently only 4 A100 GPUs are available, from Phase I.
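
The cost rule in the example above (number of GPUs x cost factor x hours) can be sketched in a few lines of Python. This is only an illustration: the function name `gpu_hours_cost` and the dictionary `GPU_COST` are hypothetical, the factors are copied from the table above, and the actual accounting on Alvis may differ in details not described on this page.

```python
# Cost factor per GPU and hour, copied from the cost table above.
GPU_COST = {
    "T4": 0.35,
    "A40": 1.0,
    "V100": 1.31,
    "A100": 1.84,
    "A100fat": 2.2,
}


def gpu_hours_cost(gpu_type: str, n_gpus: int, hours: float) -> float:
    """Return the "GPU hours" charged for a job (hypothetical helper).

    Implements the rule from the example: n_gpus x cost factor x hours.
    """
    if gpu_type not in GPU_COST:
        raise ValueError(f"Unknown GPU type: {gpu_type!r}")
    if n_gpus < 0 or hours < 0:
        raise ValueError("n_gpus and hours must be non-negative")
    return n_gpus * GPU_COST[gpu_type] * hours


if __name__ == "__main__":
    # The example from the list above: 2 x T4 for 10 hours.
    print(f"{gpu_hours_cost('T4', 2, 10):.2f} GPU hours")
    # A comparison across GPU types for the same job shape.
    for gpu_type in GPU_COST:
        print(f"4 x {gpu_type} for 24 h: "
              f"{gpu_hours_cost(gpu_type, 4, 24):.2f} GPU hours")
```

A comparison like this can help when deciding whether a job that fits on T4 GPUs is worth running on a more expensive GPU type.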

More info

To get started, look through the introduction slides for Alvis and the general user documentation on this site.