Alvis data migration

Last update: 2026-05-22

This document is to help you with the upcoming move to Arrhenius.

We will start migrating users and data from Alvis/Mimer to Arrhenius in June. By then, all users on Alvis with an active project should be able to get an Arrhenius account through SUPR. The number of Alvis compute nodes will decrease gradually from 2026-07-01 until 2026-08-31, when no more nodes will be available.

Migrating your project storage can take significant time, depending on both the amount of storage you are using (GiB) and the number of files. The number of files is often more limiting than the volume!

Data migration can take from days to over a week if you have very many files in your project storage. It is especially slow if you have cold data.

Arrhenius GPU nodes run ARM

The Arrhenius GPU partition runs on Grace Hopper nodes, whose Grace CPUs use the ARM (aarch64) architecture. See https://www.naiss.se/resource/arrhenius/.

This means any software built for x86 CPUs (everything from Intel and AMD) will not work. Therefore, there is no point in migrating software installations.

Options for projects with Chalmers PIs

If you are a PI at Chalmers, there will be some options to continue using the Alvis hardware and Mimer storage.

For more information, the project PI should contact e-Commons support.

Smaller storage

With the move to Arrhenius, you should expect a significant reduction in storage allocations overall. In contrast to Alvis (but as is common on other NAISS storage systems), there will also be a quota on the number of files. Initially there will be a generous soft quota limit, but you will need to take this into account going forward.

Start cleaning today

You can help yourself already today by cleaning out your storage as much as possible (both project storage and home directories). This will not only help you stay within any new allocation limits, but may also massively speed up the transfer of data from Alvis at Chalmers to Arrhenius in Linköping.

As mentioned above, the number of files in particular is a big concern.

Examples of things to clean out:

  1. Old checkpoints, results, logs etc.
  2. Datasets that are easy to download again, and are not actively used today.
  3. Software installations (including Python and Conda environments). These will not work on Arrhenius anyway.
  4. Containers. You need to rebuild or re-download these on Arrhenius.
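
Since the number of files matters so much, it can help to see where the files are before deciding what to clean out. The following is a minimal sketch, not an official tool: the function name count_files_per_subdir is made up for this example, and it only uses standard find, tr, wc and sort.

```shell
# Print the number of regular files under each immediate subdirectory of a
# directory, largest count first. Hidden (dot) subdirectories are skipped
# because the shell glob */ does not match them.
count_files_per_subdir() {
    local root="${1:?usage: count_files_per_subdir <directory>}"
    local dir count
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue   # the glob matched nothing
        # Count NUL terminators so file names containing newlines are counted once.
        count=$(find "$dir" -type f -print0 | tr -cd '\0' | wc -c)
        printf '%d\t%s\n' "$count" "${dir%/}"
    done | sort -rn
}
```

For example, count_files_per_subdir /mimer/NOBACKUP/groups/<your_project> lists the subdirectories of your project area by file count. On a large project area this walks every file, so expect it to take a while.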

Transfer the data yourself

If you wish to migrate your data yourself, please run the storagemigrate tool and answer that you don't want help from staff in migrating data. Then use rsync to copy the data after you obtain your Arrhenius account.

We strongly recommend that you module load rsync to get a newer and faster rsync than the one provided by the OS, with better hashing and compression support. Useful options:

  • --compress --compress-choice=zstd: zstd is a very fast compression algorithm that should be used for the transfers.
  • -a: Archive mode (recursive; preserves file properties such as permissions, modification times and symbolic links).
  • --delete: Deletes files from the target directory (on Arrhenius) so that they match the source (on Mimer). This can be useful if you wish to undo previously synced files that you realize you didn't need. Be VERY careful if you use this flag; it deletes files.
  • Expert options (if you don't know what these mean, you almost certainly don't need them):
    • -U: Preserve access times (when you last used the file).
    • -H: Look for hard-linked files in the source and link together on the destination.
    • -A: Preserve ACLs (probably not useful unless you remap UID and GIDs). If you

For example:

module load rsync
# ${USER}_alvis needs the braces: $USER_alvis would be read as an unset variable named USER_alvis.
rsync -a --compress --compress-choice=zstd --itemize-changes /mimer/NOBACKUP/groups/<your_project>/your_files/ <your_arrhenius_user>@login.hpc.arrhenius.naiss.se:/nobackup/proj/disk/<your_project>/${USER}_alvis/
rsync -a --compress --compress-choice=zstd --itemize-changes /mimer/NOBACKUP/groups/<your_project>/your_flash_files/ <your_arrhenius_user>@login.hpc.arrhenius.naiss.se:/nobackup/proj/flash/<your_project>/${USER}_alvis/
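
Because --delete removes files on the destination, it may be worth previewing a sync first. The sketch below relies on rsync's standard --dry-run option, which reports what would happen without changing anything. The wrapper name rsync_preview is made up for this example.

```shell
# Show what "rsync -a --delete" would transfer or delete, without doing it.
# Usage: rsync_preview <source_dir>/ <destination>
rsync_preview() {
    rsync -a --delete --dry-run --itemize-changes "$1" "$2"
}
```

In the itemized output, lines starting with *deleting mark files that a real run with --delete would remove from the destination. Once the preview looks right, run the real command without --dry-run.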

For very long-running transfers, strongly consider using a tool like tmux, screen or zellij, or even the graphical login nodes, to keep a persistent session alive.

You want help migrating data

We are looking at how to help users migrate project storage data and map user IDs on the new system, but such data migration will still take a long time due to the sheer number of files. If you reduce your usage (especially the number of files), syncing will be faster.

You will find the following subdirectories automatically created in your project area on Mimer:

/mimer/NOBACKUP/groups/<your_project>/to-arrhenius-disk/
/mimer/NOBACKUP/groups/<your_project>/to-arrhenius-flash/  # only if you are allocated flash.

You should move the directories you want migrated into the corresponding to-arrhenius-disk and to-arrhenius-flash directories. Then one user, with the PI's permission, must use the storagemigrate tool to select the desired sync date. You can continue running jobs that use this area up until the date of the transfer (which might differ from the desired one).

Example usage:

storagemigrate /mimer/NOBACKUP/groups/<your_project>/

You will be asked questions interactively in the terminal.

Details

  • The directory to-arrhenius-flash has a quota set corresponding to the space quota for flash on Arrhenius. You must be selective about what you move here.
  • We will exclude any directory containing a pyvenv.cfg file or a conda-meta directory from the sync. Such directories are normally software environments, and you should never save data inside them.
  • The storagemigrate tool is safe to run at any time. It will ask you a few questions about the transfer. You can re-run it to change the information up until the selected sync date is decided.
  • You can see the flash quota and current usage by just running the storagemigrate tool.
  • At the date of the move, you will no longer be able to access these directories.
  • You must not continue writing output here once this date is passed; cancel your jobs if they do.
  • If many users select the same sync date, we will need to select another date for you. You will be informed about this in advance.
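
To check in advance which directories would be skipped as software environments, something like the following sketch could be used. It is an illustration only, not the logic of the storagemigrate tool: the function name list_env_dirs is made up, and it assumes the marker file is the standard pyvenv.cfg that Python's venv module writes, alongside conda-meta directories.

```shell
# List directories that look like Python virtual environments or Conda
# environments, i.e. that contain a pyvenv.cfg file or a conda-meta directory.
list_env_dirs() {
    local root="${1:?usage: list_env_dirs <directory>}"
    find "$root" \( -type f -name pyvenv.cfg -o -type d -name conda-meta \) -print |
        while IFS= read -r marker; do
            dirname "$marker"   # the environment is the parent of the marker
        done | sort -u
}
```

Running list_env_dirs /mimer/NOBACKUP/groups/<your_project>/to-arrhenius-disk prints one environment directory per line. If any data you need lives inside one of them, move it out before the sync date.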

If you don't take any action

If you don't use the storagemigrate tool at all, we will not sync anything for you. That is, we will treat this the same as if you ran the tool and answered that you do not want help with the migration.

Timeline

  • Migration will start on the first of June.
  • By the end of September, all projects and users should be migrated.
  • In general we will not prolong projects on Alvis/Mimer, i.e. migration will happen at the latest by the end date of your current project. No new allocations will be made on Alvis or Mimer.
  • As there are very many projects and very much data to migrate, we will need to spread out the migration over the months. That is, be prepared for a move earlier than the sync date you request.
  • After the original end date of your project, or 2026-07-01 at the latest, the regular decommissioning procedure is used. That means no data can remain on Alvis/Mimer after 2026-09-30.
  • Any data left on Alvis/Mimer after 2026-10-31 will be permanently deleted, without any further attempts to contact the PI or project members.

Home directories are not transferred automatically

You must transfer any important data from your home directories yourself. Nothing automatic will be done here. See the rsync examples above, and remember to copy only what is needed and remove what is not.

Archiving

You should put large numbers of files into compressed archives. We recommend that you use Zstd for the compression, as it is much faster than gzip:

tar -I zstd -cvf my_dataset.tar.zst my_dataset/

If you have terabytes of data, consider splitting it up into a few smaller archives so that you can process them in parallel.
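
One way to split the work is to create one archive per top-level subdirectory and run a few tar processes at once. The following is a minimal sketch under that assumption. The function name archive_subdirs and the default of 4 parallel jobs are made up for this example, and it relies on GNU tar, zstd and xargs -P being available.

```shell
# Create one .tar.zst archive per immediate subdirectory of a dataset,
# running up to <jobs> tar processes in parallel.
# Files directly in <dataset_dir> (not inside a subdirectory) are NOT archived.
# Usage: archive_subdirs <dataset_dir> <output_dir> [jobs]
archive_subdirs() {
    local src="${1:?usage: archive_subdirs <dataset_dir> <output_dir> [jobs]}"
    local out="${2:?usage: archive_subdirs <dataset_dir> <output_dir> [jobs]}"
    local jobs="${3:-4}"
    mkdir -p "$out" || return 1
    out="$(cd "$out" && pwd)"   # absolute path, since we change directory below
    (
        cd "$src" || exit 1
        find . -mindepth 1 -maxdepth 1 -type d -print0 |
            xargs -0 -P "$jobs" -I{} \
                sh -c 'tar -I zstd -cf "$1/$(basename "$2").tar.zst" "$2"' sh "$out" {}
    )
}
```

For example, archive_subdirs my_dataset/ my_dataset_archives/ 8 writes my_dataset_archives/<subdir>.tar.zst for each subdirectory. Keep the number of jobs at or below the number of cores you have requested.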

You should submit batch jobs to the CPU nodes for very long-running tasks like this. Do not run them on the login nodes, and do not use srun to run interactively. If you need an interactive session, prefer starting a graphical desktop on the CPU nodes via the Open OnDemand portal at https://alvis.c3se.chalmers.se. This way you avoid tying the process to the uninterrupted uptime of a login node.

Advanced users may also wish to try mpifileutils parallel dtar:

module load mpifileutils
mpirun -np 16 dtar -c -f my_dataset.tar my_dataset/
zstd my_dataset.tar  # creates my_dataset.tar.zst and keeps my_dataset.tar
rm my_dataset.tar

# alternatively, instead of the zstd and rm steps above, gzip using the parallel pigz
# (pigz replaces my_dataset.tar with my_dataset.tar.gz)
pigz my_dataset.tar

This can especially speed things up if you have lots of cold files. You may also want to store the temporary tar file in $TMPDIR inside a job.
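
The idea of keeping the temporary tar file in $TMPDIR can be sketched as follows. This is an illustration under stated assumptions, not a recommended recipe: it uses plain tar rather than dtar to stay self-contained, the function name archive_via_tmpdir is made up, and it assumes $TMPDIR has room for the uncompressed tar file. SLURM_CPUS_PER_TASK is only set by Slurm when you request --cpus-per-task, so the sketch falls back to a single thread otherwise.

```shell
# Build an uncompressed tar in $TMPDIR, compress it to its final location with
# zstd, and remove the temporary file whether or not the compression succeeded.
# Usage: archive_via_tmpdir <directory> <output.tar.zst>
archive_via_tmpdir() {
    local src="${1:?usage: archive_via_tmpdir <directory> <output.tar.zst>}"
    local dest="${2:?usage: archive_via_tmpdir <directory> <output.tar.zst>}"
    local name tmp_tar status
    name="$(basename "$src")"
    tmp_tar="${TMPDIR:-/tmp}/${name}.$$.tar"
    # -C makes the archive contain "<name>/..." instead of the full path.
    tar -cf "$tmp_tar" -C "$(dirname "$src")" "$name" &&
        zstd -T"${SLURM_CPUS_PER_TASK:-1}" -q -f "$tmp_tar" -o "$dest"
    status=$?
    rm -f "$tmp_tar"
    return "$status"
}
```

Inside a batch job this could be called as archive_via_tmpdir my_dataset/ my_dataset.tar.zst.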