Storage resources at C3SE¶
The storage hierarchy¶
Storage is available for different purposes and with different levels of availability.
Looking at it from the bottom up, we have:
- The node local disk:
  - available only to the running job
  - automatically purged after the job finishes
  - accessible to the job using $TMPDIR (the environment variable $TMPDIR is automatically set to contain the correct directory path)
  - the available size is different for different clusters (and can even differ between node types within a cluster)
- Cluster(wide) storage:
  - available from all machines in a specific cluster
  - not available at any C3SE clusters today (see Centre storage below)
- Centre(wide) storage:
  - available from all resources at the centre
  - your cluster home directory is located here
- Nationally accessible storage:
  - requires a separate storage allocation in SUPR
  - file based (as opposed to block based for the levels above; cf. an FTP server)
  - available through dedicated tools
C3SE Centre storage (Cephyr and Mimer)¶
This centre storage is available on the clusters. You can view all the storage areas you have access to via C3SE_quota, which shows your current usage and limits for each path.
It consists of two parts: one with backup and one without backup (storage projects only). User home directories (which are backed up) are laid out as follows.
For example, if your UID/CID is ada:
$SNIC_BACKUP is /cephyr/users/ada
$HOME is /cephyr/users/ada/Vera on Vera
$HOME is /cephyr/users/ada/Alvis on Alvis
Backups¶
Even in the backup area, several hidden paths are intentionally excluded from backup, as they are typically automatically populated, contain local software installations, or hold cached data of little value:
.cache/
.conda/
.local/
.MathWorks/
.matlab/
.mozilla/
.npm/
.schrodinger/
.vscode/
.vscode-server/
We will not be able to restore files from any of these directories. For software installations or any other data that can be easily recreated, it is much faster to set this up again than to restore from the backup robot hosted at HPC2N.
How to check your usage and quota limits¶
You can check your quota limits by issuing the command C3SE_quota.
The default quota is:
Storage location | Space quota | Files quota
---|---|---
$SNIC_BACKUP | 30 GiB | 60 000
Note! Your home directories are on the same file system and accessible from both systems. They also share the same quota, so please move or access files directly instead of copying!
"Can I exceed my quota? What will happen?"¶
No, the quota limits are hard limits, and your code will most likely crash or stop if you try to allocate beyond your limits.
Quota limit warnings¶
To obtain a warning for a custom limit, you can put the following in your .bashrc file:
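A minimal sketch of such a snippet (the exact invocation and the 10-second limit are assumptions; adapt it to how you want the custom limit handled):
# Show quota usage and warnings at every login, but give up after
# 10 seconds so a slow file system does not block the login
timeout 10s C3SE_quota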
and it will show up as a warning when you log in if you are close to the limit, e.g.:
Path: /cephyr/users/ada
Space used: 24.3GiB Quota: 30GiB - WARNING! 81% full!
Files used: 1073 Quota: 60000
The timeout command prevents the task from blocking your login if it takes too long.
Copying files into and out of the system¶
Use tools that can communicate using the SSH/SFTP/SCP-protocols to transfer files, for example Cyberduck, WinSCP and rsync (or scp/sftp directly!).
The same network requirements as for connecting in general apply here as well.
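As a rough sketch (the username, hostname, and paths are placeholders; use your own CID and the login or data transfer node you normally connect to), copying a local directory to your Cephyr home with rsync could look like:
# ada and vera1.c3se.chalmers.se are placeholders for your own CID and node address
rsync -av --progress my_results/ ada@vera1.c3se.chalmers.se:/cephyr/users/ada/Vera/my_results/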
Individual files stored on Google Drive can be downloaded by the following procedure without having to authenticate on the data transfer node; it is mandatory for larger files and convenient for smaller ones:
- Start the download locally on your own computer;
- abort the transfer;
- copy the file URL from the download manager in the browser;
- issue a curl -OLJ 'https://drive.usercontent.google.com/download?id=1FjK...651a' command with the full URL just copied inside single quotes.
If a transfer is expected to take a long time, consider running it inside a screen or tmux session to be able to disconnect from the data transfer node and re-attach later to check the transfer progress.
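A sketch of the tmux workflow (the session name transfer is arbitrary):
tmux new -s transfer        # start a named session and run the transfer inside it
# detach with Ctrl-b d, then log out; later, re-attach to check progress:
tmux attach -t transfer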
For bulk data download from a set of remote HTTP URLs without authentication or cookies, there is a dedicated system called ADDS for background data download on Alvis.
Finding where quota is used on Cephyr¶
On /cephyr, the recursive size is shown on directories when listing files, so you can easily and quickly locate where your quota is being used, e.g. with a listing like the one sketched below.
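A sketch only, assuming a plain ls -lah in the home directory; the surrounding entries, dates, and ownership are illustrative:
[ada@vera1 ~]$ ls -lah
drwxr-xr-x  12 ada ada 239M Jan 12 02:08 Mathematica
drwxr-xr-x   3 ada ada  87M Jan 10 11:32 .cache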
Such a listing indicates that 239 MiB of data is stored under the directory Mathematica. Note the -a flag, which also shows hidden files.
Similarly, one can show the number of inodes (files and directories) used under a directory by running cat on the directory itself:
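A sketch, assuming the Mathematica directory from the previous example (reading a directory with cat works on Cephyr because the file system exposes these directory statistics):
cat Mathematica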
which shows
entries: 5
files: 3
subdirs: 2
rentries: 223
rfiles: 218
rsubdirs: 5
rbytes: 226350145
rctime: 1578793680.240728864
Here, the 223 rentries is the total number of files and directories under the directory Mathematica.
You can also use the command line utility where-are-my-files some_path, which prints the number of files used in each subdirectory under the given path.
If you are unsure, please contact support.
Storage projects¶
If you need more resources than are available to you as a user (see above), you or your supervisor/PI needs to apply for a storage project: on Cephyr if you are using Vera, and on Mimer if you are using Alvis. This is done through the SUPR portal.
If you are a member of a storage project, it will show up when using C3SE_quota.
If you have just joined a storage project, you must start a new login session for group memberships to update.
Project Storage Decommissioning¶
Decommissioning of projects on Centre storage follows the guidelines.
If you have received notice that your project has ended, you should read the guidelines and how they affect your project data on our storage.
The easiest way to notify us that your project is ready to delete is to log in to SUPR and note which projects are in the decommissioning phase. On the details page of a project you can mark it as ready to delete.
If you are unsure, or believe you have received a notification in error, please get in touch with us.
File sharing with groups and other users¶
You can also share files with other users by manipulating the group ownership and associated permissions of directories or files.
Every computational project has its own group, named "c3-project-name", e.g.:
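A sketch of checking group membership with the standard groups command (the original example may have used a different command; the group list shown is illustrative):
[emilia@vera1 ~]$ groups
emilia pg_naiss2023-1-2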
Here emilia is a member of project NAISS2023-1-2.
She wants to share files (read only), and she could do:
[emilia@vera1 ~]$ chgrp -R pg_naiss2023-1-2 shared_directory
[emilia@vera1 ~]$ chmod -R g+rx shared_directory
[emilia@vera1 ~]$ chmod o+x ~/ ~/..
The first two lines change the group and the group permissions recursively (this applies to all files under shared_directory). The last line gives the execute permissions required to access directories under ~ and the directory above it.
Remember! If you give out write permissions in a sub-directory, all files created there, including by other users, will still count towards your quota.
Access Control Lists (ACLs)¶
To give more fine-grained control of file sharing, you can use ACLs. This allows you to give out different read, write, and execute permissions to individual users or groups.
If emilia wants to have a shared file storage with robert, and give out read rights to sara, she could do:
[emilia@vera1 ~]$ setfacl -R -m user:robert:rwx,user:sara:rx shared_data
[emilia@vera1 ~]$ chmod o+x ~/ ~/..
and she can check the current rights using the corresponding get-command:
[emilia@vera1 ~]$ getfacl shared_data
# file: /cephyr/users/emilia/Vera/shared_data
# owner: emilia
# group: emilia
user::rwx
user:robert:rwx
user:sara:r-x
group::rwx
mask::rwx
other::r-x
You can find many examples of using setfacl and getfacl online, e.g. https://linux.die.net/man/1/setfacl.
Using node local disk ($TMPDIR)¶
It is crucial that you use the node local disk for jobs that perform a lot of intense file IO. The globally accessible file system is a shared resource with limited capability and performance.
It is also crucial that you retrieve and save any important data that was produced and saved to the node local file system. The node local file systems are always wiped clean immediately after your job has ended!
To use $TMPDIR, copy your files there, change to the directory, run your simulation, and copy the results back:
#!/bin/bash
# ... various SLURM commands
cp file1 file2 $TMPDIR
cd $TMPDIR
... run your code ...
cp results $SLURM_SUBMIT_DIR
Be certain that you retrieve and save any important data that was produced and saved on the node local file system.
The size of $TMPDIR is 380 GB on Vera; on Alvis it ranges from 140 GB to 813 GB depending on the node type.
When running on a shared node, you will be allocated space on $TMPDIR proportional to the number of cores you have on the node.
Note! By default, each node has a private $TMPDIR, i.e. the $TMPDIRs share the same path but point to different storage areas. You have to make sure to distribute files to, and collect files from, all nodes if you use more than one node! Also see below for a shared, parallel $TMPDIR.
Distributing files to multiple $TMPDIRs¶
Note: Using ptmpdir is usually a much simpler option, see below!
The job script only executes on the main node (first node) in your job; therefore the job script must
- distribute input files to all other nodes in the job,
- collect output files from all other nodes, and
- copy the results back to the centre storage.
To distribute files to the node local disks, use the command pdcp. When invoked from within a job script, pdcp automatically resolves which nodes are involved. E.g., a command like the following sketch
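(Assuming pdcp resolves the job's node list automatically, as described above; file1 and file2 are the example files from the surrounding text.)
pdcp file1 file2 $TMPDIR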
copies file1 and file2 from the current directory to the different $TMPDIRs on all nodes in the current job.
Collecting the data back from multiple nodes depends on the software used.
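For example, a plain cp in the job script (a sketch; output_file.data is an illustrative filename):
cp $TMPDIR/output_file.data $SLURM_SUBMIT_DIR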
copies output_file.data from the head node only, whereas
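an rpdcp command (a sketch; rpdcp is the reverse counterpart of pdcp and appends the source node's hostname to each copied file), e.g.
rpdcp $TMPDIR/output.data $SLURM_SUBMIT_DIR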
copies the file $TMPDIR/output.data from all compute nodes in the job, and places the copies in $SLURM_SUBMIT_DIR, e.g. output.data.vera04-2, output.data.vera07-1, etc.
Both the pdcp and rpdcp commands take the flag -f for recursively copying file hierarchies.
A shared, parallel $TMPDIR¶
The nodes' local disks can be set up to form a shared, parallel area when running a job on more than one node. This gives you:
- a common namespace (i.e. all the nodes in your job see the same files)
- a larger total area, aggregating all the nodes' $TMPDIR
- faster file IO
To invoke a shared $TMPDIR, simply add the flag --gres=ptmpdir:1 to your job script.
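For example, as a directive in the job script (a sketch; the remaining SBATCH flags are whatever your job otherwise needs):
#!/bin/bash
#SBATCH --gres=ptmpdir:1
# ... other SBATCH flags and your usual job commands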
Your $TMPDIR will now use the local disks of all the nodes in your job in parallel. Copying files works as if it were one large drive. It is recommended to always use this option if you use $TMPDIR for multi-node jobs!
Saving files periodically¶
With a little bit of shell scripting, it is possible to periodically save files from $TMPDIR to the centre storage. Please implement this within reason so that you don't put excessive load on the shared file system (if you are unsure, ask support@c3se.chalmers.se for advice).
A hypothetical example that copies the output files back once every second hour could look like this:
#!/bin/bash
# ... various SBATCH flags
while sleep 2h; do
    # This will be executed once every second hour
    rsync -a $TMPDIR/output_data/ $SLURM_SUBMIT_DIR/
    # Add -u (--update) to skip files that are newer on the receiver
done &  # The &-sign after the done keyword places the while loop in a subshell in the background
LOOPPID=$! # Save the PID of the subshell running the loop
... calculate stuff and retrieve data in a normal fashion ...
# All calculations are done, let's clean up and kill the background loop
kill $LOOPPID
This example creates a background loop that runs on the head compute node (the compute node in your allocation that runs the batch script).