Svea Filesystem

From C3SE

Chalmers-global home directory

Currently there is no access to users' home directories on Chalmers' central file server (/chalmers/users/[CID]).

User home directory

NOTE: Users' home directories are not shared between Svea and other computational resources at C3SE, such as Beda.

Users' home directories:

/c3se/users/[CID]

On the Svea cluster there is about 9 TB of storage for user files (filesystem: /c3se/users/[CID]). This filesystem (like all other user-writable filesystems on the C3SE clusters) is NOT backed up at all! Important files should therefore be copied to your other home directory (at your department) for backup.

To refer to your C3SE home directory, e.g. in scripts, use $HOME.
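As a minimal sketch of the advice above, a script can build its paths from $HOME rather than hard-coding /c3se/users/[CID]. The names my_project and indata1.dat are only illustrative assumptions.

```shell
#!/bin/bash
# Build paths from $HOME instead of hard-coding /c3se/users/[CID].
# "my_project" and "indata1.dat" are hypothetical names for illustration.
project_dir="$HOME/my_project"
input_file="$project_dir/indata1.dat"

echo "Project directory: $project_dir"
echo "Input file:        $input_file"
```

A script written this way keeps working if the location of the home directory changes.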

Quota

There is a limit to how much disk space and how many files (number of inodes) each user can have on the home directory file system. There are two important concepts regarding the quota limits:

  • "quota" (also called "soft limit"). Indicates the amount of resources (disk space or number of files) a user is allowed to allocate for an unrestricted time period. It is possible to temporarily exceed the soft limit.
  • "limit" (also called "hard limit"). Indicates the upper limit on the amount of resources (disk space or number of files) that a user can allocate. It is not possible to go past the hard limit!

How to check your usage and quota limits

You can check your quota limits by issuing the command C3SE_quota.

"Can I exceed my quota? What will happen?"

It is possible to temporarily exceed the quota (but never the hard limit). If you exceed the quota for either disk space or inodes, you enter into a "grace period", which is a limited amount of time before the soft limit starts to act as a hard limit. That means that if your quota is exceeded, and if you are out of grace time, you will not be able to allocate any further disk resources. You then have to remove files until your current usage goes below the corresponding quota!
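The interplay between soft limit, hard limit and grace period can be summarized as a small decision function. This is only a sketch of the rules described above, not how the filesystem implements them; the function name quota_state and its arguments are hypothetical.

```shell
#!/bin/bash
# quota_state USAGE SOFT HARD GRACE_LEFT
# Hypothetical helper: prints whether further allocation would be allowed.
# USAGE, SOFT and HARD use the same unit (disk blocks or inodes);
# GRACE_LEFT is the remaining grace time in seconds (0 = expired).
quota_state() {
  local usage=$1 soft=$2 hard=$3 grace_left=$4
  if [ "$usage" -ge "$hard" ]; then
    echo "blocked: hard limit reached"
  elif [ "$usage" -gt "$soft" ]; then
    if [ "$grace_left" -gt 0 ]; then
      echo "allowed: over quota, grace period running"
    else
      echo "blocked: over quota, grace period expired"
    fi
  else
    echo "allowed: within quota"
  fi
}

quota_state 50 100 120 0       # below the soft limit
quota_state 110 100 120 3600   # over the soft limit, one hour of grace left
quota_state 110 100 120 0      # over the soft limit, grace expired
```

In the last case the only way forward is to remove files until the usage drops below the soft limit again.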

"How much grace time do I have left?"

If you have exceeded your quota for either disk space or inodes, C3SE_quota will show how much time you have left of your grace period. Depending on the amount of grace time you have left, the output is given either as a number of days (#days) or as hours and minutes (hh:mm).
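If you want to handle the remaining grace time in a script, the two formats can be converted to a common unit. The sketch below assumes the value appears literally as, for example, 6days or 23:15; the exact output of C3SE_quota is not shown here, so treat both the formats and the helper name grace_to_minutes as assumptions.

```shell
#!/bin/bash
# grace_to_minutes VALUE
# Hypothetical helper: converts "Ndays" or "hh:mm" into minutes.
grace_to_minutes() {
  local g=$1 h m
  case $g in
    *days)
      echo $(( ${g%days} * 24 * 60 ))
      ;;
    *:*)
      h=${g%%:*}
      m=${g##*:}
      # Drop one leading zero so that e.g. "08" is not parsed as octal.
      h=${h#0}
      m=${m#0}
      echo $(( ${h:-0} * 60 + ${m:-0} ))
      ;;
    *)
      echo "unrecognized grace format: $g" >&2
      return 1
      ;;
  esac
}

grace_to_minutes 6days
grace_to_minutes 02:08
```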

Job submission directory

Let's say you prepared your files in the $HOME/my_project/subtask_XY directory and submitted your job from there. An easy way of telling your scripts about this directory is to use the $PBS_O_WORKDIR environment variable. The comment above regarding filesystem performance is also valid here.

So 'cd $PBS_O_WORKDIR' always puts you back in the directory from which you submitted your job.
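A minimal sketch of this step is shown here. $PBS_O_WORKDIR is only set inside a PBS job, so the fallback to the current directory is an assumption added to allow testing the script interactively.

```shell
#!/bin/bash
# Inside a job, PBS sets $PBS_O_WORKDIR to the submission directory.
# The fallback to $PWD is an assumption for interactive testing only.
workdir="${PBS_O_WORKDIR:-$PWD}"

cd "$workdir" || { echo "cannot cd to $workdir" >&2; exit 1; }
echo "Running in: $(pwd)"
```

Quoting the variable keeps the cd working even if the directory name contains spaces.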

Node local disk

If your job uses a lot of disk space, or if it accesses files frequently, it is beneficial (to your runtime and to overall system performance) to use the local disk in the worker node. The system prepares a job-unique directory on this local disk for you to use. The path to this directory (remember: job-unique and node-local) is available in the environment variable $TMPDIR.

Here is a small example of how this could be used in your job scripts:

1. copy the files you need for your job to local disk

cp -p file1 file2 $TMPDIR/

2. go there

cd $TMPDIR

3. run your job

./file1 file2

4. copy output files back to job submission directory

cp -p file3 file4 $PBS_O_WORKDIR/
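The four steps can be collected into one function, sketched below. The function name run_on_local_disk and the fallbacks for $PBS_O_WORKDIR and $TMPDIR are assumptions added so that the sketch can be tried outside a job; inside a job both variables are set by the system.

```shell
#!/bin/bash
# run_on_local_disk PROGRAM INPUT OUTPUT...
# Hypothetical helper wrapping steps 1-4. Outside PBS the variables are
# unset, so fallbacks are used (an assumption for local testing).
run_on_local_disk() {
  local submit_dir="${PBS_O_WORKDIR:-$PWD}"
  local scratch="${TMPDIR:-/tmp}"
  local program=$1 input=$2
  shift 2   # the remaining arguments are the output files

  cd "$submit_dir" || return 1
  # 1. copy the files needed for the job to local disk
  cp -p "$program" "$input" "$scratch/" || return 1
  # 2. go there
  cd "$scratch" || return 1
  # 3. run the job
  "./$program" "$input" || return 1
  # 4. copy output files back to the job submission directory
  cp -p "$@" "$submit_dir/"
}

# Usage with the file names from the steps above:
#   run_on_local_disk file1 file2 file3 file4
```

Each step returns early on failure, so results are only copied back if the program itself succeeded.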

The above example copies your files to the master node. If your program does not use MPI to distribute the file, and you want to copy FILE to all your nodes before you run, you can do so with

mpiexec -comm none -pernode cp FILE $TMPDIR

and vice versa when you copy back.

Note! This directory and all the files in it are removed when the job ends!

Copy intermediate data

Here are example scripts that can be used to copy intermediate data/results from the local disk on the compute node(s) during your calculations. Examples for single-node and multi-node/MPI jobs are provided.

Do not copy your files unnecessarily often during your calculations (about 10 times should be sufficient), as this may affect both the performance of your calculation and the overall performance of the cluster.

Change the variable COPYTIME and the names of the files that you want to transfer to and from the local disks of the compute node(s) to suit your needs. Notes about this are present in the scripts.
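One possible way to choose COPYTIME is to derive it from the requested walltime so that roughly ten copies are made over the whole run. The helper name walltime_to_seconds and the division by ten are assumptions based on the "about 10 times" guideline above, not a site rule.

```shell
#!/bin/bash
# walltime_to_seconds H:MM:SS
# Hypothetical helper: converts a PBS-style walltime to seconds.
walltime_to_seconds() {
  local h m s rest
  h=${1%%:*}
  rest=${1#*:}
  m=${rest%%:*}
  s=${rest#*:}
  # Drop one leading zero so that e.g. "08" is not parsed as octal.
  h=${h#0}
  m=${m#0}
  s=${s#0}
  echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

walltime="1:10:05"   # same value as in "#PBS -l walltime=1:10:05"
total=$(walltime_to_seconds "$walltime")
COPYTIME=$(( total / 10 ))   # aim for about ten copies during the run

echo "walltime=${total}s COPYTIME=${COPYTIME}s"
```

If your job usually finishes well before the walltime, base the calculation on the expected runtime instead.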

Note! Do NOT remove the "&" at the end of the command in the examples below (i.e. "time ./my_big_calc &" and "time mpiexec ./my_mpi_calc &"). It is necessary for the script to work as intended!
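The reason the "&" matters is that it starts the program in the background, which lets the script record its process id in $! and keep running alongside it. The sketch below shows the mechanism with sleep as a stand-in for the real calculation. It polls with kill -0, a swapped-in alternative to the ps-based test used in the example scripts.

```shell
#!/bin/bash
# "sleep 1" is a stand-in for "./my_big_calc"; the "&" puts it in the
# background so that the script can continue while it runs.
sleep 1 &

# $! holds the process id of the most recently started background command.
main_pid=$!

# Poll until the background process has ended. "kill -0" sends no signal;
# it only tests whether the process still exists.
polls=0
while kill -0 "$main_pid" 2>/dev/null; do
  polls=$(( polls + 1 ))
  sleep 1
done

# Collect the exit status of the finished process.
wait "$main_pid"
status=$?
echo "process $main_pid ended with status $status after $polls polls"
```

Without the "&" the script would block until the program has finished, and the copy loop would never run during the calculation.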

  • Single-node
# Arguments to qsub can be submitted via the script as well by starting
# the line with #PBS 
#
# Set your mail address
#PBS -M me@my.domain
#
# Mail on abort
#PBS -m a
#
# Specify time for job (here 1h 10 min 5 sec)
#PBS -l walltime=1:10:05
#
# Request 1 processor (node)
#PBS -l nodes=1:ppn=1
#
# Set the name of the job
#PBS -N Test_Job
#
# End of arguments to qsub

# Go to work submission directory (Don't remove this line)
cd $PBS_O_WORKDIR 

# Copy files to the node's local $TMPDIR
# Change "indata1.dat indata2.dat my_big_calc"
cp -p indata1.dat indata2.dat my_big_calc $TMPDIR/

# Go to node local temp-directory
cd $TMPDIR

# Run the program
# Change "my_big_calc" 
time ./my_big_calc &

# Catch process id
main_pid=$!

# Copying interval and sleeptime in seconds (here 1h 5 seconds)
# Change "3605" to suit your needs
COPYTIME=3605
let "SLEEPTIME = ${COPYTIME} / 10"

# This loop runs until the main process has ended
SECONDS=0

until [[ -z `ps --no-heading $main_pid` ]]  
  do
  # Sleeps for the specified interval
  sleep $SLEEPTIME

  # make copies if COPYTIME is reached
  if (( $SECONDS > $COPYTIME )); then
    # Change "my-file1 my-file2"
    rsync -au my-file1 my-file2 $PBS_O_WORKDIR/

    SECONDS=0
  fi;
done;

# Copy back when main process has ended
# Change "my-file1 my-file2"
rsync -au my-file1 my-file2 $PBS_O_WORKDIR/

#End of script (make sure line before this gets run)

Example script can be downloaded here.


  • Multi-node/MPI
# Arguments to qsub can be submitted via the script as well by starting
# the line with #PBS 
#
# Set your mail address
#PBS -M me@my.domain
#
# Mail on abort
#PBS -m a
#
# Specify time for job (here 1h 10 min 5 sec)
#PBS -l walltime=1:10:05
#
# Request 1 processor (node)
#PBS -l nodes=1:ppn=1
#
# Set the name of the job
#PBS -N Test_Job
#
# End of arguments to qsub

# Go to work submission directory (Don't remove this line)
cd $PBS_O_WORKDIR 

# Copy files to the master node's local $TMPDIR
# Change "indata1.dat indata2.dat my_mpi_calc"
cp -p indata1.dat indata2.dat my_mpi_calc $TMPDIR/ 

# Go to the master node's local $TMPDIR
cd $TMPDIR

# Distribute files to the other nodes' $TMPDIR
# (build a comma-separated list of unique hosts, skipping the first one)
hostlist=`uniq $TMPDIR/mpichnodes | perl -pe 's/\n/,/g' | cut -d, -f2-` 
# Change "indata1.dat indata2.dat my_mpi_calc"
dcp -T -r /usr/bin/scp -n $hostlist indata1.dat indata2.dat my_mpi_calc $TMPDIR/

# Run the program
# Change "my_mpi_calc"
time mpiexec ./my_mpi_calc & 

# Catch process id
main_pid=$!

# Copying interval and sleeptime in seconds (here 1h 5 seconds)
# Change "3605" to suit your needs
COPYTIME=3605
let "SLEEPTIME = ${COPYTIME} / 10"

# This loop runs until the main process has ended
SECONDS=0

until [[ -z `ps --no-heading $main_pid` ]] 
  do
  # Sleeps for the specified interval
  sleep $SLEEPTIME

  # make copies if COPYTIME is reached
  if (( $SECONDS > $COPYTIME )); then
    # Copy files from all nodes in job
    cat ${TMPDIR}/lamnodes | while read line; do
      node=$(echo ${line}|sed 's/mpi.*$/mpi/');
      # Change "outdata*.dat"
      rsync -au --rsh=ssh "$node:$TMPDIR/outdata*.dat" $PBS_O_WORKDIR/
    done
    
    SECONDS=0
  fi;
done;

# Copy back files from all nodes when main process has ended
cat ${TMPDIR}/lamnodes | while read line; do
  node=$(echo ${line}|sed 's/mpi.*$/mpi/');
  # Change "outdata*.dat"
  rsync -au --rsh=ssh "$node:$TMPDIR/outdata*.dat" $PBS_O_WORKDIR/
done

#End of script (make sure line before this gets run)

Example script can be downloaded here.
