Bulk data transfer to and from Alvis

Larger transfers to and from Alvis should use the alvis2 login node, which is the dedicated data transfer node on Alvis.

rclone

rclone is a tool for copying data to, from, and between cloud storage services. It ships with configuration templates for many common services, such as those offered by Microsoft, Google and Amazon. For most services, creating a configuration with authentication credentials is easiest in an interactive desktop session, which you can start from the Alvis portal. Before you start, locate the rclone documentation page for the specific service you are using. Then open an OnDemand interactive desktop session, open a terminal, and type

$ rclone config

Enter n for "new", and choose a name for your configuration (such as the name of the service). For the remaining prompts, follow the rclone documentation for your service. If the service has already provided you with an authentication token, you can enter it directly. Otherwise, in most cases rclone will prompt you to open a link in a browser and log in to the service to obtain the necessary credentials.

Once you have set up the configuration, you can check the files and directories on the cloud service using

$ rclone lsf NAME_OF_YOUR_CONFIGURATION:/
my_directory/
my_file.txt

To see the detailed contents of a specific directory, you can use

$ rclone lsl NAME_OF_YOUR_CONFIGURATION:/my_directory
1604608 2024-04-04 02:23:12.0000000 my_other_file.txt

To copy files from the service to the local file area, use

$ rclone copy NAME_OF_YOUR_CONFIGURATION:/my_directory /PATH/TO/YOUR/LOCAL/STORAGE

and vice versa (assuming you have configured rclone for both retrieving and uploading data):

$ rclone copy /PATH/TO/YOUR/LOCAL/STORAGE NAME_OF_YOUR_CONFIGURATION:/my_directory
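Before a large transfer it can be useful to preview what rclone would copy. The following is a minimal sketch, not part of the Alvis setup: the function name copy_with_rclone is hypothetical, and it only wraps the rclone copy command shown above with the --dry-run and --progress flags.

```shell
# Hypothetical helper: preview or run an rclone copy.
# Usage: copy_with_rclone preview|run SOURCE DESTINATION
copy_with_rclone() {
    local mode="$1" src="$2" dst="$3"
    case "$mode" in
        # --dry-run reports what would be transferred without copying anything.
        preview) rclone copy --dry-run "$src" "$dst" ;;
        # --progress prints live transfer statistics during the real copy.
        run)     rclone copy --progress "$src" "$dst" ;;
        *)       echo "usage: copy_with_rclone preview|run SOURCE DESTINATION" >&2
                 return 2 ;;
    esac
}
```

For example, copy_with_rclone preview NAME_OF_YOUR_CONFIGURATION:/my_directory /PATH/TO/YOUR/LOCAL/STORAGE lists the planned transfers, and the same call with run performs them.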

For further details, consult man rclone, the online documentation, or type rclone --help.

rrsync - restricting ssh key usage

rrsync is a utility that restricts an SSH key pair so that it only works for a certain subset of rsync operations. A typical case is that you have a folder of important data that you are worried about accidentally overwriting, along with other important folders that you don't want to overwrite either. You therefore want to restrict your usage to reading from one specific folder and writing to another.

For the purpose of this example, we will assume you have a folder on the mimer system, e.g. /mimer/NOBACKUP/groups/groupname, and in there two sub-folders, write_to and read_from. Our goal is to set up a configuration whereby there is no risk of accidentally overwriting files outside of write_to, or reading any other files than those in read_from.

To do this, start by creating two SSH keys on your local computer (the paths below are relative to your home directory):

$ ssh-keygen -f .ssh/write_key
$ ssh-keygen -f .ssh/read_key

Print the public keys and/or copy them by your preferred means to alvis1 or alvis2:

$ cat .ssh/write_key.pub
$ cat .ssh/read_key.pub

If it doesn't already exist, create the file .ssh/authorized_keys in your home directory on alvis2. Paste the two public keys in there, one per line, so that your authorized_keys looks something like

ssh-rsa AAAAA...username@local_computer
ssh-rsa AAAAA...username@local_computer
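If you copied the .pub files to alvis2 rather than pasting them, the step above can be sketched as a small helper. The function name append_pubkey and its optional second argument are hypothetical. The permissions 700 and 600 are the conventional ones for .ssh and authorized_keys; whether the server enforces them is an assumption.

```shell
# Hypothetical helper: append a public key file to authorized_keys.
# Usage: append_pubkey PUBKEY_FILE [SSH_DIR]   (SSH_DIR defaults to ~/.ssh)
append_pubkey() {
    local pubkey_file="$1" ssh_dir="${2:-$HOME/.ssh}"
    local auth="$ssh_dir/authorized_keys"
    # Create the directory and file if missing, with conventional permissions.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth" && chmod 600 "$auth" || return 1
    # Append only if the exact key line is not already present.
    if ! grep -qxFf "$pubkey_file" "$auth"; then
        cat "$pubkey_file" >> "$auth"
    fi
}
```

Running append_pubkey write_key.pub followed by append_pubkey read_key.pub would give the two-line file shown above. Running it twice with the same key does not add a duplicate line.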

You can test that these keys work by creating a regular SSH configuration on your local computer. Do the following:

$ touch .ssh/config
$ chmod 600 .ssh/config

Then edit .ssh/config with a text editor like vim, and add the following, replacing the value under User with your CID:

Host alvis2-write
    HostName 129.16.125.131
    User YOUR_CID
    IdentitiesOnly yes
    IdentityFile ~/.ssh/write_key
    PasswordAuthentication no

Host alvis2-read
    HostName 129.16.125.131
    User YOUR_CID
    IdentitiesOnly yes
    IdentityFile ~/.ssh/read_key
    PasswordAuthentication no

Test that your new configuration works by running ssh alvis2-read and ssh alvis2-write from your local computer.

Then open .ssh/authorized_keys on alvis2 and add a command option in front of each key, as follows (write key first, read key second):

command="/usr/bin/rrsync -wo /mimer/NOBACKUP/groups/groupname/write_to" ssh-rsa AAAAA...username@local_computer
command="/usr/bin/rrsync -ro /mimer/NOBACKUP/groups/groupname/read_from" ssh-rsa AAAAA...username@local_computer
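To make the format of these lines explicit, here is a minimal sketch that composes one from its parts. The function name restrict_key is hypothetical. It assumes rrsync is installed at /usr/bin/rrsync, as in the lines above, and that the folder path contains no spaces or double quotes.

```shell
# Hypothetical helper: print an authorized_keys line restricted with rrsync.
# Usage: restrict_key ro|wo DIRECTORY 'PUBLIC KEY LINE'
restrict_key() {
    local mode="$1" dir="$2" pubkey="$3"
    case "$mode" in
        ro|wo) ;;  # read-only or write-only, matching rrsync's -ro / -wo
        *) echo "mode must be ro or wo" >&2; return 2 ;;
    esac
    # The command="..." option goes before the key type on the same line.
    printf 'command="/usr/bin/rrsync -%s %s" %s\n' "$mode" "$dir" "$pubkey"
}
```

Calling restrict_key wo /mimer/NOBACKUP/groups/groupname/write_to "$(cat write_key.pub)" prints a line of the same form as the first one above.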

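The command option forces each key to run rrsync, restricted to write-only (-wo) or read-only (-ro) access to the given folder. Remote paths in rsync commands that use these keys are interpreted relative to that folder, which is why the remote path can be left empty below. You should now be able to write to /mimer/NOBACKUP/groups/groupname/write_to by running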

$ rsync -av /folder/on/local/computer alvis2-write:

and read from /mimer/NOBACKUP/groups/groupname/read_from by running

$ rsync -av alvis2-read: /folder/on/local/computer

To test that usage is restricted to these operations, you can run the reverse transfers:

$ rsync -av alvis2-write: /folder/on/local/computer
$ rsync -av /folder/on/local/computer alvis2-read:

If everything has been set up correctly, both of these operations should fail.
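The check above can be sketched as a small helper that treats a failing command as the expected outcome. The function name expect_failure is hypothetical. It only inspects the exit status of the command it is given.

```shell
# Hypothetical helper: succeed only if the given command fails.
# Usage: expect_failure COMMAND [ARGS...]
expect_failure() {
    # Run the command quietly; a zero exit status means it was NOT blocked.
    if "$@" >/dev/null 2>&1; then
        echo "UNEXPECTED SUCCESS: $*" >&2
        return 1
    fi
    echo "failed as expected: $*"
}

# Example calls, commented out because they contact alvis2:
# expect_failure rsync -av alvis2-write: /folder/on/local/computer
# expect_failure rsync -av /folder/on/local/computer alvis2-read:
```

An "UNEXPECTED SUCCESS" message means the transfer actually ran, so it is worth pointing the test at a scratch folder rather than real data.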