A command-line tool for performing basic operations on the data content (not the metadata) of Radboud Data Repository collections. In essence, it implements these operations using the WebDAV protocol; it is therefore also a generic tool for managing data accessible via WebDAV with HTTP basic authentication (e.g. SURFDrive).
The following operations are currently implemented:
- ls: list a directory
- mkdir: create a new directory
- cp: copy a file or a directory
- mv: rename a file or a directory
- rm: remove a file or a directory
- get: download a file or a directory
- put: upload a file or a directory
- mget: download multiple files or directories
- mput: upload multiple files or directories
When performing a recursive operation on a directory, the tool walks through the directory and applies the operation to individual files in parallel. This approach breaks down a lengthy bulk-operation request into multiple shorter, less resource-demanding requests, which helps improve the overall success rate of the operation.
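The walk-and-parallelise idea can be illustrated with standard Unix tools. The sketch below is not how repocli is implemented internally; walk_parallel is a hypothetical helper that walks a local directory with find and runs a given command on each file with a bounded number of concurrent workers. It assumes an xargs that supports the -0 and -P options (GNU and BSD versions do).

```shell
# walk_parallel <dir> <nworkers> <command> [args...]
# Run <command> once per regular file found under <dir>,
# with at most <nworkers> invocations running at the same time.
walk_parallel() {
    dir=$1
    nworkers=$2
    shift 2
    # -print0 / -0 keep file names with spaces or newlines intact.
    find "$dir" -type f -print0 | xargs -0 -n 1 -P "$nworkers" "$@"
}

# Example (illustrative): checksum every file under ./data, 4 at a time.
#   walk_parallel ./data 4 md5sum
```

Each file becomes its own short request, so a failure affects only that file rather than the whole bulk operation.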
The repocli tool is provided as a single binary file which can be downloaded from here.
Download the asset file repocli for Linux, repocli.darwin for Intel-based MacOSX and repocli.exe for Windows.
You can place the file in any directory as long as the directory is part of the $PATH variable (or %PATH% for Windows).
Linux and MacOSX users also need to make the downloaded file executable, e.g.
$ chmod +x repocli
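The installation steps above can be wrapped in a small script. This is a minimal sketch: install_repocli and in_path are hypothetical helpers, and the default target directory $HOME/bin is an assumption, not something the tool requires.

```shell
# install_repocli <downloaded-file> [target-dir]
# Copy the downloaded asset to <target-dir>/repocli and make it executable.
install_repocli() {
    src=$1
    dest_dir=${2:-$HOME/bin}    # assumed default; any directory in $PATH works
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/repocli" || return 1
    chmod +x "$dest_dir/repocli"
}

# in_path <dir>
# Succeed if <dir> is listed in the $PATH variable.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (illustrative):
#   install_repocli ./repocli.darwin "$HOME/bin"
#   in_path "$HOME/bin" || echo "add $HOME/bin to your PATH"
```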
A CLI for managing data content of the Donders Repository collections.
Usage:
  repocli [command]

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  config      configure the repository connection and save the credential
  cp          copy file or directory in the repository
  get         download file or directory from the repository
  help        Help about any command
  ls          list file or directory in the repository
  mget        download multiple files or directories from the repository
  mkdir       create new directory in the repository
  mput        upload multiple files or directories to the repository
  mv          move file or directory in the repository
  put         upload file or directory to the repository
  rm          remove file or directory from the repository
  shell       start an interactive shell
  version     print version number and exit

Flags:
  -c, --config path       path of the configuration YAML file. (default "/home/tg/honlee/.repocli.yml")
  -h, --help              help for repocli
  -n, --nthreads number   number of concurrent worker threads. (default 4)
  -s, --silent            set to slient mode (i.e. do not show progress)
  -u, --url URL           URL of the webdav server.
  -v, --verbose           verbose output

Use "repocli [command] --help" for more information about a command.
The configuration file
The credentials (username and password) of the data-access account should be provided in a configuration file (specified by the -c option) in the YAML format. The default location of this configuration file is ${HOME}/.repocli.yml on Linux/MacOSX and C:\Users\<username>\.repocli.yml on Windows. Since the program expects the password stored in the configuration file to be encrypted, it is best to generate (or overwrite) the file with the following command:
$ repocli config
You will be asked to provide the WebDAV baseURL, username and password. For Radboud Data Repository users, the baseURL is https://webdav.data.ru.nl. The username and password are your data-access account credentials, which can be retrieved from the RDR web portal. See the screenshot below as an example:
After providing those values, type y to save the credentials to the configuration file. Once this is done successfully, you can reuse the configuration file in the future to connect to the same WebDAV endpoint.
💡 You can use the config subcommand with the -c option to create multiple configuration files, each for a different WebDAV endpoint.
❗ The password in the configuration file is encrypted with the signatures of the file path and the username. Changes to these signatures (e.g. renaming the configuration file) will make the password invalid.
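One way to work with several endpoints is to keep one configuration file per endpoint and select it with the global -c flag. The sketch below assumes two hypothetical file names and two hypothetical wrapper functions, rdr and surf; only the -c flag and the config subcommand come from the tool itself.

```shell
# Hypothetical configuration files, one per WebDAV endpoint.
RDR_CONFIG="$HOME/.repocli-rdr.yml"
SURF_CONFIG="$HOME/.repocli-surfdrive.yml"

# Thin wrappers that pass the matching configuration file via -c.
rdr()  { repocli -c "$RDR_CONFIG" "$@"; }
surf() { repocli -c "$SURF_CONFIG" "$@"; }

# One-time interactive setup (prompts for baseURL, username and password):
#   rdr config
#   surf config
# Afterwards, for example:
#   rdr ls /dccn/DAC_3010000.01_173
```

Because the stored password is tied to the file path, the configuration files should stay where they were created.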
The shell mode
In addition to running the program's subcommands as individual shell commands (single-command mode), the CLI can also be used as an interactive shell (shell mode). Use the shell command to enter the shell mode:
$ repocli shell
The CLI-specific prompt > repocli will be displayed, as in the screenshot below, waiting for further commands from the user.
In the shell mode, the following additional operations are enabled:
- cd: change the present working directory in the repository
- pwd: show the present working directory in the repository
- lcd: change the present working directory on the local machine
- lpwd: show the present working directory on the local machine
- lls: list the content of the present working directory on the local machine
The following examples show how to use various subcommands. More detailed and up-to-date usage information is available via the help subcommand. For example, the online help of the get subcommand can be displayed with:
$ repocli help get
Given a collection with identifier di.dccn.DAC_3010000.01_173, the WebDAV directory in which the collection data is stored is /dccn/DAC_3010000.01_173. To list the content of this WebDAV directory, one does
$ repocli ls -l /dccn/DAC_3010000.01_173
/dccn/DAC_3010000.01_173:
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/Cropped
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/raw
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/test1
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/test2021
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/test3
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/test_loc.new
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/test_sync
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/testx
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/xyz.5
drwxrwxr-x 0 /dccn/DAC_3010000.01_173/xyz.x
-rw-rw-r-- 203 /dccn/DAC_3010000.01_173/MANIFEST.txt.1
-rw-rw-r-- 191503 /dccn/DAC_3010000.01_173/MD5E-s191503--8661ce04ccbbf51e96ce124e30fc0c8c.txt
-rw-rw-r-- 49152352 /dccn/DAC_3010000.01_173/MP2RAGE.nii
-rw-rw-r-- 2589 /dccn/DAC_3010000.01_173/Makefile
...
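The mapping from collection identifier to WebDAV directory can be scripted. The sketch below generalises from the single example above and assumes that an identifier always has the layout <prefix>.<centre>.<collection name>; collection_to_path is a hypothetical helper, and the rule should be checked against your own collections.

```shell
# collection_to_path <collection-identifier>
# Print the WebDAV directory for an identifier such as
# di.dccn.DAC_3010000.01_173 (assumed layout: <prefix>.<centre>.<name>).
collection_to_path() {
    id=$1
    rest=${id#*.}         # drop the leading "<prefix>."
    centre=${rest%%.*}    # text up to the next "."
    name=${rest#*.}       # the remainder, which may itself contain "."
    printf '/%s/%s\n' "$centre" "$name"
}

# Example (illustrative):
#   repocli ls -l "$(collection_to_path di.dccn.DAC_3010000.01_173)"
```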
Assuming that we want to remove the file MANIFEST.txt.1 from the collection content listed above, we do
$ repocli rm /dccn/DAC_3010000.01_173/MANIFEST.txt.1
If we want to remove the entire sub-directory testx, we use the command
$ repocli rm -r /dccn/DAC_3010000.01_173/testx
where the extra flag -r indicates recursive removal.
To create a subdirectory demo in the collection, we do
$ repocli mkdir /dccn/DAC_3010000.01_173/demo
One can also create a directory tree with the same command; any missing parent directories will also be created (similar to the mkdir -p command on Linux). For example, if we want to create a directory tree demo1/data/sub-001/ses-mri01, we do
$ repocli mkdir /dccn/DAC_3010000.01_173/demo1/data/sub-001/ses-mri01
This works whether or not the parent tree structure demo1/data/sub-001 already exists.
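Since missing parents are created automatically, a whole set of session directories can be prepared with one mkdir call per leaf. The following is a minimal sketch; make_session_dirs is a hypothetical helper, and the data/sub-XXX/ses-mri01 layout simply mirrors the example above.

```shell
# make_session_dirs <remote-root> <subject-label>...
# Create <remote-root>/data/sub-<label>/ses-mri01 for every label given.
make_session_dirs() {
    root=$1
    shift
    for sub in "$@"; do
        # Parent directories are created by repocli mkdir as needed.
        repocli mkdir "$root/data/sub-$sub/ses-mri01" || return 1
    done
}

# Example (illustrative):
#   make_session_dirs /dccn/DAC_3010000.01_173/demo1 001 002 003
```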
To upload or download a single file to or from a collection in the repository, one uses the put and get sub-commands, respectively. Both sub-commands require two arguments: the first is the source path and the second is the destination path.
The local path can be in any format recognized by the shell. For the WebDAV path, either the absolute form (i.e. starting with /) or the relative form (i.e. starting with ./ or ../) can be used. The relative form makes more sense in the shell mode (i.e. repocli shell, see above), where one can change the current WebDAV directory using the cd command. Outside the shell mode, the current WebDAV working directory is always the one defined by the configuration variable baseURL.
For example, to upload a local file test.txt in the present working directory to /dccn/DAC_3010000.01_173/demo/test.txt, one does
$ repocli put ./test.txt /dccn/DAC_3010000.01_173/demo/test.txt
To download a remote file /dccn/DAC_3010000.01_173/demo/test.txt to test.txt.new in the local home directory (referred to by the $HOME variable), one does
$ repocli get /dccn/DAC_3010000.01_173/demo/test.txt $HOME/test.txt.new
If the destination is a directory, the file will be downloaded/uploaded into that directory under the same name. If the destination is an existing file, the transfer will be skipped by default. One can use the -f option to overwrite the existing file.
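The skip-by-default and overwrite behaviours can be captured in a small wrapper. This is a sketch only: transfer is a hypothetical helper, and placing -f directly after the sub-command is an assumption, so check repocli help put and repocli help get for the exact usage.

```shell
# transfer <put|get> <source> <destination> [force]
# Run a single-file upload or download; pass "force" as the fourth
# argument to overwrite an existing destination file.
transfer() {
    op=$1
    src=$2
    dst=$3
    force=${4:-no}
    case "$op" in
        put|get) ;;
        *) echo "transfer: unsupported operation: $op" >&2; return 2 ;;
    esac
    if [ "$force" = "force" ]; then
        repocli "$op" -f "$src" "$dst"    # assumed flag position
    else
        repocli "$op" "$src" "$dst"       # existing destination is skipped
    fi
}

# Example (illustrative):
#   transfer put ./test.txt /dccn/DAC_3010000.01_173/demo/test.txt force
```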
Assume that we have a local directory /project/3010000.01/demo and we want to upload its content recursively to the collection under the sub-directory demo. We use the command below:
$ repocli put /project/3010000.01/demo/ /dccn/DAC_3010000.01_173/demo
where the first argument to put is a local directory as the source, and the second is a directory in the repository as the destination.
For downloading a directory from the repository, one does
$ repocli get /dccn/DAC_3010000.01_173/demo/ /project/3010000.01/demo.new
where the first argument is a directory in the repository as the source, and the second is a local directory as the destination.
Note: As with the rsync command, the trailing / in the source instructs the tool to copy the content into the destination. If the trailing / is left out, the directory itself is copied into the destination, so the content ends up in a (new) sub-directory of the destination.
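The trailing-slash rule can be expressed as a small path calculation. The helper below, effective_dest, is hypothetical and purely illustrative: it does not call repocli, it only prints the directory that would end up holding the files of the source according to the rule described in the note.

```shell
# effective_dest <source-dir> <destination-dir>
# Print the directory that receives the files of <source-dir>.
effective_dest() {
    src=$1
    dst=${2%/}    # ignore a trailing / on the destination
    case "$src" in
        */) printf '%s\n' "$dst" ;;   # trailing /: content goes directly into dst
        *)  printf '%s/%s\n' "$dst" "$(basename "$src")" ;;   # no /: new sub-directory
    esac
}

# Examples (illustrative):
#   effective_dest /project/3010000.01/demo/ /dccn/DAC_3010000.01_173/demo
#   effective_dest /project/3010000.01/demo  /dccn/DAC_3010000.01_173/demo
```

The first call prints the destination itself, while the second prints the destination with an extra demo sub-directory appended.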
To rename a file within a collection, one uses the mv sub-command. This sub-command also takes two arguments: the source and the destination.
For example, if we want to rename a file /dccn/DAC_3010000.01_173/test.txt to /dccn/DAC_3010000.01_173/test.txt.old in the repository, we do
$ repocli mv /dccn/DAC_3010000.01_173/test.txt /dccn/DAC_3010000.01_173/test.txt.old
We can also rename an entire directory. For example, if we want to rename /dccn/DAC_3010000.01_173/demo to /dccn/DAC_3010000.01_173/demo.new, we use the command below (note the trailing / of the source for "moving the content over"):
$ repocli mv /dccn/DAC_3010000.01_173/demo/ /dccn/DAC_3010000.01_173/demo.new
Moving the source directory into the destination directory can be achieved by leaving the trailing / off the source directory. Taking the example above, if the trailing / is omitted, e.g.
$ repocli mv /dccn/DAC_3010000.01_173/demo /dccn/DAC_3010000.01_173/demo.new
the end result will be a new directory /dccn/DAC_3010000.01_173/demo.new/demo into which the data within the source directory are moved.
When performing an operation on a large number of files, temporary (server or network) issues can cause errors on a few files. The errors are written to the terminal; in addition, one can use the -e {filename} option of repocli to save them to a text file {filename}. This text file can be used to simplify the process of patching the operation. The option is currently available for the get, put, mget and mput operations.
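A script can use the error file to decide whether a follow-up is needed. The sketch below makes two assumptions that are not stated above: that -e is given directly after the sub-command, and that the error file is absent or empty when every file succeeded. put_with_error_log is a hypothetical helper; the format of the error file is not interpreted here.

```shell
# put_with_error_log <source> <destination> <error-file>
# Upload with an error log and fail if any error was recorded.
put_with_error_log() {
    src=$1
    dst=$2
    errlog=$3
    status=0
    rm -f "$errlog"    # avoid mistaking an old log for new errors
    repocli put -e "$errlog" "$src" "$dst" || status=$?
    if [ -s "$errlog" ]; then
        echo "some transfers failed; see $errlog" >&2
        return 1
    fi
    return "$status"
}

# Example (illustrative):
#   put_with_error_log /project/3010000.01/demo/ /dccn/DAC_3010000.01_173/demo put-errors.txt
```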
Since version 0.5.0, repocli also supports retrying failed file uploads and downloads. This retry feature is disabled by default and can be enabled for the put, get, mput and mget operations with the -r N option, where N is the maximum number of retries (i.e. N+1 attempts in total).
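The "N retries means N+1 attempts" rule is easy to get wrong by one, so here is a generic illustration of it. with_retries is a hypothetical helper that shows the counting only; it is not how repocli implements its -r option, which works per file inside the tool.

```shell
# with_retries <max-retries> <command> [args...]
# Run <command>; after a failure, retry up to <max-retries> times,
# i.e. at most <max-retries> + 1 attempts in total.
with_retries() {
    max_retries=$1
    shift
    attempt=0
    while :; do
        "$@" && return 0                               # success: stop here
        [ "$attempt" -ge "$max_retries" ] && return 1  # retries used up
        attempt=$((attempt + 1))
    done
}

# The built-in equivalent for a transfer would be, for example:
#   repocli get -r 3 /dccn/DAC_3010000.01_173/demo/ /project/3010000.01/demo.new
```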
Since repocli is a standalone executable, it can be used within a shell script or by making a system call. Some examples:
- size.sh gets the total size and number of files in a remote (WebDAV) directory.
- download_n_process.sh downloads the MR data from a Donders Repository followed by processing it locally.
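As a flavour of what such a script can look like, the sketch below counts files and sums their sizes from the output of repocli ls -l. It is not the size.sh script listed above: summarize_listing is a hypothetical helper, it assumes the three-column output format shown in the listing example (mode, size, path), and it covers a single directory level only.

```shell
# summarize_listing
# Read "repocli ls -l" output on stdin and print "<files> <bytes>".
# Lines whose mode starts with "-" are counted as regular files.
summarize_listing() {
    awk '$1 ~ /^-/ { files++; bytes += $2 }
         END { printf "%d %.0f\n", files, bytes }'
}

# Example (illustrative):
#   repocli ls -l /dccn/DAC_3010000.01_173 | summarize_listing
```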