jefscheepers/vsc-python-irodsclient

VSC Python iRODS Client (VSC-PRC)

Documentation on ReadTheDocs

Welcome to the Vlaams Supercomputing Centrum (VSC) extensions to the Python iRODS Client (PRC)!

VSC-PRC's main goal is to make it easier for researchers to manage their data with iRODS, in particular on VSC's high-performance computing infrastructure.

To this end, VSC-PRC offers a Python module and associated command line scripts:

  • The vsc_irods Python module contains a VSCiRODSSession class, which extends the corresponding iRODSSession class in PRC.

    A key feature is support for wildcards ("*") and tildes ("~") when specifying iRODS data objects and collections. For example, the following code copies all files ending in '.txt' inside a 'my_irods_collection' collection in your irods_home to the local working directory:

    >>> from vsc_irods.session import VSCiRODSSession
    >>>
    >>> with VSCiRODSSession() as session:
    ...     session.bulk.get('~/my_irods_collection/*.txt', local_path='.')

    Other 'bulk' operations are available for:

    • uploading files and folders
    • removing data objects and collections
    • adding and modifying metadata
    • listing the disk usage

    More advanced search capabilities (i.e. beyond the glob patterns above) are also provided. For example, the following lists all data objects in your irods_home that end in '.txt' and have a metadata entry with Attribute='Author' and Value='Me':

    >>> with VSCiRODSSession() as session:
    ...     for item in session.search.find('~', pattern='*.txt', types='f',
    ...                                     object_avu=('Author', 'Me')):
    ...         print(item)

    The search results can be passed directly to the 'bulk' operations, e.g.:

    >>> with VSCiRODSSession() as session:
    ...     iterator = session.search.find('~', pattern='*.txt', types='f',
    ...                                    object_avu=('Author', 'Me'))
    ...     session.bulk.get(iterator, local_path='.')

  • VSC-PRC also comes with a set of scripts that make it easy to use the Python module from a Unix shell:

    • vsc-prc-find
    • vsc-prc-iget
    • vsc-prc-iput
    • vsc-prc-imkdir
    • vsc-prc-irm
    • vsc-prc-size
    • vsc-prc-imeta
    • vsc-prc-add-job-metadata

    Running a script with --help (e.g. vsc-prc-find --help) prints a description of the arguments it recognizes. The command-line equivalents of the three Python snippets above are:

    vsc-prc-iget '~/my_irods_collection/*.txt' -d .
    vsc-prc-find '~' -n '*.txt' --object_avu='Author;Me'
    vsc-prc-find '~' -n '*.txt' --object_avu='Author;Me' | xargs -I {} vsc-prc-iget {} -d .
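The patterns in the commands above are single-quoted so that the local shell does not expand '~' and '*' before the script sees them. When such a command is assembled from Python, the same quoting can be produced with the standard shlex module. The following is only a sketch: build_find_command is a hypothetical helper and not part of VSC-PRC; the script name, the -n flag and the 'Attribute;Value' form of --object_avu are taken from the example above.

```python
import shlex


def build_find_command(root, name_pattern, object_avu=None):
    """Return an argument list for vsc-prc-find (hypothetical helper)."""
    argv = ["vsc-prc-find", root, "-n", name_pattern]
    if object_avu is not None:
        attribute, value = object_avu
        # Attribute and value are joined with ';' as in the example above.
        argv.append("--object_avu={};{}".format(attribute, value))
    return argv


argv = build_find_command("~", "*.txt", object_avu=("Author", "Me"))

# Quote every argument so that a shell passes '~', '*' and ';' through
# literally instead of interpreting them.
command_line = " ".join(shlex.quote(argument) for argument in argv)
print(command_line)
```

If the command is run with subprocess.run(argv) rather than through a shell, the argument list can be passed as is and no quoting is needed.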

More examples can be found in the examples directory.
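To picture what the tilde, wildcard and metadata filters in the snippets above select, the following sketch applies the same kind of matching to a small in-memory catalogue using only the Python standard library. It does not use VSC-PRC and does not claim to reproduce its implementation: IRODS_HOME, CATALOGUE, expand_tilde and find are hypothetical names, and the assumption that the pattern is matched against the object name at any depth below the starting collection is made for illustration only.

```python
import fnmatch
import posixpath

# Hypothetical stand-in for the user's irods_home.
IRODS_HOME = "/exampleZone/home/alice"

# Hypothetical catalogue: data object path -> set of (attribute, value) pairs.
CATALOGUE = {
    "/exampleZone/home/alice/my_irods_collection/a.txt": {("Author", "Me")},
    "/exampleZone/home/alice/my_irods_collection/b.txt": {("Author", "Someone")},
    "/exampleZone/home/alice/my_irods_collection/c.dat": {("Author", "Me")},
}


def expand_tilde(path, home=IRODS_HOME):
    """Replace a leading '~' with the home collection."""
    if path == "~":
        return home
    if path.startswith("~/"):
        return posixpath.join(home, path[2:])
    return path


def find(root, pattern="*", object_avu=None, catalogue=CATALOGUE):
    """Return sorted paths below root whose name matches pattern and,
    if object_avu is given, that carry that (attribute, value) pair."""
    prefix = expand_tilde(root).rstrip("/") + "/"
    matches = []
    for path in sorted(catalogue):
        if not path.startswith(prefix):
            continue
        if not fnmatch.fnmatchcase(posixpath.basename(path), pattern):
            continue
        if object_avu is not None and tuple(object_avu) not in catalogue[path]:
            continue
        matches.append(path)
    return matches


for item in find("~", pattern="*.txt", object_avu=("Author", "Me")):
    print(item)
```

With this catalogue only a.txt is printed: b.txt has a different Author value and c.dat does not match the '*.txt' pattern.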

Dependencies

  • Python 3
  • python-irodsclient >= v0.8.4

Installation

  • Clone or download from GitHub
  • Set the $VSC_PRC_ROOT environment variable to the location of VSC-PRC's root directory
  • Add the $VSC_PRC_ROOT/lib directory to your $PYTHONPATH
  • Add the $VSC_PRC_ROOT/tools directory to your $PATH
  • Run the tests in the $VSC_PRC_ROOT/test folder
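Before running the tests, it can help to confirm that the environment variables from the steps above are in place. The following is a minimal sketch of such a check; check_vsc_prc_environment is a hypothetical helper, not something shipped with VSC-PRC, and it only inspects the variables named in the list above.

```python
import os


def check_vsc_prc_environment(environ):
    """Return a list of problems found in the given environment mapping."""
    root = environ.get("VSC_PRC_ROOT")
    if not root:
        return ["VSC_PRC_ROOT is not set"]

    # Directories the installation steps ask to be added, per variable.
    expected = {
        "PYTHONPATH": os.path.join(root, "lib"),
        "PATH": os.path.join(root, "tools"),
    }

    problems = []
    for variable, directory in expected.items():
        entries = [
            os.path.normpath(entry)
            for entry in environ.get(variable, "").split(os.pathsep)
            if entry
        ]
        if os.path.normpath(directory) not in entries:
            problems.append("{} is missing from ${}".format(directory, variable))
    return problems


# Report any problems in the current environment (prints nothing if all is well).
for problem in check_vsc_prc_environment(os.environ):
    print(problem)
```

The function takes the environment as an argument so that it can also be exercised with a plain dictionary instead of os.environ.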

On VSC's BrENIAC cluster, VSC-PRC and its dependencies are also available as a module:

module load vsc-python-irodsclient/development
