Aggregate ResourceSync Sources
- The component in this repository is intended for system administrators and developers.
- Source location: https://github.com/EHRI/rs-aggregator.
- Documentation: https://ehri.github.io/rs-aggregator/.
- In case of questions contact the EHRI team.
The Destination in a ResourceSync Framework configuration keeps zero or more sets of resources from zero or more Sources synchronized. See also: Definitions.
rs-aggregator can be used, out-of-the-box, as such a Destination that aggregates sets of resources from a list of Sources.
- Clone or download this repository to your local drive.
- Start a Docker daemon (if it is not already running), switch to the docker directory and run the start-script.
cd rs-aggregator/docker
./start.sh
If you see the rs-aggregator logo, you have just built a Java 8 capable Docker container, imported the required libraries, compiled, tested and packaged the source code, and started the rs-aggregator application. To stop the aggregator gracefully (without interrupting a synchronisation run), run the stop-script.
./stop.sh
The aggregated resources are in docker/destination, grouped by host name of the Sources. Sources are synchronized per set of resources described by a capabilityList.
Each set of resources will be in a subdirectory derived from the path that led to the corresponding capabilityList. In this base directory you will find:
- __MOR__: a directory containing the metadata: the capabilityList and its child-sitemaps.
- __SOR__: a directory containing the set of resources.
- __SYNC_PROPS__: a directory containing a report for each synchronisation run, in the form of an xml-properties file.
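The layout above can be inspected with a few lines of Java. The following is a minimal sketch, not part of rs-aggregator: the class name SyncReportReader and its methods are hypothetical, and it assumes that the reports in __SYNC_PROPS__ use the standard java.util.Properties XML format and that the directory holds nothing but reports. The property keys inside a report are not documented here, so the sketch simply prints whatever it finds.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of rs-aggregator.
public class SyncReportReader {

    /** Returns every directory under root that has a __SYNC_PROPS__ child directory. */
    public static List<Path> findBaseDirectories(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isDirectory)
                .filter(p -> p.getFileName() != null
                    && p.getFileName().toString().equals("__SYNC_PROPS__"))
                .map(Path::getParent)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    /** Loads every regular file in baseDir/__SYNC_PROPS__ as XML properties, ordered by file name. */
    public static List<Properties> loadReports(Path baseDir) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.list(baseDir.resolve("__SYNC_PROPS__"))) {
            files = paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<Properties> reports = new ArrayList<>();
        for (Path file : files) {
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(file)) {
                // Assumption: reports follow the java.util.Properties XML format.
                props.loadFromXML(in);
            }
            reports.add(props);
        }
        return reports;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "docker/destination");
        if (!Files.isDirectory(root)) {
            System.err.println("Not a directory: " + root);
            return;
        }
        for (Path baseDir : findBaseDirectories(root)) {
            List<Properties> reports = loadReports(baseDir);
            System.out.println(baseDir + ": " + reports.size() + " report(s)");
            if (!reports.isEmpty()) {
                // Print the report whose file name sorts last.
                Properties last = reports.get(reports.size() - 1);
                for (String key : new TreeSet<>(last.stringPropertyNames())) {
                    System.out.println("  " + key + " = " + last.getProperty(key));
                }
            }
        }
    }
}
```

Run without arguments, the sketch looks in docker/destination; pass another directory as the first argument to inspect a different destination.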
The configuration files are in cfg. When running the Docker container, rs-aggregator will look in docker/cfg.
cfg/uri-list.txt | docker/cfg/uri-list.txt
Each set of resources is identified by a distinct capabilityList. The file cfg/uri-list.txt should contain a list of URIs pointing to the capabilityLists you want to follow. This list is read before each synchronisation run, so changes you make to the list are hot-deployed: you do not need to restart the aggregator.
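Because the list is re-read before every run, a typo in it only shows up at the next synchronisation. The sketch below checks a list up front. It is not the aggregator's own parser: the class UriListCheck is hypothetical, and it assumes one URI per line, skips blank lines, and requires each URI to be absolute and to name a host (the aggregated resources are grouped by host name).

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of rs-aggregator.
public class UriListCheck {

    /**
     * Parses the lines of a URI list. Assumption: one URI per line.
     * Blank lines are skipped; an invalid line raises IllegalArgumentException.
     */
    public static List<URI> parse(List<String> lines) {
        List<URI> uris = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            URI uri;
            try {
                uri = new URI(line);
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException("Line " + (i + 1) + ": " + e.getMessage(), e);
            }
            if (!uri.isAbsolute() || uri.getHost() == null) {
                throw new IllegalArgumentException(
                    "Line " + (i + 1) + ": expected an absolute URI with a host: " + line);
            }
            uris.add(uri);
        }
        return uris;
    }

    /** Reads and parses a URI list file, assumed to be UTF-8 encoded. */
    public static List<URI> read(Path file) throws IOException {
        return parse(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "cfg/uri-list.txt");
        for (URI uri : read(file)) {
            System.out.println(uri.getHost() + "  " + uri);
        }
    }
}
```

Running the check against docker/cfg/uri-list.txt before saving a change gives early feedback without touching the running aggregator.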
cfg/syncapp-context.xml | docker/cfg/syncapp-context.xml
The Spring configuration file. At this moment, the notable configuration details are:
- job-scheduler
  - For the available schedulers, see the package nl.knaw.dans.rs.aggregator.schedule.
  - For the available timer options, see the description of the properties of the bean.
- resource-manager
  During synchronisation the nl.knaw.dans.rs.aggregator.sync.SyncWorker and its companions do all the heavy lifting, while the nl.knaw.dans.rs.aggregator.syncore.ResourceManager is left with relatively simple tasks. An implementation of resource-manager is dedicated to a specific storage system for the aggregated resources. At this moment only the nl.knaw.dans.rs.aggregator.sync.FsResourceManager is available, aimed at storing resources on the file system.
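
The split between synchronisation logic and storage can be illustrated with a small sketch. Everything in it is hypothetical: the interface ResourceStore, its methods and the FileSystemStore class are invented for illustration and are not the actual ResourceManager or FsResourceManager API, which should be looked up in the source code. The host/path layout is also a simplification and does not reproduce the __MOR__ and __SOR__ directories.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustration only; these types do not exist in rs-aggregator.
public class StorageSketch {

    /** Hypothetical storage abstraction: one implementation per storage system. */
    public interface ResourceStore {
        void store(URI resourceUri, InputStream content) throws IOException;
        boolean exists(URI resourceUri);
    }

    /** Hypothetical file-system implementation: baseDir/<host>/<path>. */
    public static class FileSystemStore implements ResourceStore {
        private final Path baseDir;

        public FileSystemStore(Path baseDir) {
            this.baseDir = baseDir.normalize();
        }

        /** Maps a URI to a file below baseDir and rejects paths that escape it. */
        Path locate(URI uri) {
            if (uri.getHost() == null) {
                throw new IllegalArgumentException("URI has no host: " + uri);
            }
            String path = uri.getPath() == null ? "" : uri.getPath();
            while (path.startsWith("/")) {
                path = path.substring(1); // keep the path relative to the host directory
            }
            Path target = baseDir.resolve(uri.getHost()).resolve(path).normalize();
            if (!target.startsWith(baseDir)) {
                throw new IllegalArgumentException("Path escapes base directory: " + uri);
            }
            return target;
        }

        @Override
        public void store(URI resourceUri, InputStream content) throws IOException {
            Path target = locate(resourceUri);
            Files.createDirectories(target.getParent());
            Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
        }

        @Override
        public boolean exists(URI resourceUri) {
            return Files.isRegularFile(locate(resourceUri));
        }
    }
}
```

With such a split, the code that decides what to fetch only ever talks to the interface, so supporting another storage system would mean adding one more implementation and wiring it in the Spring configuration.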