gMark

gMark is a domain- and query language-independent query workload generator, as well as a general utility library for working with the CPQ (conjunctive path query) and RPQ (regular path query) query languages. This project started as a rewrite of the original version of gMark available on GitHub at gbagan/gmark, with the goal of making gMark easier to extend and better documented. Since then, the focus of the project has shifted primarily towards query languages, notably CPQ. Graph generation is currently out of scope for this project, though full feature parity for query generation is still planned. At present, most of the features available for RPQs in the original version of gMark are available for CPQs in this version, with the exception of some output formats. The utilities for working with query languages in general, however, are much more extensive than those in the original version of gMark.

Documentation & Research

The current state of the repository is the result of several research projects. Each of these research items can be consulted for more information on a specific component in gMark.

The javadoc documentation for this repository can be found at: gmark.docs.roanh.dev

Getting started with gMark

To support a wide variety of use cases, gMark is available in a number of different formats.

Command line usage

When using gMark on the command line the following arguments are supported:

usage: gmark [-c <file>] [-f] [-g <size>] [-h] [-o <folder>] [-s <syntax>] [-w <file>]
 -c,--config <file>     The workload and graph configuration file
 -f,--force             Overwrite existing files if present
 -h,--help              Prints this help text
 -o,--output <folder>   The folder to write the generated output to
 -s,--syntax <syntax>   The concrete syntax(es) to output
 -w,--workload <file>   Triggers workload generation, a previously generated input workload can
                        optionally be provided to generate concrete syntaxes for instead

For example, a workload of queries in SQL format can be generated using:

gmark -c config.xml -o ./output -s sql -w

An example configuration XML file can be found both in this repository and in the graphical interface of the standalone executable. The example RPQ workload configuration files included in the original gMark repository are also compatible and can be found in the use-cases folder.

Executable download

gMark is available as a standalone portable executable that has both a graphical interface and a command line interface. The graphical interface will only be launched when no command line arguments are passed. This version of gMark requires Java 17 or higher to run.

All releases: releases
GitHub repository: RoanH/gMark

Command line usage of the standalone executable

The following commands show how to generate a workload of queries in SQL format using the standalone executable.

Windows executable
./gMark.exe -c config.xml -o ./output -s sql -w
Runnable Java archive
java -jar gMark.jar -c config.xml -o ./output -s sql -w
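
When gMark has to be launched from another Java program, the same invocation can be assembled as an argument list. The sketch below is only an illustration: the `GmarkCommand` class and its `build` method are hypothetical helpers that are not part of gMark, and only flags from the usage text above are used. The `-w` flag is placed last, matching the examples in this document.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of gMark) that assembles a gMark command line. */
public class GmarkCommand {

	/**
	 * Builds the argument list for running the runnable Java archive.
	 * The jar, config and output values are plain paths chosen by the caller.
	 */
	public static List<String> build(String jar, String config, String output, String syntax, boolean force) {
		List<String> cmd = new ArrayList<>(List.of("java", "-jar", jar));
		cmd.addAll(List.of("-c", config)); // workload and graph configuration file
		cmd.addAll(List.of("-o", output)); // folder to write the generated output to
		cmd.addAll(List.of("-s", syntax)); // concrete syntax to output
		if (force) {
			cmd.add("-f"); // overwrite existing files if present
		}
		cmd.add("-w"); // trigger workload generation
		return cmd;
	}

	public static void main(String[] args) {
		List<String> cmd = build("gMark.jar", "config.xml", "./output", "sql", false);
		// prints: java -jar gMark.jar -c config.xml -o ./output -s sql -w
		System.out.println(String.join(" ", cmd));
		// To actually run it: new ProcessBuilder(cmd).inheritIO().start().waitFor();
	}
}
```

The list form can be passed to `ProcessBuilder` as is, which avoids shell quoting issues for paths that contain spaces.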

Docker image

gMark is available as a Docker image on Docker Hub. The image can be obtained using the following command:

docker pull roanh/gmark:latest

Using the image then works much the same as the regular command line version of gMark. For example, we can generate the example workload of queries in SQL format using the following command:

docker run --rm -v "$PWD/data:/data" roanh/gmark:latest -c /data/config.xml -o /data/queries -s sql -w

Note that we mount a local folder called data into the container to pass our configuration file and to retrieve the generated queries.

Maven artifact

gMark is available on Maven Central as an artifact, so it can be included directly in another Java project using Gradle or Maven. This makes it possible to directly use all the implemented constructs and utilities. A hosted version of the javadoc for gMark can be found at gmark.docs.roanh.dev.

Gradle
repositories{
	mavenCentral()
}

dependencies{
	implementation 'dev.roanh.gmark:gmark:1.3'
}
Maven
<dependency>
	<groupId>dev.roanh.gmark</groupId>
	<artifactId>gmark</artifactId>
	<version>1.3</version>
</dependency>

Query Language API

Most of the query language API is accessible directly via the CPQ and RPQ classes. For example, queries can be constructed using:

Predicate a = new Predicate(0, "a");

// three alternative ways to obtain a CPQ
CPQ parsedCpq = CPQ.parse("a ∩ a");
CPQ builtCpq = CPQ.intersect(a, a);
CPQ randomCpq = CPQ.generateRandomCPQ(4, 1);

// three alternative ways to obtain an RPQ
RPQ parsedRpq = RPQ.parse("a ◦ a");
RPQ builtRpq = RPQ.disjunct(RPQ.concat(a, a), a);
RPQ randomRpq = RPQ.generateRandomRPQ(4, 1);
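
To make the `∩` (intersection) and `◦` (concatenation) operators used in the query strings above more concrete, the following self-contained sketch evaluates them over a tiny edge-labelled graph. It is not gMark code and does not use the gMark API: the `PathQuerySketch` class, its records and its methods are all hypothetical names introduced for illustration, and the set-of-pairs evaluation is just the textbook reading of these operators.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: evaluates path query operators over a tiny edge-labelled graph. */
public class PathQuerySketch {

	/** A directed labelled edge, for example 0 -a-> 1. */
	public record Edge(int source, String label, int target) {}

	/** A (source, target) pair in a query result. */
	public record Pair(int source, int target) {}

	/** All pairs connected by a single edge with the given label. */
	public static Set<Pair> label(List<Edge> graph, String label) {
		Set<Pair> result = new HashSet<>();
		for (Edge e : graph) {
			if (e.label().equals(label)) {
				result.add(new Pair(e.source(), e.target()));
			}
		}
		return result;
	}

	/** Concatenation (◦): a path matching left followed by a path matching right. */
	public static Set<Pair> concat(Set<Pair> left, Set<Pair> right) {
		Set<Pair> result = new HashSet<>();
		for (Pair l : left) {
			for (Pair r : right) {
				if (l.target() == r.source()) {
					result.add(new Pair(l.source(), r.target()));
				}
			}
		}
		return result;
	}

	/** Intersection (∩): pairs matched by both operands. */
	public static Set<Pair> intersect(Set<Pair> left, Set<Pair> right) {
		Set<Pair> result = new HashSet<>(left);
		result.retainAll(right);
		return result;
	}

	public static void main(String[] args) {
		List<Edge> graph = List.of(new Edge(0, "a", 1), new Edge(1, "a", 2), new Edge(0, "b", 2));
		Set<Pair> a = label(graph, "a");
		Set<Pair> b = label(graph, "b");
		System.out.println(concat(a, a)); // a ◦ a
		System.out.println(intersect(concat(a, a), b)); // (a ◦ a) ∩ b
	}
}
```

On this example graph both printed results contain the single pair (0, 2): vertex 2 is reachable from vertex 0 via two `a` edges, and also via one `b` edge.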

For CPQs query graphs and cores can be constructed using:

CPQ query = ...;

QueryGraphCPQ graph = query.toQueryGraph();

// the core can be computed from the query graph
QueryGraphCPQ coreFromGraph = query.toQueryGraph().computeCore();
// or directly from the query
QueryGraphCPQ coreFromQuery = query.computeCore();

Other notable utilities for CPQ and RPQ are:

CPQ query = ...;

String sql = query.toSQL();
String formal = query.toFormalSyntax();
QueryTree ast = query.toAbstractSyntaxTree();

Note that CPQ and RPQ can also be constructed from an AST, which can sometimes be used to convert between the two query languages:

RPQ rpq = RPQ.parse("a ◦ a");
CPQ cpq = CPQ.parse(rpq.toAbstractSyntaxTree());

All more general utilities can be found in the dev.roanh.gmark.util package.

Development of gMark

This repository contains an Eclipse & Gradle project with Util and Apache Commons CLI as the only dependencies. Development work can be done using the Eclipse IDE or any other Gradle compatible IDE. Continuous integration checks that all source files use Unix style line endings (LF) and that all functions and fields have valid documentation. Unit tests cover core functionality, and CI also uses these tests to check for regressions. A hosted version of the javadoc for gMark can be found at gmark.docs.roanh.dev. The runnable Java archive (JAR) release of gMark can be compiled with Gradle by running the following command in the gMark directory:

./gradlew client:shadowJar

The generated JAR can then be found in the build/libs directory. On Windows, ./gradlew.bat should be used instead of ./gradlew.

History

Project development started: 25th of September, 2021.