Benchmark #711

Merged on Jan 19, 2020 (11 commits)
61 changes: 61 additions & 0 deletions benchmark/README.md
@@ -0,0 +1,61 @@
Benchmarking JFlex
==================

This directory is a work in progress on a small performance benchmarking
suite for JFlex.

Main ideas
----------

* use [JMH][1] as the benchmarking framework. There are good [technical articles][2] on the subtleties, and accessible [short][3] and slightly [longer tutorials][4].

* main goal is to gather performance numbers on the scanning engine. There are multiple options for this:

  * micro benchmark on just the generated JFlex code + skeleton, as tightly as possible, without action code

  * macro end-to-end benchmark for a full scanner on realistic input (somehow eliminating file reading overhead etc., although it might be interesting to see whether anything we do makes a difference once IO is present)

  * anything in between these two

* run on current development snapshot

* add generated scanners from previous versions of JFlex to track development over time

* as baseline (= can do no better than this), use a method that reads a
Reader into a buffer (at least as long as the input) and touches each
character once, sequentially. This should be the minimum a matcher with a
Reader interface must do if it is supposed to consume the entire input.

* use something like java.util.regex and maybe JLex as comparison

* at some point automate and auto-publish results

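The baseline idea from the list above can be sketched outside of JMH as follows. This is a minimal illustration only, not the benchmark code itself: the class name `BaselineSketch` and the method `touchAll` are hypothetical, and the checksum stands in for the JMH `Blackhole` that a real benchmark would use to keep the loop from being optimised away. The read loop is there because a general `Reader` may return fewer characters than requested.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: read a Reader into one buffer, then touch every char once.
final class BaselineSketch {

  /** Reads up to {@code length} chars and returns the sum of their values. */
  public static long touchAll(Reader reader, int length) {
    char[] buffer = new char[length];
    int filled = 0;
    try {
      // Reader.read may deliver fewer chars than asked for, so loop until full or EOF.
      while (filled < length) {
        int n = reader.read(buffer, filled, length - filled);
        if (n < 0) {
          break;
        }
        filled += n;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Single sequential pass over the characters that were actually read.
    long checksum = 0;
    for (int i = 0; i < filled; i++) {
      checksum += buffer[i];
    }
    return checksum;
  }

  public static void main(String[] args) {
    String input = "aaabbb ";
    System.out.println(touchAll(new StringReader(input), input.length()));
  }
}
```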
We could also benchmark various aspects of the generator itself, but so far
that is lower priority.

The plan is to start with a small micro benchmark and incrementally add from
there. This should eventually include profiling to at least be informed about
what we're actually measuring.

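One possible increment is the `java.util.regex` comparison mentioned in the list above. The sketch below is an assumption about how such a comparison could look, not part of this change: the class name `RegexComparisonSketch` and the pattern `a|b+` (mirroring the `SHORT` and `LONG` macros of the benchmark spec) are chosen for illustration. Note that `java.util.regex` works on a `CharSequence` rather than a `Reader`, so a comparison benchmark would need the whole input in memory, which is not quite the same job a scanner with a `Reader` interface does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a java.util.regex counterpart to the no-action scanner.
final class RegexComparisonSketch {

  // A single "a", or a run of one or more "b"; find() skips everything else.
  private static final Pattern TOKEN = Pattern.compile("a|b+");

  /** Counts non-overlapping token matches in the input. */
  public static int countMatches(CharSequence input) {
    Matcher matcher = TOKEN.matcher(input);
    int matches = 0;
    while (matcher.find()) {
      matches++;
    }
    return matches;
  }

  public static void main(String[] args) {
    System.out.println(countMatches("aaabbbbb aaabbbbb "));
  }
}
```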
Open to ideas on any of this. Please comment on GitHub issue [#689][github-issue] if you have opinions or would like to contribute.


Building and Running
---------------------

    mvn package

will build the benchmark and

    java -jar target/benchmark-full-1.8.0-SNAPSHOT.jar

will run it.



[1]: https://openjdk.java.net/projects/code-tools/jmh/
[2]: https://www.oracle.com/technical-resources/articles/java/architect-benchmarking.html
[3]: https://www.mkyong.com/java/java-jmh-benchmark-tutorial/
[4]: http://tutorials.jenkov.com/java-performance/jmh.html

[github-issue]: https://github.com/jflex-de/jflex/issues/698
120 changes: 120 additions & 0 deletions benchmark/pom.xml
@@ -0,0 +1,120 @@
<?xml version="1.0"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>de.jflex</groupId>
    <artifactId>jflex-parent</artifactId>
    <version>1.8.0-SNAPSHOT</version>
    <relativePath>../pom.xml</relativePath>
  </parent>
  <artifactId>benchmark</artifactId>
  <name>JFlex Benchmark</name>
  <description>A small performance benchmark suite for JFlex.</description>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.google.truth</groupId>
      <artifactId>truth</artifactId>
      <version>0.36</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.openjdk.jmh</groupId>
      <artifactId>jmh-core</artifactId>
      <version>${jmh.version}</version>
    </dependency>
    <dependency>
      <groupId>org.openjdk.jmh</groupId>
      <artifactId>jmh-generator-annprocess</artifactId>
      <version>${jmh.version}</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.0</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
          <annotationProcessorPaths>
            <path>
              <groupId>org.openjdk.jmh</groupId>
              <artifactId>jmh-generator-annprocess</artifactId>
              <version>${jmh.version}</version>
            </path>
          </annotationProcessorPaths>
        </configuration>
      </plugin>
      <plugin>
        <groupId>de.jflex</groupId>
        <artifactId>jflex-maven-plugin</artifactId>
        <version>1.8.0-SNAPSHOT</version>
        <executions>
          <execution>
            <goals>
              <goal>generate</goal>
            </goals>
            <configuration>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <artifactId>maven-resources-plugin</artifactId>
        <executions>
          <execution>
            <id>copy-pregen</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>copy-resources</goal>
            </goals>
            <configuration>
              <outputDirectory>target/generated-sources/jflex/de/jflex/benchmark/pregen</outputDirectory>
              <resources>
                <resource>
                  <directory>src/main/pregen</directory>
                </resource>
              </resources>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <!-- Build an Uberjar -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <finalName>${project.artifactId}-full-${project.version}</finalName>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>org.openjdk.jmh.Main</mainClass>
                </transformer>
              </transformers>
              <artifactSet>
                <includes>
                </includes>
              </artifactSet>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <jmh.version>1.22</jmh.version>
  </properties>
</project>
95 changes: 95 additions & 0 deletions benchmark/src/main/java/jflex/benchmark/JFlexBench.java
@@ -0,0 +1,95 @@
package jflex.benchmark;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.concurrent.TimeUnit;
import jflex.benchmark.pregen.NoAction17;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

// @BenchmarkMode({Mode.AverageTime, Mode.SampleTime, Mode.SingleShotTime})
@BenchmarkMode({Mode.AverageTime, Mode.SingleShotTime})
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(value = 1)
public class JFlexBench {

  @State(Scope.Benchmark)
  public static class LexerState {
    /**
     * Factor by which to scale the input size. Benchmark time should be roughly linear in the
     * factor, i.e. each tenfold increase in the factor should take about ten times as long.
     */
    @Param({"100", "1000", "10000"})
    public int factor;

    /** Selects the input: 1 is ASCII only, 2 mixes in non-ASCII text and a surrogate pair. */
    @Param({"1", "2"})
    public int input;

    /**
     * The length of the input for the benchmark. The baseline uses this to size its buffer; the
     * JFlex scanners do not see it.
     */
    public int length;

    /** The reader the input will be read from. Must support {@code reset()}. */
    public Reader reader;

    /** Create input and populate state fields. Runs once per trial (the JMH default level). */
    @Setup
    public void setup() {
      StringBuilder builder = new StringBuilder();
      for (int i = 0; i < 10 * factor; i++) {
        switch (input) {
          case 1:
            builder.append("aaa");
            builder.append("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");
            builder.append(" ");
            break;
          case 2:
            builder.append("😎a");
            builder.append("このマニュアルについてbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");
            builder.append(" ");
            break;
          default:
            // Assertions are disabled by default, so fail loudly instead of benchmarking nothing.
            throw new IllegalArgumentException("unsupported input: " + input);
        }
      }
      length = builder.length();
      reader = new StringReader(builder.toString());
    }
  }

  @Benchmark
  public int noActionLexer(LexerState state) throws IOException {
    state.reader.reset();
    return new NoAction(state.reader).yylex();
  }

  @Benchmark
  public int noAction17Lexer(LexerState state) throws IOException {
    state.reader.reset();
    return new NoAction17(state.reader).yylex();
  }

  /**
   * The baseline: a single continuous pass accessing each character once, through a buffer filled
   * by a reader in one single reader invocation.
   */
  @Benchmark
  public void baselineReader(LexerState state, Blackhole bh) throws IOException {
    char[] buffer = new char[state.length];
    state.reader.reset();
    // A StringReader fills the whole buffer in one call; other readers may return fewer chars.
    state.reader.read(buffer, 0, buffer.length);
    for (int i = 0; i < buffer.length; i++) {
      bh.consume(buffer[i]);
    }
  }

  public static void main(String[] args) throws RunnerException {
    Options opt = new OptionsBuilder().include(JFlexBench.class.getSimpleName()).build();

    new Runner(opt).run();
  }
}
32 changes: 32 additions & 0 deletions benchmark/src/main/jflex/no-action.flex
@@ -0,0 +1,32 @@
package jflex.benchmark;

/*
A scanner with minimal action code, to measure inner matching loop
performance.
*/

%%

%public
%class NoAction

%int

%{
private int matches;
%}

SHORT = "a"
LONG = "b"+

%%

{SHORT} { matches++; }
{LONG} { matches++; }

"このマニュアルについて" { matches++; }
"😎" { matches++; }

[^] { /* nothing */ }

<<EOF>> { return matches; }
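
As a sanity check on the value returned at EOF, the rules above can be mirrored by a small hand-written counter. This is a sketch under assumptions, not generated scanner code: the class `ExpectedMatchesSketch` and its methods are hypothetical, it assumes longest-match semantics with `[^]` consuming one code point, and it shortens the run of b's since the run length does not affect the count. Under those assumptions each repetition of either benchmark input yields four matches, so with `10 * factor` repetitions `yylex()` would be expected to return `40 * factor`.

```java
// Hypothetical reference counter mirroring the rules of no-action.flex.
final class ExpectedMatchesSketch {

  private static final String PHRASE = "このマニュアルについて";
  private static final String EMOJI = "😎";

  /** Builds input shaped like the benchmark input, with a shortened run of b's. */
  public static String buildInput(int variant, int repetitions) {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < repetitions; i++) {
      if (variant == 1) {
        builder.append("aaa").append("bbbbb").append(' ');
      } else if (variant == 2) {
        builder.append(EMOJI).append('a').append(PHRASE).append("bbbbb").append(' ');
      } else {
        throw new IllegalArgumentException("unsupported variant: " + variant);
      }
    }
    return builder.toString();
  }

  /** Counts matches of SHORT, LONG and the two literals; everything else is skipped. */
  public static int countMatches(String input) {
    int matches = 0;
    int i = 0;
    while (i < input.length()) {
      char c = input.charAt(i);
      if (c == 'a') {
        // SHORT: exactly one "a" per match.
        matches++;
        i++;
      } else if (c == 'b') {
        // LONG: the whole run of b's is a single match.
        while (i < input.length() && input.charAt(i) == 'b') {
          i++;
        }
        matches++;
      } else if (input.startsWith(PHRASE, i)) {
        matches++;
        i += PHRASE.length();
      } else if (input.startsWith(EMOJI, i)) {
        // The emoji is a surrogate pair, so this advances by two chars.
        matches++;
        i += EMOJI.length();
      } else {
        // Catch-all rule: consume one code point without counting it.
        i += Character.charCount(input.codePointAt(i));
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    System.out.println(countMatches(buildInput(1, 10)));
    System.out.println(countMatches(buildInput(2, 10)));
  }
}
```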