Rewrite to add more metrics and be more modular #7

Merged 1 commit into comunica:master on Apr 30, 2024
Conversation

surilindur
Contributor

Here is a set of changes to rewrite the runner, with the following major differences:

  • The input and output file handling, as well as the result aggregation, have been split into more modular components (a rough wiring sketch follows after this list):
    • QueryLoaderFile handles the loading of queries from a file.
    • ResultAggregator handles the result aggregation, and ResultAggregatorComunica handles additional Comunica-specific metadata (currently only the HTTP request counts).
    • ResultSerializerCsv handles the serialization of results into a CSV file. The new structure allows adding JSON and other serializers as needed, in case the CSV format is insufficient (for example, for dumping an entire error object, or other items that cannot be reduced to a simple value).
    • SparqlBenchmarkRunner now only handles running the queries and recording the metrics, and calls the result aggregator (passed to it as a config option) to aggregate the results.
  • The endpoint availability check now uses a simple HEAD request, which seems to work fine. This is more lightweight for the endpoint to handle, and avoids unnecessary processing.
  • The runner now records the following additional metrics to help provide a better view of the variance between replications:
    • Minimum, maximum and average query duration.
    • Minimum, maximum and average result count.
    • Minimum, maximum and average HTTP request count (for Comunica).
    • Minimum, maximum and average result arrival timestamps.
    • A bindings hash for the results, to ensure the same query produces exactly the same results across all replications. When the result hash differs between executions, the runner marks the query as failed and records a corresponding error (unless another error has already been reported).
  • The metrics recorded by the runner can no longer be toggled off, to keep code complexity low and to ensure all the relevant metrics are available.
  • The runner now records the thrown error as the error object, and serializes the error message in the CSV file by default, to help identify exactly what went wrong when a specific query fails.
  • The runner now always waits for the endpoint to become available between executions, in case the endpoint crashes and needs some time to recover between subsequent queries. This ensures that a query does not suffer because a previous one crashed the endpoint.
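
To make the new structure more concrete, the sketch below shows one way the pieces could be wired together from JavaScript/TypeScript. The class names come from the list above, but the import path, constructor options, and method names are assumptions made for illustration rather than the actual API.

// Illustrative wiring only: the class names come from this pull request's
// description, but the option names and method signatures are assumptions.
import {
  QueryLoaderFile,
  ResultAggregatorComunica,
  ResultSerializerCsv,
  SparqlBenchmarkRunner,
} from 'sparql-benchmark-runner';

async function runBenchmark(): Promise<void> {
  // Load the queries to benchmark from files on disk.
  const queryLoader = new QueryLoaderFile({ path: 'queries/' });

  // Aggregate the per-replication metrics; the Comunica variant additionally
  // picks up the HTTP request counts exposed as metadata on the bindings stream.
  const resultAggregator = new ResultAggregatorComunica();

  // Serialize the aggregated results into a CSV file; other serializers
  // (for example JSON) could be added alongside this one.
  const resultSerializer = new ResultSerializerCsv();

  // The runner only executes the queries and records the metrics,
  // delegating aggregation to the aggregator passed in as a config option.
  const runner = new SparqlBenchmarkRunner({
    endpointUrl: 'http://localhost:3000/sparql',
    queryLoader,
    resultAggregator,
    replication: 5,
    warmup: 1,
  });

  const results = await runner.run();
  await resultSerializer.serialize('output/results.csv', results);
}

runBenchmark().catch(error => console.error(error));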

@surilindur mentioned this pull request on Apr 3, 2024
@surilindur
Contributor Author

This is based on the changes in #6 and will probably need to be updated after that one is merged.

await this.sleep(1_000);
this.log(`\rEndpoint not available yet, waited for ${++counter} seconds...`);
Member

Is there a particular reason this was removed?
We had it there to indicate to the user that the script is still running, and hasn't frozen.

Contributor Author

The only reason was that I added timestamps to the default CLI logger, and the \r was deleting them. Without the \r, there would be a lot of lines printed if the endpoint took a long while to become available. I will add back this line, though, because it does make sense to let the user know that the tool is not frozen. Thank you for the feedback!
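
For illustration, a waiting loop without the \r could look roughly like this self-contained sketch. The sleep and log helpers stand in for the runner's own ones used in the snippet above, and the HEAD request mirrors the availability check described in this pull request; none of this is the actual implementation.

// Hypothetical sketch, not the actual implementation: sleep and log stand in
// for the runner's helpers, and the HEAD request mirrors the new availability check.
async function waitForEndpoint(url: string): Promise<void> {
  const sleep = (ms: number): Promise<void> => new Promise(resolve => setTimeout(resolve, ms));
  const log = (message: string): void => console.log(`[${new Date().toISOString()}] ${message}`);
  let counter = 0;
  for (;;) {
    try {
      // A lightweight HEAD request avoids unnecessary query processing on the endpoint.
      const response = await fetch(url, { method: 'HEAD' });
      if (response.ok) {
        return;
      }
    } catch {
      // Endpoint not reachable yet; fall through, wait, and retry.
    }
    await sleep(1_000);
    // Every second prints a new timestamped line, so the timestamps stay intact
    // and the user can still see that the tool has not frozen.
    log(`Endpoint not available yet, waited for ${++counter} seconds...`);
  }
}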

Contributor Author

This logging has been added back now.

@rubensworks (Member) left a comment

Looks very neat!

Could you also update the README to reflect the changes?
We may also need a small section on how to use it from JS, as things may seem a bit less straightforward now.

Note to self: major update

@coveralls

coveralls commented Apr 25, 2024

Pull Request Test Coverage Report for Build 8893996937

  • 225 of 225 (100.0%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 100.0%

Totals coverage status:
  • Change from base Build 8847154501: 0.0%
  • Covered Lines: 241
  • Relevant Lines: 241

💛 - Coveralls

@surilindur
Contributor Author

Could you also update the README to reflect the changes?

There is now a short note in the README about using the tool as a library. I did not add a full example of the output, because it would be a lot of text, and the current examples use WatDiv whereas I have been testing with SolidBench.

@rubensworks
Member

I did not add a full example of the output, because it would be a lot of test and the current examples use WatDiv and not SolidBench, and I have been testing with SolidBench.

But is the current CSV example output still valid? Based on what I understand, we now output more metrics.
Or is the example still the default output (which is definitely ok by me, as it means my old script will still work)?

@surilindur
Contributor Author

It is valid in the sense that everything in the example is still produced, but it is true that it does not contain the additional columns that are also output. Additionally, the column ordering can differ from how it used to be: name and id are still first, but everything after them is now sorted alphabetically.

I will look into updating the example, and also adding the option to configure a delay between subsequent requests sent to the endpoint if someone has issues with too many requests being sent in rapid succession.
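
The ordering rule itself could be expressed roughly like this (a sketch of the idea only, with hypothetical column names; not the serializer's actual code):

// Illustrative sketch of the column ordering rule described above:
// keep name and id first, then sort all remaining columns alphabetically.
function orderColumns(columns: string[]): string[] {
  const fixed = ['name', 'id'];
  const rest = columns.filter(column => !fixed.includes(column)).sort();
  return [...fixed.filter(column => columns.includes(column)), ...rest];
}

// Example with hypothetical column names:
// orderColumns(['results', 'id', 'error', 'name', 'duration'])
//   -> ['name', 'id', 'duration', 'error', 'results']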

@rubensworks
Member

I will look into updating the example, and also adding the option to configure a delay between subsequent requests sent to the endpoint if someone has issues with too many requests being sent in rapid succession.

Sounds good!
The README doesn't have to contain actually produced data, though; as long as the column names and order correspond, it's fine by me.

@surilindur
Contributor Author

This should be ready for another review now!

@surilindur requested a review from rubensworks on April 30, 2024 at 11:33
@rubensworks merged commit 8fd9b18 into comunica:master on Apr 30, 2024
6 checks passed
@rubensworks
Member

Thanks! Released as 3.0.0.

@surilindur deleted the feat/modular-rewrite branch on April 30, 2024 at 12:04
@rubensworks
Member

@surilindur I just had another look at the current README, and it looks like there's no httpRequests column anymore in the example.
I guess it's still measured, but not reported by default?
Could you look into documenting in the README how to enable that?

@surilindur
Contributor Author

I guess it's still measured, but not reported by default?

It is actually both measured and reported by default. The metadata is always extracted from the bindings stream (regardless of the result aggregator), and when the Comunica-specific result aggregator is used (which differs from the base aggregator only in that it does the httpRequest aggregation), it will also be reported (httpRequests, httpRequestsMax, httpRequestsMin). The default aggregator is the Comunica one.

I left out the columns in the example because they are only reported for the Comunica link traversal SPARQL endpoint that provides the HTTP request count as metadata, but the tool could also be used for other endpoints that do not provide it (or that provide other metadata which would require their own aggregators extending the base one). So the example only contains what can be expected from any endpoint.

If it is confusing, I can edit the example output in the readme. I did not actually think about it beyond the reasoning above. 🙂
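
As an illustration of the extension point being discussed, an aggregator for endpoint-specific metadata could look roughly like the sketch below. Since the actual classes are not shown in this thread, the sketch uses minimal stand-ins: the base class approximates ResultAggregator, and the subclass shows the pattern that ResultAggregatorComunica presumably follows for the httpRequests columns; an aggregator for another endpoint's metadata would extend the base class the same way. The record fields and method names are assumptions for illustration only.

// Self-contained sketch: this ResultAggregator is a minimal stand-in for the
// base class introduced by the rewrite, not the real implementation, and the
// execution record fields and method names are assumptions.
interface IQueryExecution {
  duration: number;
  resultCount: number;
  metadata?: Record<string, number>;
}

class ResultAggregator {
  public aggregate(executions: IQueryExecution[]): Record<string, number> {
    const durations = executions.map(execution => execution.duration);
    return {
      duration: durations.reduce((sum, value) => sum + value, 0) / durations.length,
      durationMin: Math.min(...durations),
      durationMax: Math.max(...durations),
    };
  }
}

// Follows the pattern the Comunica aggregator is described to use for the
// httpRequests columns; a custom aggregator for other metadata keys would
// extend the base class in the same way.
class ResultAggregatorHttpRequests extends ResultAggregator {
  public aggregate(executions: IQueryExecution[]): Record<string, number> {
    const aggregated = super.aggregate(executions);
    const requests = executions.map(execution => execution.metadata?.httpRequests ?? 0);
    aggregated.httpRequests = requests.reduce((sum, value) => sum + value, 0) / requests.length;
    aggregated.httpRequestsMin = Math.min(...requests);
    aggregated.httpRequestsMax = Math.max(...requests);
    return aggregated;
  }
}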

@surilindur
Contributor Author

Maybe I should change the default aggregator to be the base one, and add a note in the readme on using different aggregators and creating custom ones? Would that be reasonable?

@rubensworks
Member

I left out the columns in the example because they are only reported for the Comunica link traversal SPARQL endpoint that provides the HTTP request count as metadata

I thought Comunica base always reports request count as well? (e.g. for TPF queries) But could be wrong.

Maybe I should change the default aggregator to be the base one, and add a note in the readme on using different aggregators and creating custom ones? Would that be reasonable?

Since we are the primary ones who will be using this tool, I think the comunica one as default is fine (including the requests columns in the README). But some docs on how to use a different one definitely make sense!

@surilindur
Contributor Author

I thought Comunica base always reports request count as well? (e.g. for TPF queries) But could be wrong.

Oh right, I forgot TPF! But for example Virtuoso or Jena probably do not provide the HTTP request count metadata, or Comunica when used with an HDT file. That was my reasoning. 😄

But some docs on how to use a different one definitely make sense!

I will look into adding some notes in the usage section, then!
