
RFC: Add data-file output option #305

Closed
bbros-dev opened this issue Jul 5, 2021 · 6 comments

@bbros-dev
Contributor

bbros-dev commented Jul 5, 2021

Prompted by issue #303 and PR #304

Add a data-file output option (under Metrics)

Metrics:
.....
-D, --data-file NAME        Create a CSV-formatted data file of response times (nanoseconds)
    --data-fields f1,...fn  Comma-delimited list of fields.  Values: elapsed, co_elapsed, raw, co_adjusted
    --append                Append data to the data file NAME if it exists.  Create it if it does not.

This produces a data file of response times suitable for analysis such as that described in PR #304.

Example (data comes from README mitigation section):

datetime, sample, user, code, elapsed, co_elapsed, raw, co_adjusted
2021-04-05T14:30:01Z,1,2,200,185835,11965,11965,11965
2021-04-05T14:30:01Z,2,2,200,185835,11965,,8804
2021-04-05T14:30:01Z,3,2,200,185835,11965,,5643
2021-04-05T14:30:01Z,4,2,200,185835,11965,,2482
....
....
....

where all data files contain the fields:

  • datetime: ISO-8601 date-time stamp of the start of the Goose run. Used to distinguish between Goose runs; not used to calculate response times or for debugging.
  • sample: integer, row counter
  • user: The Goose user ID (integer)
  • code: The HTTP response code

Data files include additional fields, passed as a comma-delimited list via the --data-fields option:

  • raw: integer, raw response time (nanoseconds)
  • co_adjusted: integer, CO-adjusted response time (nanoseconds)
  • elapsed
  • co_elapsed
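
For illustration, an invocation combining the proposed options might look like the sketch below. None of these flags exist yet (they are the ones proposed above), and the host is a placeholder; since a Goose loadtest is a Rust binary, the options would be passed through cargo:

# hypothetical invocation of the proposed options from a loadtest binary built with Goose
cargo run --release -- --host http://example.com -D responses.csv --data-fields raw,co_adjusted --append
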
@jeremyandrews
Member

Please see the --requests-file, which is intended to provide request-related logs. It can be written in multiple formats, including csv, json, and raw (i.e., Rust's debug display of a struct). When I only want a subset of the information, I generally create a json log and then use jq to grab the specific columns I'm interested in.
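
For example, assuming the request log is written in JSON format with one entry per line, a subset of columns can be pulled into CSV with a jq one-liner along these lines; the field names used here (elapsed and coordinated_omission_elapsed) are assumptions about the log's schema:

# extract two columns from a JSON-lines request log into CSV (field names assumed)
jq -r '[.elapsed, .coordinated_omission_elapsed] | @csv' requests.log > subset.csv
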

A datetime field would need to be added, PR welcome.

We don't currently count the rows; I'm not sure what it adds or why it's necessary. As there are lots of threads making requests at the same time, we'd have to "fake" it with a counter.

Finally, the individual response times are not adjusted, so the last two rows aren't an option (and this is the same in HdrHistogram). If you review the mitigation section of the README, you can see an example of what shows up in the Requests log. Specifically, if the coordinated_omission_elapsed column is non-zero, this metric was generated through coordinated omission mitigation.

@jeremyandrews
Member

BTW: It's also worth noting that I'm reworking the logger logic in this issue which should get merged soon:
#302

@bbros-dev
Contributor Author

Please see the --requests-file

Yes, the requests file is a reasonable workaround.
The data file is targeted at analysis use cases, for questions such as 'Is this latency sample distribution different from a prior run?'

We don't currently count the rows, I'm not sure what it adds

It adds two things:

  • It aids when truncating k samples at a common row count. I've added an --append option, which will help produce a single (stacked) data file for analysis (a sketch of this follows the list below).
  • It's a ready-made index that, combined with the date-time, should be unique in most cases.
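
A minimal sketch of that stacking and truncation with standard tools, using hypothetical per-run files run1.csv, run2.csv, run3.csv; this is the manual workflow the --append option and the sample counter are meant to simplify:

# keep the header once, then the first 1000 data rows of each run, stacked into one file
head -n 1 run1.csv > stacked.csv
for f in run1.csv run2.csv run3.csv; do
  tail -n +2 "$f" | head -n 1000 >> stacked.csv
done
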

If you review the mitigation section of the README...

Apologies for seeming obtuse, but I found it very confusing. I'll add comments in the relevant PR.

@jeremyandrews
Member

I'd consider a PR that makes it configurable which fields from a given entity are logged, but for now I think it's enough to just be sure that all necessary fields are included. I don't want to get too bogged down in this just yet when tools already exist that can easily map the --requests-file to what you need.

Due to the way GooseUsers each run in their own thread, I don't think you're going to find sufficient consistency simply by numbering all requests in the order made: this will inherently get shuffled a little each time you run (not due to Goose, which works very hard to ensure each test is sequenced the same, but due to whatever Goose is interacting with and the network in between).

Instead, perhaps it would work better to filter by a given GooseUser (they each have an integer ID, starting at 0 and incrementing by 1) and then to compare the elapsed times for requests made by that thread. You could potentially include a per-GooseUser counter to aid in comparisons.
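
As a sketch of that per-user filtering against a JSON-lines request log, assuming the log records the GooseUser ID in a user field and the response time in an elapsed field (both names are assumptions):

# response times for GooseUser 0 only, in the order the requests were logged
jq -r 'select(.user == 0) | .elapsed' requests.log
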

Apologies for seeming obtuse, but I found it very confusing. I'll add comments in the relevant PR.

Not at all: it's new documentation and a work in progress. PRs to improve readability are welcome.

@bbros-dev
Contributor Author

bbros-dev commented Jul 7, 2021

I don't want to get too bogged down in this just yet when tools already exist that can easily map the --requests-file to what you need.

Agreed, the requests file is an existing workaround. This is aimed at making it convenient to generate data ready for analysis.

I've amended the proposal to reflect this. One thought was to alter the RFC to be a generic solution:

--analysis-file NAME          Create a CSV-formatted data file for analysis
--analysis-fields F1,F2,..FN  Comma-delimited field names to output to the analysis file.  Field names: <TBD>

I don't think you're going to find sufficient consistency simply numbering all requests in the order made

Not a problem. Generally people seem interested in the distributional properties, e.g. the tail, 99th percentile, etc. For distributional purposes the order in which requests are made is not relevant, hence the absence of a request time stamp; the datetime field is a 'run' identifier.
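
For instance, with a data file laid out as in the example earlier in this issue (co_adjusted is the 8th column), an approximate 99th percentile could be read off with standard tools (the file name is illustrative):

# approximate 99th percentile of the co_adjusted column, skipping the header row
tail -n +2 responses.csv | cut -d, -f8 | sort -n | awk '{ v[NR] = $1 } END { print v[int(NR * 0.99)] }'
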

While time-series properties are, IMO, of interest, e.g. whether slow/fast responses are clustered (tests for heteroskedasticity are well established), they are not at this stage the subject of this RFC.
That is for the reason you cite: it's not yet clear to us that Goose/Gaggle can generate that data.

Though, as we mentioned elsewhere, it is not clear to us these systems are always ergodic - perhaps that is why we generally see no sophisticated analysis of this type?

Instead, perhaps ...

Hopefully all that analysis is possible given the updated RFC.

PRs to improve readability are welcome.

OK, we'll open an RFC to work out the direction.

@bbros-dev
Contributor Author

bbros-dev commented Jul 10, 2021

We won't be able to pursue this beyond the documentation contribution which has landed.
