Introduce vlaudit-stats #9
base: master
Introduce a new tool, called vlaudit-stats, which can generate vldb access statistics from a vlserver audit log (currently, only the 'pipe' audit interface format). Typically, a daemon process is run via 'vlaudit-stats daemon', and the stats are collected via 'vlaudit-stats stats-get'.
This is a bit of a "first pass", though I think it should be functionally complete. It is missing proper end-user documentation (though I'm not sure how much of that is needed) and should probably have more comments and docstrings. The tests don't cover everything, but they should cover a decent amount of functionality.
This also, of course, requires that the vlserver actually generates the needed audit messages. Adding those to OpenAFS is in openafs gerrit 14467, so this can't be used with any existing OpenAFS release yet.
The performance of this has been a concern in the back of my mind. When using --bench with a big artificial audit log, I get roughly the following (on server-class hardware, a ProLiant of some kind):
- rh7 in a vm: 230k audit messages per second
- debian 10 bare metal: 320k audit messages per second
- debian 10 bare metal, pypy: 630k audit messages per second
On my old laptop I get more like 110k. That seems like it should be good enough, given the current performance capabilities of the actual vlserver.
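For anyone wanting to reproduce numbers like these outside the tool, a rough throughput measurement can be sketched as below. This is not the --bench implementation; the pattern, the artificial log lines, and the names `make_log` and `measure_rate` are all made up for illustration.

```python
import re
import time

# Hypothetical, simplified pattern; the real tool's patterns are stricter.
SAMPLE_PAT = re.compile(r'^\[(\d+)\] .* EVENT (\S+) CODE (\d+) ')


def make_log(n):
    """Yield n artificial audit lines (made-up content, illustration only)."""
    for i in range(n):
        yield ('[%d] Thu Jan 14 12:34:56 2021 EVENT AFS_VL_GetEnt CODE 0 '
               'NAME --UnAuth-- HOST 192.0.2.10 ' % i)


def measure_rate(lines, handle):
    """Feed every line to handle(); return (count, messages per second)."""
    start = time.perf_counter()
    count = 0
    for line in lines:
        handle(line)
        count += 1
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float('inf')
    return count, rate


if __name__ == '__main__':
    count, rate = measure_rate(make_log(200000), SAMPLE_PAT.match)
    print('%d messages, about %.0f per second' % (count, rate))
```

The absolute figure depends heavily on the interpreter and on how much work the handler does per line, so it is only comparable between runs of the same handler.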
        warn("msgid went backwards: %d -> %d (audit tstamp %s -> %s)" % (
            from_id, to_id, from_ts, to_ts))

    def msgid_gap(self, n_miss, from_ts, to_ts):
I wasn't sure whether message "gaps" should even be recorded in the stats, or whether we should just log them. They should be rare, and it might be easier to just detect them by looking in the logs.
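To make the two options concrete, here is a minimal sketch of a tracker that both logs and counts gaps. It assumes message IDs normally increase by exactly one; the class name `MsgidTracker`, its attributes, and the logger name are hypothetical and not taken from the PR.

```python
import logging

log = logging.getLogger('vlaudit-sketch')  # hypothetical logger name


class MsgidTracker:
    """Track audit message IDs, assuming they normally increase by one."""

    def __init__(self):
        self.last_id = None
        self.last_ts = None
        self.n_gaps = 0     # number of distinct gaps seen
        self.n_missed = 0   # total number of messages skipped

    def see(self, msgid, ts):
        if self.last_id is not None:
            if msgid <= self.last_id:
                # ID repeated or went backwards: only log it, do not count it.
                log.warning('msgid went backwards: %d -> %d '
                            '(audit tstamp %s -> %s)',
                            self.last_id, msgid, self.last_ts, ts)
            elif msgid > self.last_id + 1:
                n_miss = msgid - self.last_id - 1
                self.n_gaps += 1
                self.n_missed += n_miss
                log.warning('missed %d messages (audit tstamp %s -> %s)',
                            n_miss, self.last_ts, ts)
        self.last_id = msgid
        self.last_ts = ts
```

Keeping the counters costs very little, so recording gaps in the stats and logging them are not mutually exclusive.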
Note that 14467 has changed the format of the relevant audit line. I'm trying to -1 this PR to flag that this needs a change, but GitHub doesn't seem to want to let me (maybe I can't -1 my own PR? ugh).
# We don't really distinguish between GetEntryByID and GetEntryByName
# requests in here. If we wanted to in the future, just check for
# reqvol.isdigit() to see if it's a numeric request.
getent_pat = re.compile(r'^\[\d+\] ... ... \d\d \d\d:\d\d:\d\d \d{4} EVENT AFS_VL_GetEnt CODE ([0-9]+) NAME .* HOST ([^ ]+) (STR|LONG) (.*) LONG ([0-9]+) STR (.*) $')
openafs gerrit 14467 has changed the relevant audit names here. I have this fixed locally, but I won't push the change here yet, in case 14467 changes any more or this PR yields any additional comments in the meantime :)
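For reference, the pattern as it appears in the diff (that is, before the naming change in 14467) can be exercised as sketched below. The sample line is made up, and the function name `parse_getent` and the field names `reqvol`, `volid`, and `volname` are guesses at what the capture groups mean, not names confirmed by the PR.

```python
import re

# Pattern copied from the diff above; it predates the 14467 format change.
getent_pat = re.compile(r'^\[\d+\] ... ... \d\d \d\d:\d\d:\d\d \d{4} EVENT AFS_VL_GetEnt CODE ([0-9]+) NAME .* HOST ([^ ]+) (STR|LONG) (.*) LONG ([0-9]+) STR (.*) $')

# Made-up audit line for illustration; note the trailing space, which the
# pattern requires.
sample = ('[42] Thu Jan 14 12:34:56 2021 EVENT AFS_VL_GetEnt CODE 0 '
          'NAME --UnAuth-- HOST 192.0.2.10 '
          'STR root.cell LONG 536870915 STR root.cell ')


def parse_getent(line):
    """Return a dict of fields for a matching line, or None otherwise."""
    m = getent_pat.match(line)
    if m is None:
        return None
    # Field names here are assumptions about what each group holds.
    code, host, _kind, reqvol, volid, volname = m.groups()
    return {
        'code': int(code),
        'host': host,
        'reqvol': reqvol,
        'by_id': reqvol.isdigit(),  # numeric request, per the code comment
        'volid': int(volid),
        'volname': volname,
    }


if __name__ == '__main__':
    print(parse_getent(sample))
```

The `by_id` field shows the `reqvol.isdigit()` check mentioned in the code comment, in case by-ID and by-name requests ever need to be counted separately.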