Introduce vlaudit-stats #9

Open
wants to merge 1 commit into
base: master


@adeason adeason commented Dec 10, 2020

Introduce a new tool, called vlaudit-stats, which can generate vldb
access statistics from a vlserver audit log (currently, only the
'pipe' audit interface format). Typically, a daemon process is run via
'vlaudit-stats daemon', and the stats are collected via 'vlaudit-stats
stats-get'.


@adeason adeason left a comment

this is a bit of a "first pass", though I think functionally it should be complete. this is missing proper end-user documentation (but I'm not sure how much that's needed?) and should probably have some more comments and docstrings. the tests don't cover everything, but they should cover a decent amount of functionality.

this also, of course, requires that the vlserver actually generate the needed audit messages. the change that adds those to openafs is openafs gerrit 14467, so this can't be used with any existing openafs release yet.

the performance of this has been a concern in the back of my mind... when using --bench with a big artificial audit log, I get about the following (on server-class hardware, a proliant of some kind):

  • rh7 in a vm: 230k audit messages per second
  • debian 10 bare metal: 320k audit messages per second
  • debian 10 bare metal, pypy: 630k audit messages per second

and on my old laptop I get more like 110k. that seems like it should be good enough given the current performance capabilities of the actual vlserver.
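For anyone who wants a rough feel for these numbers on their own machine, here is a minimal sketch of a throughput measurement. It is not the tool's actual --bench implementation: it only times how fast the GetEnt pattern quoted in this PR matches synthetic lines, and the helper names (`make_lines`, `bench`) and the sample line contents are made up for illustration.

```python
import re
import time

# Pattern copied from the diff in this PR (the pre-14467 audit line format).
GETENT_PAT = re.compile(r'^\[\d+\] ... ... \d\d \d\d:\d\d:\d\d \d{4} EVENT AFS_VL_GetEnt CODE ([0-9]+) NAME .* HOST ([^ ]+) (STR|LONG) (.*) LONG ([0-9]+) STR (.*) $')


def make_lines(n):
    """Build n synthetic audit lines (hypothetical field values)."""
    template = ("[%d] Thu Dec 10 12:34:56 2020 EVENT AFS_VL_GetEnt CODE 0 "
                "NAME exampleuser HOST 192.0.2.10 STR vol.%d LONG 536870912 "
                "STR vol.%d ")
    return [template % (i, i, i) for i in range(n)]


def bench(lines):
    """Return (number of matched lines, lines processed per second)."""
    start = time.perf_counter()
    n_matched = 0
    for line in lines:
        if GETENT_PAT.match(line) is not None:
            n_matched += 1
    elapsed = time.perf_counter() - start
    rate = len(lines) / elapsed if elapsed > 0 else float("inf")
    return n_matched, rate


if __name__ == "__main__":
    matched, rate = bench(make_lines(100000))
    print("matched %d lines, about %.0f lines per second" % (matched, rate))
```

This only measures regex matching, so it will overstate what a full stats pipeline can do; treat it as an upper bound for comparison between interpreters or machines.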

warn("msgid went backwards: %d -> %d (audit tstamp %s -> %s)" % (
    from_id, to_id, from_ts, to_ts))

def msgid_gap(self, n_miss, from_ts, to_ts):
I wasn't sure if message "gaps" should even be recorded in the stats, or if we should just log them. they should be rare, and it might be easier to just detect them by looking in the logs
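To make the trade-off concrete, here is a small sketch of one way to do both: count gaps in the stats and also log them. It assumes each audit message carries an integer id that normally increases by one; the class and attribute names (`MsgidTracker`, `n_missed`, `n_backwards`) are hypothetical and not taken from this PR.

```python
import logging

log = logging.getLogger("vlaudit-sketch")


class MsgidTracker:
    """Track message ids and count gaps and backwards jumps (hypothetical)."""

    def __init__(self):
        self.last_id = None
        self.last_ts = None
        self.n_missed = 0      # total messages that appear to be missing
        self.n_backwards = 0   # times the id did not increase

    def observe(self, msgid, ts):
        if self.last_id is not None:
            if msgid <= self.last_id:
                # The id repeated or went backwards: log it and count it.
                self.n_backwards += 1
                log.warning("msgid went backwards: %d -> %d (audit tstamp %s -> %s)",
                            self.last_id, msgid, self.last_ts, ts)
            elif msgid > self.last_id + 1:
                # One or more ids were skipped: record how many.
                n_miss = msgid - self.last_id - 1
                self.n_missed += n_miss
                log.warning("missed %d audit messages (audit tstamp %s -> %s)",
                            n_miss, self.last_ts, ts)
        self.last_id = msgid
        self.last_ts = ts


if __name__ == "__main__":
    tracker = MsgidTracker()
    for msgid, ts in [(1, "t1"), (2, "t2"), (5, "t3"), (4, "t4"), (5, "t5")]:
        tracker.observe(msgid, ts)
    print(tracker.n_missed, tracker.n_backwards)
```

Keeping the counters costs almost nothing, and having them in the stats means a monitoring system can alert on them without anyone having to grep logs.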

@adeason adeason left a comment

note that openafs gerrit 14467 has changed the format of the relevant audit line. I'm trying to -1 this PR to flag that it needs a change, but github doesn't seem to want to let me (maybe I can't -1 my own PR? ugh)

# We don't really distinguish between GetEntryByID and GetEntryByName
# requests in here. If we wanted to in the future, just check for
# reqvol.isdigit() to see if it's a numeric request.
getent_pat = re.compile(r'^\[\d+\] ... ... \d\d \d\d:\d\d:\d\d \d{4} EVENT AFS_VL_GetEnt CODE ([0-9]+) NAME .* HOST ([^ ]+) (STR|LONG) (.*) LONG ([0-9]+) STR (.*) $')
openafs gerrit 14467 has changed the relevant audit names here. I have this fixed locally, but I won't push the change here yet, in case 14467 changes any more or this PR yields any additional comments in the meantime :)
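For reviewers who want to see what the pattern in the diff captures, here is a minimal sketch that applies it to a made-up audit line. The pattern is the one quoted in this PR, so it reflects the older format rather than whatever 14467 ends up with. The sample line, the `parse_getent` helper, and the dictionary keys are assumptions for illustration; only `reqvol.isdigit()` comes from the code comment in the diff.

```python
import re

# Pattern copied from the diff in this PR (pre-14467 format).
getent_pat = re.compile(r'^\[\d+\] ... ... \d\d \d\d:\d\d:\d\d \d{4} EVENT AFS_VL_GetEnt CODE ([0-9]+) NAME .* HOST ([^ ]+) (STR|LONG) (.*) LONG ([0-9]+) STR (.*) $')

# Hypothetical sample line; note the trailing space the pattern requires.
SAMPLE_LINE = ("[42] Thu Dec 10 12:34:56 2020 EVENT AFS_VL_GetEnt CODE 0 "
               "NAME exampleuser HOST 192.0.2.10 STR root.cell "
               "LONG 536870912 STR root.cell ")


def parse_getent(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    m = getent_pat.match(line)
    if m is None:
        return None
    code, host, reqtype, reqvol, long_field, str_field = m.groups()
    return {
        "code": int(code),
        "host": host,
        "reqtype": reqtype,
        "reqvol": reqvol,
        # Per the comment in the diff: a numeric reqvol suggests a by-id request.
        "by_id": reqvol.isdigit(),
        "long_field": int(long_field),
        "str_field": str_field,
    }


if __name__ == "__main__":
    print(parse_getent(SAMPLE_LINE))
```

Lines for other events, or lines in the newer format, simply return None here, which is one reason the pattern needs updating once the audit names in 14467 settle.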
