hijack.sh: don't return error when test case is already skip #802
Empty logger in skipped test is expected. Don't return error because sof-logger is empty in skipped test. Signed-off-by: Fred Oh <[email protected]>
confused!
@@ -72,7 +72,8 @@ function func_exit_handler()

     local nlines; nlines=$(wc -l < "$logfile") # line count only
     # The first line is the sof-logger header
-    if [ "$nlines" -le 1 ]; then
+    # Don't override exit_status if already SKIPped test case
+    if ([ "$nlines" -le 1 ] && [ $exit_status -ne 2 ]); then
I really don't get the flow in this file. If an error was already identified (exit_status=1), then what is the point of checking additional things?
In this case, if exit_status is already non-zero, why add something related to an empty logger trace?
agree, flow is not clean here. exit_status is overwritten under many conditions in this hijack function. Original value of exit_status is easily forgotten.
I believe this was initiated by https://sof-ci.01.org/linuxpr/PR3136/build6666/devicetest/?model=APL_UP2_HDA&testcase=check-capture-3times
SKIP and FAIL should still have logs. If a SKIP or a FAIL has no logs then there is a logging issue and we want to know about it, see #297.
This code hasn't changed in months and has been working perfectly fine, what recent event prompts this?
I think I understand now, apologies to @fredoh9 for answering too fast. The recent event is that FW logging started failing for some reason (there have been many DMA code changes recently). The current code works as intended: a logging failure takes precedence over a SKIP.
I agree that it has been working fine as it is.
The sof-logger (unless disabled with -s) is started at the very start of the test. Every SKIP we've done so far happened after that. I think it's fine.
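To make the two policies under discussion concrete, here is a minimal sketch. It is not the actual hijack.sh code: the function names are hypothetical, and it assumes that exit status 2 means SKIP (as the condition in the diff implies) and that a logging failure is reported as 1.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only, not taken from hijack.sh.
# Assumptions: exit status 2 means SKIP, 1 means FAIL.
# nlines is the line count of the sof-logger output; the first line is
# the header, so 1 or fewer means no traces were captured.

# Current behaviour: an empty log always forces a failure,
# even when the test had already SKIPped.
status_logging_first() {
    local exit_status="$1"
    local nlines="$2"
    if [ "$nlines" -le 1 ]; then
        exit_status=1
    fi
    printf '%s\n' "$exit_status"
}

# Proposed behaviour: an empty log forces a failure
# unless the test had already SKIPped.
status_keep_skip() {
    local exit_status="$1"
    local nlines="$2"
    if [ "$nlines" -le 1 ] && [ "$exit_status" -ne 2 ]; then
        exit_status=1
    fi
    printf '%s\n' "$exit_status"
}
```

For a skipped test with an empty log, `status_logging_first 2 1` prints 1 while `status_keep_skip 2 1` prints 2, which is exactly the difference being debated here.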
BTW we still miss the last line of the DMA logs more often than not: thesofproject/sof#4333 (comment) That does not explain why these logs were completely empty though because they should have at least the content from the previous test. I think we need a new bug.
That does not even matter because at least the logs from the previous test should be found. FW logs are NOT cleared when you read them multiple times, this is like running For instance have a look at this test which was also skipped (without issue). If you look at its FW logs, you can see they all come from the end of the previous test, it's a perfect match to the microsecond (so not a coincidence).
@marc-hb wrote:
No, that's not correct. We only have "dmesg -c" behaviour. So if you consume all logs from DSP, and DSP is not powered on, there will be no traces.
In that case, the DSP is booted to update the trace filter, so you have the DSP log for the boot. It is a bit dubious to boot the DSP just for this, as the filter update is lost when the DSP is powered down.
It looks like you never tried to run the sof-logger, hit Ctrl-C and then run it again. Just try it now. You also didn't look at this test results example I just gave:
You may have that impression because the DMA logs seem (sometimes?!?) cleared when going to D3. Disable D3 and you will see there is NO -c. My really big concern right now is not what happens but the fact that no one seems to know what's supposed to happen (I don't know either).
Parallel discussion in issue #804 (an issue is probably a better place than a PR for this).
@marc-hb @ujfalusi Restarting sof-logger while DSP is on has never worked (or it works as a convenience, but it's never been a reliable way to get all the traces from DSP and most certainly there has been no guarantee that you will get some traces).
That's the expected behaviour and how it has always been, since before SOF was added to the upstream kernel a few years back. Today I submitted a sof-docs addition that adds a specific note about starting sof-logger while the DSP is running. This is by no means perfect, but this is what the driver currently does. If you or anyone wants "dmesg" type of semantics (reliably), new development in the driver and FW is needed.
The current trace module stores the traces in a ringbuffer. Let's describe it as ABCD (with A the first entry and D the last entry). When sof-logger starts, it always starts reading from A. The FW keeps writing to the ringbuffer independently, so the latest traces might have been written to A, B, C or D. It is possible the FW just filled D, so if you start sof-logger with bad enough luck, you will not get any traces although the trace buffer is actually full. The FW also restarts from A whenever it is suspended and resumed, and to ensure no data is lost, sof-logger needs to consume data all the time.
So it is a very simple design. One could implement double-buffering, and/or a new mechanism for SW/FW to negotiate where to write, and/or enable DMA tracing only when sof-logger is running, but none of these have been implemented. And it's been like this for 2 years (or more). If you keep sof-logger running during a test/use-case, you'll reliably get all the traces from the DSP (independently of whether runtime-PM is used or not).
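The ABCD description above can be illustrated with a toy model. This is only a sketch under strong assumptions, not the real driver or firmware logic: it assumes a freshly started reader begins at slot A and can read only up to the writer's current position, and the function name `readable_on_start` is made up for this example.

```bash
#!/usr/bin/env bash
# Toy model only: a 4-slot ring buffer "ABCD" with a writer that wraps
# around, and a reader that always starts at slot A.
# Assumption: a freshly started reader sees the slots from A up to
# (not including) the writer's current position.

slots="ABCD"
size=${#slots}

# $1 = total number of entries the writer has produced so far.
# Prints the slots a freshly started reader would get (possibly none).
readable_on_start() {
    written="$1"
    write_pos=$(( written % size ))
    if [ "$write_pos" -eq 0 ]; then
        # The writer is at A (nothing written yet, or it just filled D
        # and wrapped): the reader finds nothing ahead of it.
        printf '\n'
    else
        printf '%s\n' "$slots" | cut -c "1-$write_pos"
    fi
}
```

In this model `readable_on_start 2` prints `AB`, while `readable_on_start 4` prints an empty line: the writer has just filled D, the buffer is full, and a reader starting at A still gets nothing. That is the "bad enough luck" case described above.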
logging is utterly broken and hopeless, do whatever you want
That's the short description you should submit to sof-docs instead of the way too short "things may go wrong" update in thesofproject/sof-docs#381 right now. Don't try to dumb down a very simple design. The logger is a developer tool, its users are expected to understand what a ringbuffer is (and even if they don't then they can easily skip it).
This discussion unfortunately ended up scattered across multiple places, see links above. The latest news is that thesofproject/linux#3136 was merged without the part that required this test change. So I suggest that for now this test PR waits until we see which direction logging takes.
If the test case is marked as SKIP on a given platform, then it does not matter which direction the tracing goes: the case should not have been run in the first place. But if we do run it, we should at least not mark it as a failure.
As already explained above, there are several flavours of SKIP. If (for instance) the firmware crashes while trying to SKIP a particular test from within the test itself then yes, we absolutely want to report that as a FAIL. Because the firmware should never crash ever. Same for the logger: the logs may be empty or maybe not empty but the logger itself should never crash. Test SetUp, TearDown or anything in the test environment should never fail, even when the test decides to "skip itself".