Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools: sof_perf_analyzer: fix widget ID parsing error #1122

Merged
merged 1 commit into from
Nov 23, 2023

Conversation

btian1
Copy link
Contributor

@btian1 btian1 commented Oct 31, 2023

In existing code, we parse component name from the line
that contains 'Create widget', and component ID from the
next line, which normally is the ipc tx message:

Create widget host-copier.0.playback instance 0 - pipe 1 - core 0
ipc tx : 0x40000004|0x15: MOD_INIT_INSTANCE [data size: 84]

However, due to the concurrent nature of linux kernel, the
assumption that Create widget message and ipc tx message
are always adjacent is wrong. An exception will be raised
if the next line is not the 'ipc tx' message.

The first step to fix the issue is to make a kernel PR to print
all the information we need in a single 'Create widget' line,
thus to avoid the requirement for multiple lines of kernel message,
see thesofproject/linux#4709.

And next,this patch will parse the new 'Create widget' line to
get component name and ID.

@btian1 btian1 requested a review from a team as a code owner October 31, 2023 08:10
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
@btian1 btian1 force-pushed the fix_widget_id branch 2 times, most recently from 1a41d4c to 64ad709 Compare November 3, 2023 03:12
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Show resolved Hide resolved
@marc-hb
Copy link
Collaborator

marc-hb commented Nov 4, 2023

Long before we search the info in a different place, could we have a "quicker fix" where the entire script does not crash but replaces the missing ID with some "Unknown" category or something?

tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Show resolved Hide resolved
Comment on lines 274 to 275
comp_peak_max_cycles = perf_info.groupby('COMP_ID')['CPU_PEAK'].max()
perf_stats['MODULE_CPC'] = comp_peak_max_cycles
Copy link

@aiChaoSONG aiChaoSONG Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perf_stats['MODULE_CPC'] = perf_info.groupby('COMP_ID')['CPU_PEAK'].max() single line is good.

Let's make this commit a standalone PR, because

  • it has nothing to do with widget id issue
  • it can be merged quickly

commit message:

Previously, module CPC is calculated by multiplying the mean
value of cpu_peak with a margin scalar, after internal technical
discussion, change module CPC to the maximum value of module's
cpu_peak as aligned.

@aiChaoSONG
Copy link

@btian1 For the widget id issue, I am afraid we have to refine code and comments according Marc's review comments. Let's sync tomorrow.

aiChaoSONG
aiChaoSONG previously approved these changes Nov 9, 2023
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no idea what this is about

tools/sof_perf_analyzer.py Outdated Show resolved Hide resolved
@plbossart
Copy link
Member

"When there is no widget ID in kernel logs, try to find the widget ID in bind related logs."

Has anyone tried to change the kernel logs to make the problem go away?

Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in both create widget and bind message parsing, two lines are involved. we cannot assume it is atomic. because the concurrent nature of the linux kernel.
[...]
there is still probability a component in the topology is missing in our comp_name dict, but the probability is low as we parse more and more information

I just had a quick chat with @plbossart and he told me that there has been zero attempt to simplify the kernel logs... is this real?

Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's rather disappointing that after multiple asks I still get this sort of raw dump of information without clarity on the problem statement. This is starting to be a waste of everyone's time if I am honest. - @plbossart

About two weeks ago, 4 days after this PR was first submitted, I asked this:
#1122 (comment)

Which cases? And what exactly is missing in these cases and why? An entire line and if yes which one? A part of a line and if yes which part? You must be much more specific, otherwise it's impossible to review the code without knowing what it is trying to parse.

That question was marked as "resolved" without an answer.

At the same time I asked basically the same question in #1108. No answer either.

It's pointless to submit a fix to a problem that has not been analyzed, understood and explained. It's called "programming by coincidence".

  1. Why are the kernel logs incomplete, how and when?
  2. Why can't the kernel logs be fixed instead? Or at least made somewhat better.

Don't answer in this PR here, answer in the bug 1108

@btian1 btian1 marked this pull request as draft November 20, 2023 07:25
@btian1 btian1 changed the title audio: perf: fix no widget ID case in kernel logs tools: sof_perf_analyzer: fix no widget ID issue on SDW platform Nov 21, 2023
@btian1 btian1 marked this pull request as ready for review November 21, 2023 08:10
Create pipeline pipeline.3 (pipe 3) - instance 3, core 0

[ 59.622645] snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3:
Create widget host-copier.0.capture (pipe 3) - ID 4, instance 3, core 0

Copy link

@aiChaoSONG aiChaoSONG Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'''Process the dmesg to get component ID to component name mapping,
they are acquired from the line that contains 'Create widget':
    EXAMPLE_OF_CREATE_WIDGWT_LINE
By design in the kernel, pipeline ID is the instance ID of pipeline
widget, so it is acquired from the line that contains 'Create pipeline':
    EXAMPLE_OF_CREATE_PIPELINE_LINE
'''

Comment on lines 199 to 205
# with sof kernel PR: https://github.com/thesofproject/linux/pull/4709
# pipeline and module creation are using separate print in kernle log.
# So, for pipeline case, pipeline instance ID will be extracted from pipeline
# logs. For module creation, module instance ID and module ID will be
# extracted accordingly.
# ppln_id, widget_id and widget instance ID will be combined together to
# column COMP_ID as the unique ID for each module inside pipelines.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep my original comments, it explains why instance id of pipeline widget is the widget pipeline id for the widgets of the pineline

Comment on lines 210 to 211
module_instance_id = int(line_split[7][:-1])
widget_id = int(line_split[5][:-1])
Copy link

@aiChaoSONG aiChaoSONG Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain [:-1] with a simple comment.

widget_id = next_line.split('|')[0].split(':')[-1].strip()
# convert to the same ID format in mtrace
widget_id = int('0x' + widget_id[-6:], 16)
ppln_id = int(line_split[5][:-1])
Copy link

@aiChaoSONG aiChaoSONG Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explain [:-1], no one know why if you don't explain.
# remove ending comma(,) with [:-1]

Copy link

@aiChaoSONG aiChaoSONG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit title and body:

tools: sof_perf_analyzer: fix widget ID parsing error

In existing code, we parse component name from the line
that contains 'Create widget', and component ID from the
next line, which normally is the ipc tx message:

Create widget host-copier.0.playback instance 0 - pipe 1 - core 0
ipc tx      : 0x40000004|0x15: MOD_INIT_INSTANCE [data size: 84]

However, due to the concurrent nature of linux kernel, the
assumption that `Create widget` message and `ipc tx` message
are always adjacent is wrong. An exception will be raised
if the next line is not the 'ipc tx' message.

The first step to fix the issue is to make a kernel PR to print
all the information we need in a single 'Create widget' line,
thus to avoid the requirement for multiple lines of kernel message, 
see https://github.com/thesofproject/linux/pull/4709.

And next,this patch will parse the new 'Create widget' line to
get component name and ID.

Fixes: https://github.com/thesofproject/sof-test/issues/1108

Always describe the issue (what), root cause of the issue (why), and how it is fixed(how).

@btian1 btian1 changed the title tools: sof_perf_analyzer: fix no widget ID issue on SDW platform tools: sof_perf_analyzer: fix widget ID parsing error Nov 22, 2023
@btian1 btian1 requested a review from aiChaoSONG November 22, 2023 02:53
In existing code, we parse component name from the line
that contains 'Create widget', and component ID from the
next line, which normally is the ipc tx message:

Create widget host-copier.0.playback instance 0 - pipe 1 - core 0
ipc tx      : 0x40000004|0x15: MOD_INIT_INSTANCE [data size: 84]

However, due to the concurrent nature of linux kernel, the
assumption that `Create widget` message and `ipc tx` message
are always adjacent is wrong. An exception will be raised
if the next line is not the 'ipc tx' message.

The first step to fix the issue is to make a kernel PR to print
all the information we need in a single 'Create widget' line,
thus to avoid the requirement for multiple lines of kernel message,
see thesofproject/linux#4709.

And next,this patch will parse the new 'Create widget' line to
get component name and ID.

Fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
Signed-off-by: Chao Song <[email protected]>
@aiChaoSONG
Copy link

see internal test report 35019, test result is good.

@aiChaoSONG aiChaoSONG dismissed stale reviews from plbossart and marc-hb November 23, 2023 08:21

dismiss as resolved

@aiChaoSONG aiChaoSONG merged commit 6b35e10 into thesofproject:main Nov 23, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants