[BUG] arecord fails with non-zero exit status with check-pause-resume-playback-100 #9191

Closed
kv2019i opened this issue Jun 4, 2024 · 5 comments
Labels
bug (Something isn't working as expected), LNL (Applies to Lunar Lake platform)

@kv2019i
Collaborator

kv2019i commented Jun 4, 2024

Describe the bug
Seen in CI with SOF main on Intel LNL:
https://sof-ci.01.org/softestpr/PR1202/build472/devicetest/index.html?model=LNLM_RVP_HDA&testcase=check-pause-resume-capture-100

Bug description updated on 4th Sep, see comment #9191 (comment)

Looks similar to (seen on older Intel platforms):

@kv2019i kv2019i added the bug (Something isn't working as expected), P1 (Blocker bugs or important features), and LNL (Applies to Lunar Lake platform) labels Jun 4, 2024
kv2019i added a commit to kv2019i/sof that referenced this issue Jun 4, 2024
Set CONFIG_DMA_INTEL_ADSP_HDA_TIMING_L1_EXIT in board overlay
as the Zephyr platform default is not correct in the current
version of Zephyr.

Link: thesofproject#9191
Signed-off-by: Kai Vehmanen <[email protected]>
@kv2019i
Collaborator Author

kv2019i commented Jun 5, 2024

@abonislawski @serhiy-katsyuba-intel It seems that just enabling L1_EXIT is not sufficient; the record failure is still seen (see comment #9194 (comment)).

The logs don't have much, but this very much seems like a case of host DMA not moving, and arecord exiting with error as no data is received.

@marc-hb
Collaborator

marc-hb commented Jun 7, 2024

Is thesofproject/linux#5048 a duplicate? If so, maybe close this one, because 5048 has a better-looking description :-)

@lgirdwood lgirdwood added this to the v2.10 milestone Jun 10, 2024
@lgirdwood
Member

Let's keep it open for the moment, just as a placeholder on the FW side for v2.10.

@lgirdwood lgirdwood removed the P1 (Blocker bugs or important features) label Jun 13, 2024
@lgirdwood lgirdwood modified the milestones: v2.10, v2.11 Jun 13, 2024
@kv2019i kv2019i self-assigned this Sep 4, 2024
@kv2019i kv2019i changed the title from "[BUG] arecord fails with non-zero exit status" to "[BUG] arecord fails with non-zero exit status with check-pause-resume-playback-100" Sep 4, 2024
@kv2019i
Collaborator Author

kv2019i commented Sep 4, 2024

Looking at this again. Narrowing the scope of this bug to specifically cover the failure in the pause-resume case, with a recent occurrence in:
https://sof-ci.01.org/linuxpr/PR4733/build4501/devicetest/index.html?model=LNLM_SDW_AIOC&testcase=check-pause-resume-playback-100

Failure fingerprint in sof-test console:

t=14110 ms: aplay: (29/100) Found   === PAUSE ===  ,  pausing for 166 ms
t=14110 ms: aplay: ERROR: aplay: do_pause:1567: pause push error: File descriptor in bad state

The failures occur without any clear errors in either the kernel dmesg.txt or the FW mtrace.txt. There does seem to be a timing anomaly in the logs, for example:

[  758.664736] kernel: snd_sof_intel_hda_common:hda_dai_trigger: soundwire_intel soundwire_intel.link.3: cmd=0 dai SDW3 Pin2 direction 0
[  759.996415] kernel: snd_sof_intel_hda_common:hda_dsp_stream_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x1e0]=0x140000 successful
[  759.996434] kernel: snd_sof:sof_ipc4_trigger_pipelines: sof-audio-pci-intel-lnl 0000:00:1f.3: trigger cmd: 0 state: 2

This is clearly a larger pause in activity (about 1.33 s between the first two lines) than the ~100-200 ms delays that are used in the test case.
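
To look for similar stalls in other dmesg.txt captures, a small script can flag large jumps between consecutive kernel timestamps. The following is only a minimal sketch and not part of sof-test: the `find_gaps` helper and the 0.5 s threshold are hypothetical choices, and it assumes every relevant line starts with the bracketed seconds timestamp seen in the excerpt above.

```python
import re

# Matches the "[  758.664736]" timestamp prefix at the start of a kernel log line.
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_gaps(lines, threshold_s=0.5):
    """Return (gap_seconds, previous_line, current_line) for each gap above threshold_s.

    threshold_s is an assumed cut-off, chosen to sit well above the
    pause lengths used by the test case.
    """
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match is None:
            continue  # skip lines that carry no timestamp prefix
        ts = float(match.group(1))
        if prev_ts is not None and ts - prev_ts > threshold_s:
            gaps.append((ts - prev_ts, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps


# The three kernel log lines quoted in this comment.
sample = [
    "[  758.664736] kernel: snd_sof_intel_hda_common:hda_dai_trigger: "
    "soundwire_intel soundwire_intel.link.3: cmd=0 dai SDW3 Pin2 direction 0",
    "[  759.996415] kernel: snd_sof_intel_hda_common:hda_dsp_stream_trigger: "
    "sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x1e0]=0x140000 successful",
    "[  759.996434] kernel: snd_sof:sof_ipc4_trigger_pipelines: "
    "sof-audio-pci-intel-lnl 0000:00:1f.3: trigger cmd: 0 state: 2",
]

gaps = find_gaps(sample)
for gap, before, after in gaps:
    print(f"gap of {gap:.3f} s")
    print(f"  before: {before}")
    print(f"  after:  {after}")
```

On the excerpt above this flags a single gap, between the hda_dai_trigger line and the "FW Poll Status" line. For a full log, the same function can be fed the lines of a dmesg.txt file instead of `sample`.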

@kv2019i kv2019i added the P2 (Critical bugs or normal features) label Sep 4, 2024
@kv2019i kv2019i removed their assignment Sep 9, 2024
@kv2019i kv2019i modified the milestones: v2.11, v2.12 Sep 9, 2024
@lgirdwood lgirdwood removed the P2 (Critical bugs or normal features) label Sep 17, 2024
@kv2019i
Collaborator Author

kv2019i commented Dec 13, 2024

Not seen in PR/daily tests for a week, closing.

@kv2019i kv2019i closed this as completed Dec 13, 2024