
ASoC: SOF: disable dma trace in s0ix #3099

Closed


libinyang

When system enters s0ix, the dma trace won't be used. Otherwise,
the DMA will access the host memory, which will prevent entering
S0ix. Driver has notified firmware not to send message through
dma trace. Let's also trigger stop dma trace in driver side.

Signed-off-by: Libin Yang [email protected]

@plbossart
Member

plbossart commented Aug 16, 2021

@libinyang What bug/issue is this related to?

Also this patch ignores the existing code where we added the ability to enable the trace in S0ix, see sound/soc/sof/intel/hda-dsp.c

static bool hda_enable_trace_D0I3_S0;
#if IS_ENABLED(CONFIG_SND_SOC_SOF_DEBUG)
module_param_named(enable_trace_D0I3_S0, hda_enable_trace_D0I3_S0, bool, 0444);
MODULE_PARM_DESC(enable_trace_D0I3_S0,
		 "SOF HDA enable trace when the DSP is in D0I3 in S0");
#endif

		/*
		 * Trace DMA need to be disabled when the DSP enters
		 * D0I3 for S0Ix suspend, but it can be kept enabled
		 * when the DSP enters D0I3 while the system is in S0
		 * for debug purpose.
		 */
		if (!sdev->dtrace_is_supported ||
		    !hda_enable_trace_D0I3_S0 ||
		    sdev->system_suspend_target != SOF_SUSPEND_NONE)
			flags = HDA_PM_NO_DMA_TRACE;

I am not sure what to make of your proposal...
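For reference, the condition quoted above can be restated as a standalone truth function. This is only a userspace sketch: the `model_` names and the flag value are assumptions made for illustration, not the definitions used in hda-dsp.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the driver definitions; names and values are illustrative. */
enum model_suspend_target {
	MODEL_SUSPEND_NONE,	/* system stays in S0 */
	MODEL_SUSPEND_S0IX,
	MODEL_SUSPEND_S3,
};

#define MODEL_PM_NO_DMA_TRACE 0x1u	/* hypothetical flag value */

static_assert(MODEL_PM_NO_DMA_TRACE != 0,
	      "the no-trace flag must be distinguishable from 0");

/*
 * Same shape as the quoted condition: the trace DMA is kept only when
 * dtrace is supported, the debug module parameter is set, and no system
 * suspend is in progress. In every other case the no-trace flag is sent.
 */
static unsigned int model_d0i3_flags(bool dtrace_is_supported,
				     bool enable_trace_d0i3_s0,
				     enum model_suspend_target target)
{
	if (!dtrace_is_supported ||
	    !enable_trace_d0i3_s0 ||
	    target != MODEL_SUSPEND_NONE)
		return MODEL_PM_NO_DMA_TRACE;

	return 0;
}
```

With these stand-ins, the flag is clear only for (supported, parameter set, MODEL_SUSPEND_NONE); any S0ix or S3 target sets it regardless of the parameter.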

Collaborator

@kv2019i kv2019i left a comment

The logic to call trigger stop upon d0ix entry looks ok, but I think the patch needs some work before merging. See comments inline.

sound/soc/sof/pm.c (outdated)
	dev_err(sdev->dev,
		"error: snd_sof_dma_trace_trigger: stop: %d\n", ret);
else
	sdev->dtrace_is_enabled = false;
Collaborator

I think the code at L216-221 would be better moved to trace.c, so that all logic around "sdev->dtrace_is_enabled" is managed in trace.c and only simple function calls remain here.

Author

Sure, I will do it.

Author

@libinyang libinyang Sep 13, 2021

@kv2019i @ranj063 Based on your suggestion, what about putting the check and error handling code in the definition of snd_sof_dma_trace_trigger() in ops.h? That keeps the code platform-independent and avoids creating a new function.

Collaborator

@libinyang Sounds good to me. Maybe spin a version for review?

Author

@libinyang Sounds good to me. Maybe spin a version for review?

OK. Thanks.

	dev_err(sdev->dev,
		"error: snd_sof_dma_trace_trigger: start: %d\n", ret);
} else
	sdev->dtrace_is_enabled = true;
Collaborator

Same comment as below, the error print and modifying dtrace_is_enabled would be better handled in trace.c and just have a simple function call here.

sound/soc/sof/intel/hda-trace.c (outdated)
@kv2019i
Collaborator

kv2019i commented Aug 19, 2021

@plbossart wrote:

@libinyang What bug/issue is this related to?
[...]
Also this patch ignores the existing code where we added the ability to enable the trace in S0ix, see sound/soc/sof/intel/hda-dsp.c

My take: this is not about enabling trace in S0ix, but rather taking additional steps to disable trace (trigger stop on the trace DMA on host side), to ensure system really can enter S0ix. On older platforms this was not needed (hw ignored the state and went to S0ix, but that is no longer the case).

@plbossart
Member

plbossart commented Aug 19, 2021

@libinyang What bug/issue is this related to?
[...]
Also this patch ignores the existing code where we added the ability to enable the trace in S0ix, see sound/soc/sof/intel/hda-dsp.c

My take: this is not about enabling trace in S0ix, but rather taking additional steps to disable trace (trigger stop on the trace DMA on host side), to ensure system really can enter S0ix. On older platforms this was not needed (hw ignored the state and went to S0ix, but that is no longer the case).

Ah yes, sorry. The parameter is to keep the trace in S0, I read sideways.

Still, I am worried about this trigger. I also don't think this should be in generic code; it should instead be part of the sequence where we write the D0I3C_I3 register. Otherwise, there's a risk of firmware thinking it can start traces but the DMA is not enabled.

	/* Set flags and register value for D0 target substate */
	if (target_state->substate == SOF_HDA_DSP_PM_D0I3) {
		value = SOF_HDA_VS_D0I3C_I3;

		/*
		 * Trace DMA need to be disabled when the DSP enters
		 * D0I3 for S0Ix suspend, but it can be kept enabled
		 * when the DSP enters D0I3 while the system is in S0
		 * for debug purpose.
		 */
		if (!sdev->dtrace_is_supported ||
		    !hda_enable_trace_D0I3_S0 ||
		    sdev->system_suspend_target != SOF_SUSPEND_NONE)
			flags = HDA_PM_NO_DMA_TRACE;
	} else {
		/* prevent power gating in D0I0 */
		flags = HDA_PM_PPG;
	}

	/* update D0I3C register */
	ret = hda_dsp_update_d0i3c_register(sdev, value);
	if (ret < 0)
		return ret;

	/*
	 * Notify the DSP of the state change.
	 * If this IPC fails, revert the D0I3C register update in order
	 * to prevent partial state change.
	 */
	ret = hda_dsp_send_pm_gate_ipc(sdev, flags);
	if (ret < 0) {
		dev_err(sdev->dev,
			"error: PM_GATE ipc error %d\n", ret);
		goto revert;
	}

I would disable the DMA after the D0i3 transition and conversely restore it prior to the D0 transition. That way we never have a case where the firmware is blocked by the kernel.
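The ordering argument can be checked with a toy model. The sketch below is not driver code: the struct, the step names and the assumption that the PM_GATE IPC is what makes the firmware stop tracing are all hypothetical simplifications. It only tracks the property under discussion, namely that the firmware never believes it may trace while the host DMA is stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not the driver structs. */
struct model_trace {
	bool fw_may_trace;	/* firmware believes it may push trace data */
	bool host_dma_running;	/* host-side trace DMA is started */
};

enum model_step {
	MODEL_FW_TRACE_OFF,	/* stands for D0I3C update + PM_GATE with the no-trace flag */
	MODEL_HOST_DMA_STOP,
	MODEL_HOST_DMA_START,
	MODEL_FW_TRACE_ON,	/* stands for the transition back to D0I0 */
};

/* The property under discussion: firmware must never trace into a stopped DMA. */
static bool model_consistent(const struct model_trace *t)
{
	return !t->fw_may_trace || t->host_dma_running;
}

/* Apply the steps in order; report false as soon as the property breaks. */
static bool model_run(struct model_trace *t, const enum model_step *steps,
		      size_t n)
{
	size_t i;

	assert(t != NULL && steps != NULL);

	for (i = 0; i < n; i++) {
		switch (steps[i]) {
		case MODEL_FW_TRACE_OFF:
			t->fw_may_trace = false;
			break;
		case MODEL_HOST_DMA_STOP:
			t->host_dma_running = false;
			break;
		case MODEL_HOST_DMA_START:
			t->host_dma_running = true;
			break;
		case MODEL_FW_TRACE_ON:
			t->fw_may_trace = true;
			break;
		}
		if (!model_consistent(t))
			return false;
	}

	return true;
}
```

In this model, {MODEL_FW_TRACE_OFF, MODEL_HOST_DMA_STOP} on entry and {MODEL_HOST_DMA_START, MODEL_FW_TRACE_ON} on exit keep the property at every step, while either sequence reversed breaks it at the first step.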

@libinyang
Author

@plbossart @kv2019i Thanks for the review and comments. I was on vacation last week; sorry for the delayed reply.

@libinyang
Author

I would disable the DMA after the D0i3 transition and conversely restore it prior to the D0 transition. That way we never have a case where the firmware is blocked by the kernel.

@plbossart Thanks for the suggestion. Your suggested sequence seems more reasonable. I will update the patch.
BTW, here is some background for this patch:
This patch tries to fix #3034.
This bug is only found on ADL (TGL seems OK). I found that, when WoV is enabled, the system can enter S0ix only after a trigger stop on the trace DMA. I'm not sure whether this is because ADL has stricter conditions for entering S0ix. By comparison, the WoV stream is still enabled and the system can enter S0ix (I'm still debugging why the WoV DMA doesn't need a trigger stop while the trace DMA does). However, I think we do need this patch, as we should save as much power as possible, and stopping the DMA should help save power (at the very least, it's harmless).

Collaborator

@ranj063 ranj063 left a comment

I think it is better to release trace unconditionally during system suspend irrespective of the target state.

@libinyang
Author

I think it is better to release trace unconditionally during system suspend irrespective of the target state.

Yes, I did think of this solution of releasing the dma trace at first, and I'm OK with it. Releasing the dma trace is the simplest way. My concern is that re-initializing the dma trace needs an IPC and some extra register settings, which may impact resume performance. @plbossart @kv2019i What's your opinion?

@ranj063
Collaborator

ranj063 commented Aug 25, 2021

I think it is better to release trace unconditionally during system suspend irrespective of the target state.

Yes, I did think of this solution of releasing the dma trace at first, and I'm OK with it. Releasing the dma trace is the simplest way. My concern is that re-initializing the dma trace needs an IPC and some extra register settings, which may impact resume performance. @plbossart @kv2019i What's your opinion?

@plbossart @kv2019i this is my alternate proposal (not tested; @libinyang would need to confirm whether it has the same outcome as his patch)

diff --git a/sound/soc/sof/pm.c b/sound/soc/sof/pm.c
index c9e46f13b1f4..b6efe574bbd1 100644
--- a/sound/soc/sof/pm.c
+++ b/sound/soc/sof/pm.c
@@ -116,11 +116,18 @@ static int sof_resume(struct device *dev, bool runtime_resume)
 
        /*
         * Nothing further to be done for platforms that support the low power
-        * D0 substate.
+        * D0 substate other than reinitializing DMA trace
         */
        if (!runtime_resume && sof_ops(sdev)->set_power_state &&
-           old_state == SOF_DSP_PM_D0)
+           old_state == SOF_DSP_PM_D0) {
+               ret = snd_sof_init_trace_ipc(sdev);
+               if (ret < 0) {
+                       /* non fatal */
+                       dev_warn(sdev->dev, "failed to init trace after resume %d\n", ret);
+               }
+
                return 0;
+       }
 
        sdev->fw_state = SOF_FW_BOOT_PREPARE;
 
@@ -202,6 +209,9 @@ static int sof_suspend(struct device *dev, bool runtime_suspend)
                }
        }
 
+       /* release trace */
+       snd_sof_release_trace(sdev);
+
        target_state = snd_sof_dsp_power_target(sdev);
 
        /* Skip to platform-specific suspend if DSP is entering D0 */
@@ -210,9 +220,6 @@ static int sof_suspend(struct device *dev, bool runtime_suspend)
 
        sof_tear_down_pipelines(sdev, false);
 
-       /* release trace */
-       snd_sof_release_trace(sdev);
-
 #if IS_ENABLED(CONFIG_SND_SOC_SOF_DEBUG_ENABLE_DEBUGFS_CACHE)
        /* cache debugfs contents during runtime suspend */
        if (runtime_suspend)

@libinyang
Author

@plbossart @kv2019i this is my alternate proposal (not tested @libinyang would need to confirm if it has the same outcome as his patch)

This patch can't be that simple, as we still need to release the resource in FW. At least we need to send one or two extra IPCs to FW to trigger stop the DMA and release the DMA resource.

@ranj063
Collaborator

ranj063 commented Aug 26, 2021

@plbossart @kv2019i this is my alternate proposal (not tested @libinyang would need to confirm if it has the same outcome as his patch)

This patch can't be that simple, as we still need to release the resource in FW. At least we need to send one or two extra IPCs to FW to trigger stop the DMA and release the DMA resource.

@libinyang Can you please be precise? Do you mean we need 1 extra IPC or 2 extra IPCs? And which ones are those? release trace should already be sending the stop IPC. It's good enough even for D3; what else is it missing?

@libinyang
Author

libinyang commented Aug 26, 2021

@libinyang Can you please be precise? Do you mean we need 1 extra IPC or 2 extra IPCs? And which ones are those? release trace should already be sending the stop IPC. It's good enough even for D3; what else is it missing?

@ranj063 Which stop IPC do you mean? I didn't find that IPC. I can only find the PM gate IPC, which only stops the trace_work in firmware. But I think we still need to set the corresponding dma registers to the correct values after releasing the DMA. After stopping the dma, we need to send an IPC to release the resource in firmware, so that the next time we initialize the dma trace, we can re-allocate the dma channel and other resources in the firmware code.

@ranj063
Collaborator

ranj063 commented Aug 26, 2021

@libinyang Can you please be precise? Do you mean we need 1 extra IPC or 2 extra IPCs? And which ones are those? release trace should already be sending the stop IPC. It's good enough even for D3; what else is it missing?

@ranj063 Which stop IPC do you mean? I didn't find that IPC. I can only find the PM gate IPC, which only stops the trace_work in firmware. But I think we still need to set the corresponding dma registers to the correct values after releasing the DMA. After stopping the dma, we need to send an IPC to release the resource in firmware, so that the next time we initialize the dma trace, we can re-allocate the dma channel and other resources in the firmware code.

@libinyang have you tried my patch? are you still seeing the problem with it? I am confused by your questions.

snd_sof_release_trace() takes care of sending the STOP IPC already and frees the resources. Doing this unconditionally for all system suspend targets is enough IMHO. Could you please try my patch and let me know if it works or not?

@libinyang
Author

libinyang commented Aug 30, 2021

@libinyang have you tried my patch? are you still seeing the problem with it? I am confused by your questions.

snd_sof_release_trace() takes care of sending the STOP IPC already and frees the resources. Doing this unconditionally for all system suspend targets is enough IMHO. Could you please try my patch and let me know if it works or not?

@ranj063 Last Friday we had team building and I didn't have time to try and analyze. Here is the result.
With your patch, I ran the following steps:

  1. Run WoV with arecord.
  2. Enter S0ix.
  3. Wake up from S0ix. Error messages appear in both dmesg and sof-logger.
  4. Stop WoV and retry WoV with arecord; arecord fails.

Below are the dmesg and sof-logger errors.
dmesg:
[ 418.687103] coretemp coretemp.0: platform_pm_resume+0x0/0x3d returned 0 after 0 usecs
[ 418.689678] sof-audio-pci-intel-tgl 0000:00:1f.3: error: ipc error for 0x90030000 size 12
[ 418.689679] sof-audio-pci-intel-tgl 0000:00:1f.3: error: can't set params for DMA for trace -22
[ 418.689692] sof-audio-pci-intel-tgl 0000:00:1f.3: failed to init trace after resume -22
[ 418.689694] sof-audio-pci-intel-tgl 0000:00:1f.3: pci_pm_resume+0x0/0xc0 returned 0 after 3394 usecs
[ 418.799938] elan_i2c i2c-ELAN0000:00: acpi_subsys_resume+0x0/0x59 returned 0 after 110653 usecs
[ 418.800045] input input3: calling input_dev_resume+0x0/0x3e @ 5315, parent: i2c-ELAN0000:00
[ 418.800047] input input3: input_dev_resume+0x0/0x3e returned 0 after 0 usecs
[=== cut here ===]
[ 592.500200] udevd[6870]: Process '/sbin/restorecon ' failed with exit code 255.
[ 609.880796] sof-audio-pci-intel-tgl 0000:00:1f.3: error: ipc error for 0x60010000 size 20
[ 609.889955] sof-audio-pci-intel-tgl 0000:00:1f.3: error: hw params ipc failed for stream 1
[ 609.899207] sof-audio-pci-intel-tgl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -19
[ 609.911127] DMIC16kHz: ASoC: soc_pcm_hw_params() failed (-19)
[ 609.917677] DMIC16kHz: ASoC: dpcm_fe_dai_hw_params failed (-19)
[ 679.753650] init: missived main process (3860) terminated with status 254
[ 679.761357] init: missived main process ended, respawning

sof-logger:
[ 787788649.581508] ( 0.000000) c0 dma-trace src/trace/dma-trace.c:334 ERROR FW ABI 0x3012001 DBG ABI 0x5003000 tag v1.8-rc2-40-gfe728bde0f7a src hash 0x9b7b44f1 (ldc hash 0x9b7b44f1)
[ 27184453.190620] ( 27184454.000000) c0 kpb 11.55 src/audio/kpb.c:408 ERROR kpb_params(): kpb has been already configured.
[ 39800162.949734] ( 12615710.000000) c0 dma-trace src/trace/dma-trace.c:334 ERROR FW ABI 0x3012001 DBG ABI 0x5003000 tag v1.8-rc2-40-gfe728bde0f7a src hash 0x9b7b44f1 (ldc hash 0x9b7b44f1)
[ 39800672.480963] ( 509.531219) c0 hda-dma ..../intel/hda/hda-dma.c:494 ERROR hda-dmac: 4 no free channel 0
[ 39800815.918458] ( 143.437500) c0 dma-copy src/ipc/dma-copy.c:174 ERROR dma_copy_set_stream_tag(): dc->chan is NULL
[ 39801084.824697] ( 268.906250) c0 ipc src/ipc/handler-ipc3.c:746 ERROR ipc: failed to enable trace -22
[ 422106914.060312] ( 382305824.000000) c0 hda-dma ..../intel/hda/hda-dma.c:494 ERROR hda-dmac: 4 no free channel 0
[ 422106929.893644] ( 15.833333) c0 host 11.53 src/audio/host.c:769 ERROR host_params(): hd->chan is NULL
[ 422106943.799894] ( 13.906249) c0 pipe 11.60 ....../pipeline-params.c:235 ERROR pipeline_params(): ret = -19, host->comp.id = 53
[ 422106957.966560] ( 14.166666) c0 ipc src/ipc/handler-ipc3.c:284 ERROR ipc: pipe 11 comp 53 params failed -19

I also added a patch to trace the resource release:

diff --git a/src/drivers/intel/hda/hda-dma.c b/src/drivers/intel/hda/hda-dma.c
index f4633d0c7868..356e5bdcddeb 100644
--- a/src/drivers/intel/hda/hda-dma.c
+++ b/src/drivers/intel/hda/hda-dma.c
@@ -475,7 +475,7 @@ static struct dma_chan_data *hda_dma_channel_get(struct dma *dma,

        spin_lock_irq(&dma->lock, flags);

-       tr_dbg(&hdma_tr, "hda-dmac: %d channel %d -> get", dma->plat_data.id, channel);
+       //tr_err(&hdma_tr, "in hda_dma_ch_get ylb, hda-dmac: %d channel %d -> get", dma->plat_data.id, channel);

        /* use channel if it's free */
        if (dma->chan[channel].status == COMP_STATE_INIT) {
diff --git a/src/ipc/dma-copy.c b/src/ipc/dma-copy.c
index ebc54b42731f..8a582b11185d 100644
--- a/src/ipc/dma-copy.c
+++ b/src/ipc/dma-copy.c
@@ -154,6 +154,7 @@ int dma_copy_new(struct dma_copy *dc)

 #if !CONFIG_DMA_GW
        /* get DMA channel from DMAC0 */
+       tr_err(&dmacpy_tr, "in dma_copy_new ylb, get chan\n");
        dc->chan = dma_channel_get(dc->dmac, 0);
        if (!dc->chan) {
                tr_err(&dmacpy_tr, "dma_copy_new(): dc->chan is NULL");
@@ -170,6 +171,7 @@ int dma_copy_set_stream_tag(struct dma_copy *dc, uint32_t stream_tag)
 {
        /* get DMA channel from DMAC */
        dc->chan = dma_channel_get(dc->dmac, stream_tag - 1);
+       //tr_err(&dmacpy_tr, "ylb, get stream_tag: %d channel: %#x\n", (int)stream_tag, dc->chan);
        if (!dc->chan) {
                tr_err(&dmacpy_tr, "dma_copy_set_stream_tag(): dc->chan is NULL");
                return -EINVAL;
diff --git a/src/trace/dma-trace.c b/src/trace/dma-trace.c
index 86e007e6b543..9d3af36218d3 100644
--- a/src/trace/dma-trace.c
+++ b/src/trace/dma-trace.c
@@ -184,6 +184,7 @@ static int dma_trace_buffer_init(struct dma_trace_data *d)
        unsigned int flags;

        /* allocate new buffer */
+       tr_err(&dt_tr, "in dma_trace_buffer_init ylb, alloc buf\n");
        buf = rballoc(0, SOF_MEM_CAPS_RAM | SOF_MEM_CAPS_DMA,
                      DMA_TRACE_LOCAL_SIZE);
        if (!buf) {
@@ -216,6 +217,7 @@ static void dma_trace_buffer_free(struct dma_trace_data *d)

        spin_lock_irq(&d->lock, flags);

+       tr_err(&dt_tr, "in dma_trace_buffer_free ylb, release buf\n");
        rfree(buffer->addr);
        memset(buffer, 0, sizeof(*buffer));

@@ -248,6 +250,7 @@ static int dma_trace_start(struct dma_trace_data *d)
        config.dest_width = sizeof(uint32_t);
        config.cyclic = 0;

+       tr_err(&dt_tr, "in dma_trace_start ylb, alloc dma_sg\n");
        err = dma_sg_alloc(&config.elem_array, SOF_MEM_ZONE_SYS,
                           config.direction,
                           elem_num, elem_size, elem_addr, 0);

I found that the resource is not released until the next dma trace initialization, which hits an error and only then releases the resource.
The sof-logger output is below:
[ 82037581.323453] ( 0.000000) c0 dma-trace src/trace/dma-trace.c:187 ERROR in dma_trace_buffer_init ylb, alloc buf

[ 147.708327] ( 147.708328) c0 dma-trace src/trace/dma-trace.c:337 ERROR FW ABI 0x3012001 DBG ABI 0x5003000 tag v1.8-rc2-41-g263ab4ab16a4-dirty src hash 0x298ff932 (ldc hash 0x298ff932)
[ 167.812493] ( 20.104166) c0 dma-trace src/trace/dma-trace.c:253 ERROR in dma_trace_start ylb, alloc dma_sg

[ 5919185.962709] ( 5919018.000000) c0 kpb 11.55 src/audio/kpb.c:408 ERROR kpb_params(): kpb has been already configured.
[ 118183095.824659] ( 112263912.000000) c0 dma-trace src/trace/dma-trace.c:187 ERROR in dma_trace_buffer_init ylb, alloc buf

[ 118183249.001736] ( 153.177078) c0 dma-trace src/trace/dma-trace.c:337 ERROR FW ABI 0x3012001 DBG ABI 0x5003000 tag v1.8-rc2-41-g263ab4ab16a4-dirty src hash 0x298ff932 (ldc hash 0x298ff932)
[ 118183268.376736] ( 19.375000) c0 hda-dma ..../intel/hda/hda-dma.c:494 ERROR hda-dmac: 4 no free channel 0
[ 118183285.772568] ( 17.395832) c0 dma-copy src/ipc/dma-copy.c:176 ERROR dma_copy_set_stream_tag(): dc->chan is NULL
[ 118183303.428817] ( 17.656250) c0 dma-trace src/trace/dma-trace.c:220 ERROR in dma_trace_buffer_free ylb, release buf

[ 118183339.730899] ( 36.302082) c0 ipc src/ipc/handler-ipc3.c:746 ERROR ipc: failed to enable trace -22
[ 212817257.689235] ( 94633920.000000) c0 hda-dma ..../intel/hda/hda-dma.c:494 ERROR hda-dmac: 4 no free channel 0
[ 212817273.782985] ( 16.093750) c0 host 11.53 src/audio/host.c:769 ERROR host_params(): hd->chan is NULL
[ 212817287.532984] ( 13.749999) c0 pipe 11.60 ....../pipeline-params.c:235 ERROR pipeline_params(): ret = -19, host->comp.id = 53
[ 212817301.543400] ( 14.010416) c0 ipc src/ipc/handler-ipc3.c:284 ERROR ipc: pipe 11 comp 53 params failed -19

@libinyang
Author

If we still want to release the dma trace unconditionally in sof_suspend(), we should use an IPC to notify the firmware to release the resource, either with a new IPC or within the PM gate IPC. That would also require some changes in the FW code.

@kv2019i
Collaborator

kv2019i commented Sep 1, 2021

@ranj063 @libinyang Hmm, it seems the IPC interface doesn't support disabling trace. I.e. we only have a message to set up DMA trace (SOF_IPC_TRACE_DMA_PARAMS), but the only way to disable it is to power off the DSP.

So @ranj063 your patch won't work when DSP remains in D0i3.

I'd go with @libinyang your original approach of adding TRIG_START/STOP (but address the remaining comments). Ok to you @ranj063 ?

@@ -119,8 +119,15 @@ static int sof_resume(struct device *dev, bool runtime_resume)
          * D0 substate.
          */
         if (!runtime_resume && sof_ops(sdev)->set_power_state &&
-            old_state == SOF_DSP_PM_D0)
+            old_state == SOF_DSP_PM_D0) {
+                ret = snd_sof_dma_trace_trigger(sdev, SNDRV_PCM_TRIGGER_START);
Collaborator

Why not check whether sdev->dtrace_is_enabled is already true before triggering again? It could have failed during suspend, right? Then you would not need to check it again in the platform-specific trigger in hda-trace.c (as above).

@ranj063
Collaborator

ranj063 commented Sep 7, 2021

I'd go with @libinyang your original approach of adding TRIG_START/STOP (but address the remaining comments). Ok to you @ranj063 ?

ack @kv2019i, let's go with this approach, but I think it needs a minor adjustment in the placement of the check for trace_enabled

@libinyang libinyang force-pushed the stop_dtrace_s0ix branch 2 times, most recently from be01823 to cbedd8b Compare September 15, 2021 02:56
@libinyang
Author

patch updated. Change log from v1:

  1. move the error log and the dtrace_is_enabled setting into the function snd_sof_dma_trace_trigger()
  2. remove the error log and dtrace_is_enabled from the dma trace initialization.

if (!sdev->dtrace_is_supported)
	return 0;

/* FIXME: return 0 or return -EINVAL ? */
Collaborator

trace_trigger is an optional op, so please return 0 and remove the FIXME. Can you also combine the two checks for dtrace_is_supported and trace_trigger into one?

Author

ACK


switch (cmd) {
case SNDRV_PCM_TRIGGER_START:
	/* FIXME: return 0 when dtrace_is_enabled? */
Collaborator

What is this FIXME intended for? Whether you should return 0 or an error?

Author

I meant whether we should return a value other than "0" to tell the caller that dma trace is already enabled. I will remove this comment.

ret = sof_ops(sdev)->trace_trigger(sdev, cmd);
if (ret < 0) {
	dev_err(sdev->dev,
		"error: snd_sof_dma_trace_trigger: start: %d\n", ret);
Collaborator

remove err prefix

Author

ACK. Yes, removing the "error" is better.

Collaborator

Hold on. Although I'm not a fan, @ranj063, I think we still generally add "error: " to dev_errs. @plbossart @ujfalusi have we changed policy on this?

Member

@kv2019i we want to remove "error:" since it makes no sense, but we will do it gradually through new changes to avoid backport issues. So here we should remove the "error:" on line 409.

"error: snd_sof_dma_trace_trigger: start: %d\n", ret);
} else {
sdev->dtrace_is_enabled = true;
}
Collaborator

and remove unnecessary braces

Author

ACK

"error: snd_sof_dma_trace_trigger: stop: %d\n", ret);
} else {
sdev->dtrace_is_enabled = false;
}
Collaborator

same 2 comments as above here
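Taken together, the review comments on this function (one combined check, return 0 for the optional op, no "error:" prefix, no unnecessary braces) point to a wrapper shaped roughly like the userspace sketch below. Everything prefixed `model_` is a hypothetical stand-in for the driver types, the PR itself uses SNDRV_PCM_TRIGGER_START/STOP, and returning 0 when the trace is already in the requested state is an assumption rather than something settled in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins; the PR itself uses SNDRV_PCM_TRIGGER_START/STOP. */
enum model_cmd { MODEL_TRIGGER_STOP, MODEL_TRIGGER_START };

struct model_sdev;

struct model_ops {
	/* optional platform op, may be NULL */
	int (*trace_trigger)(struct model_sdev *sdev, enum model_cmd cmd);
};

struct model_sdev {
	const struct model_ops *ops;
	bool dtrace_is_supported;
	bool dtrace_is_enabled;
};

static int model_dma_trace_trigger(struct model_sdev *sdev, enum model_cmd cmd)
{
	bool start = (cmd == MODEL_TRIGGER_START);
	int ret;

	assert(sdev != NULL && sdev->ops != NULL);

	/* one combined check; a missing optional op is not an error */
	if (!sdev->dtrace_is_supported || !sdev->ops->trace_trigger)
		return 0;

	/* assumption: already being in the requested state also returns 0 */
	if (sdev->dtrace_is_enabled == start)
		return 0;

	ret = sdev->ops->trace_trigger(sdev, cmd);
	if (ret < 0)
		fprintf(stderr, "trace trigger %s failed: %d\n",
			start ? "start" : "stop", ret);
	else
		sdev->dtrace_is_enabled = start;

	return ret;
}

/* Stub platform ops used only to exercise the wrapper. */
static int model_trigger_ok(struct model_sdev *sdev, enum model_cmd cmd)
{
	(void)sdev;
	(void)cmd;
	return 0;
}

static int model_trigger_fail(struct model_sdev *sdev, enum model_cmd cmd)
{
	(void)sdev;
	(void)cmd;
	return -22;	/* numeric value of -EINVAL on Linux */
}
```

Keeping the state update inside the wrapper means callers in pm.c only see a return code, and a failed stop leaves dtrace_is_enabled untouched.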

sound/soc/sof/ops.h (outdated)
sound/soc/sof/trace.c (outdated)
Collaborator

@kv2019i kv2019i left a comment

Looks good to me to have the logic in ops. I'd move this to ops.c though. This is not performance sensitive, so no need to have the code in ops.h.


sound/soc/sof/ops.h (outdated)
@@ -238,6 +242,9 @@ static int sof_suspend(struct device *dev, bool runtime_suspend)

suspend:

/* release trace */
snd_sof_release_trace(sdev, target_state == SOF_DSP_PM_D0 ? true : false);
Collaborator

After giving this some thought, I think we should aim for a more generic approach by passing the target_state directly and letting the dtrace code decipher it from there.
But SOF_DSP_PM_D0 is not really the state we are entering (from SOF_DSP_PM_D0); it is SOF_DSP_PM_D0:SOF_HDA_DSP_PM_D0I3, so it is more like SOF_DSP_PM_D0_LP.

Basically we will tell the dtrace that we are suspending to the given level:
SOF_DSP_PM_D3 - DSP is going to be off
SOF_DSP_PM_D0 - DSP remains on

Author

After giving this some thought, I think we should aim for a more generic approach by passing the target_state directly and letting the dtrace code decipher it from there.
But SOF_DSP_PM_D0 is not really the state we are entering (from SOF_DSP_PM_D0); it is SOF_DSP_PM_D0:SOF_HDA_DSP_PM_D0I3, so it is more like SOF_DSP_PM_D0_LP.

Basically we will tell the dtrace that we are suspending to the given level:
SOF_DSP_PM_D3 - DSP is going to be off
SOF_DSP_PM_D0 - DSP remains on

@ujfalusi This should be another topic beyond this PR, which may impact the dtrace behavior and policy.

Collaborator

@libinyang, the policy is:
if the DSP is going off then free up all resources and on resume set up everything from scratch
if the DSP is not going to be off then stop the host DMA and on resume only restart the host DMA

Doing the target_state == SOF_DSP_PM_D0 ? true : false in pm.c or in trace.c does not change the policy or the behavior, especially since this PR is changing it in the first place.
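The policy stated here can be written down as a pair of small mapping functions keyed on the suspend target. This is only a sketch under assumed names: the `model_` enums stand in for SOF_DSP_PM_D0 / SOF_DSP_PM_D3 and for whatever actions the trace code ends up exposing.

```c
#include <assert.h>

/* Hypothetical stand-ins for SOF_DSP_PM_D0 / SOF_DSP_PM_D3. */
enum model_dsp_state { MODEL_DSP_D0, MODEL_DSP_D3 };

enum model_trace_suspend {
	MODEL_TRACE_STOP_HOST_DMA,	/* DSP stays on: only stop the host DMA */
	MODEL_TRACE_FREE_ALL,		/* DSP goes off: free all resources */
};

enum model_trace_resume {
	MODEL_TRACE_RESTART_HOST_DMA,	/* DSP stayed on: only restart the host DMA */
	MODEL_TRACE_INIT_FROM_SCRATCH,	/* DSP was off: set everything up again */
};

/* What the trace code does when suspending to the given DSP state. */
static enum model_trace_suspend
model_trace_on_suspend(enum model_dsp_state target)
{
	assert(target == MODEL_DSP_D0 || target == MODEL_DSP_D3);

	return target == MODEL_DSP_D3 ? MODEL_TRACE_FREE_ALL
				      : MODEL_TRACE_STOP_HOST_DMA;
}

/* What the trace code does when resuming from the given DSP state. */
static enum model_trace_resume
model_trace_on_resume(enum model_dsp_state suspended_to)
{
	assert(suspended_to == MODEL_DSP_D0 || suspended_to == MODEL_DSP_D3);

	return suspended_to == MODEL_DSP_D3 ? MODEL_TRACE_INIT_FROM_SCRATCH
					    : MODEL_TRACE_RESTART_HOST_DMA;
}
```

Whether the caller passes the target state or a precomputed boolean, the mapping is the same; the difference is only where the decision lives.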

Author

@ujfalusi I mean the policy question of whether we should keep trace.c as simple as possible, so that it doesn't take on the responsibility of deciding between trigger stop and trigger start.

Collaborator

The trace is moving out as a separate driver and it will have to make its own decisions anyway.

Collaborator

And it is not a trigger start or stop question. It is a suspend and resume question: are we suspending to a level where the DSP is off, or are we keeping it on?

Author

The trace is moving out as a separate driver and it will have to make its own decisions anyway.

After reading the email you sent, I think I understand your idea now. Let's talk more in the weekly meeting.

@ujfalusi
Collaborator

ujfalusi commented Sep 24, 2021

I'm a bit puzzled about which scenario this PR is going to have an effect in.
The only way to get target_state == SOF_DSP_PM_D0 in sof_suspend() is if snd_sof_stream_suspend_ignored() returns true, which only happens if we have ignored a suspend trigger for a running stream marked with d0i3_compatible in sof_pcm_trigger().
This means that at least one audio DMA is still active?

Afaik, if we do a system suspend (either S3 or s2idle via /sys/power/state), user space is going to be frozen and audio will be stopped. We don't set SNDRV_PCM_INFO_RESUME for SOF, and afaik it is not supported by Intel platforms, so trigger:suspend should not be coming at all.

I'm sure I got this wrong as there would not have been this PR...

@libinyang
Author

I'm a bit puzzled about which scenario this PR is going to have an effect in.
The only way to get target_state == SOF_DSP_PM_D0 in sof_suspend() is if snd_sof_stream_suspend_ignored() returns true, which only happens if we have ignored a suspend trigger for a running stream marked with d0i3_compatible in sof_pcm_trigger().
This means that at least one audio DMA is still active?

WoV is working in this case.

Afaik, if we do a system suspend (either S3 or s2idle via /sys/power/state), user space is going to be frozen and audio will be stopped. We don't set SNDRV_PCM_INFO_RESUME for SOF, and afaik it is not supported by Intel platforms, so trigger:suspend should not be coming at all.

No, for deep buffer and WoV audio, it still works. trigger suspend will be called but trigger resume won't based on my test.

I'm sure I got this wrong as there would not have been this PR...

Currently DMA trace has only 2 states: enabled or disabled.
However DMA trace may have more states. For example, enabled but
not started.

This patch refines the dma trace to provide a mechanism to store
more states than enabled/disabled and adds SOF_DTRACE_STOPPED state.

Peter Ujfalusi <[email protected]>
Signed-off-by: Libin Yang <[email protected]>
When system enters s0ix, the dma trace won't be used. Otherwise,
the DMA will access the host memory, which will prevent entering
S0ix. Driver has notified firmware not to send message through
dma trace. Let's also trigger stop dma trace in driver side.

Signed-off-by: Libin Yang <[email protected]>
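
The two commit messages above can be read together as a small state machine. The sketch below is a standalone model under stated assumptions, not the patch itself: SOF_DTRACE_STOPPED is the only state name taken from the commit message, while the model enum, struct, and function names are hypothetical and chosen for illustration.

```c
#include <stdbool.h>

/* Only the "stopped" state is named in the commit message (SOF_DTRACE_STOPPED);
 * the other state names here are assumptions for this model. */
enum dtrace_state_model {
	DTRACE_MODEL_DISABLED,
	DTRACE_MODEL_STOPPED,
	DTRACE_MODEL_ENABLED,
};

struct dtrace_model {
	enum dtrace_state_model state;
	bool dma_running; /* true while the trace DMA may access host memory */
};

/* On suspend, stop the trace DMA only when the system is heading for S0ix
 * and the trace is currently enabled. A disabled trace is left untouched. */
static void dtrace_model_suspend(struct dtrace_model *t, bool target_is_s0ix)
{
	if (target_is_s0ix && t->state == DTRACE_MODEL_ENABLED) {
		t->dma_running = false;
		t->state = DTRACE_MODEL_STOPPED;
	}
}

/* On resume, restart the trace DMA only if it was stopped by the suspend path. */
static void dtrace_model_resume(struct dtrace_model *t)
{
	if (t->state == DTRACE_MODEL_STOPPED) {
		t->dma_running = true;
		t->state = DTRACE_MODEL_ENABLED;
	}
}
```

Keeping a separate stopped state lets the resume path tell "trace was never enabled" apart from "trace was enabled but paused for S0ix", which a plain enabled/disabled flag cannot express.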
@ranj063
Collaborator

ranj063 commented Sep 29, 2021

@libinyang let's pause this PR for a bit until I fix the HD-DMA sequence for DMA stop in the kernel and the firmware. Sorry for the trouble!

@libinyang
Author

> @libinyang let's pause this PR for a bit until I fix the HD-DMA sequence for DMA stop in the kernel and the firmware. Sorry for the trouble!

@ranj063 Does this PR conflict with your patch?

@ranj063
Collaborator

ranj063 commented Sep 30, 2021

> @ranj063 Does this PR conflict with your patch?

I don't know yet. I'm still deciding how to solve the problem I have with trace, so it's better to see how it pans out before we make another change in trace. Please give me a day or two and I should have a PR open. This issue is lower priority compared to the HD-DMA programming fix.

@libinyang
Author

> @ranj063 Does this PR conflict with your patch?
>
> I don't know yet. I'm still deciding how to solve the problem I have with trace, so it's better to see how it pans out before we make another change in trace. Please give me a day or two and I should have a PR open. This issue is lower priority compared to the HD-DMA programming fix.

@ranj063 Got it. Please let me know whether this PR is still needed based on your latest solution. Thanks.

@plbossart
Member

@libinyang @ranj063 is this PR still relevant?

@ranj063
Collaborator

ranj063 commented Dec 4, 2021

No idea, @plbossart. @libinyang was supposed to recheck after the HDA audio DMA patches were merged.

@libinyang
Author

libinyang commented Dec 9, 2021

> @libinyang @ranj063 is this PR still relevant?

@plbossart @ranj063 Sorry for the delayed reply. This bug is reproduced on ADL Chrome, which is using the 5.10 kernel (the TGL platform can't reproduce this bug, and ADL RVP platforms don't support s0ix well). As Ranjani said, her DMA patches may have an impact on this PR, so I have to port Ranjani's patches to the Chrome kernel. But there is a big gap between the Chrome kernel and SOF, and it is hard for me to backport them. So I'm waiting for the Chrome team to decide whether they will backport Ranjani's patches or not. @kv2019i has already ported Ranjani's DMA patches for Chrome.

---- update ----
After talking with Mengdong, we don't know whether Chrome will merge Ranjani's DMA patches or not. So tomorrow I will port Kai's patches to the Chrome kernel and verify whether Ranjani's patches fix this issue. If Ranjani's patches can't fix the issue, I will rebase this patch on top of Ranjani's patches and verify it.

@plbossart
Member

> @libinyang @ranj063 is this PR still relevant?
>
> @plbossart @ranj063 Sorry for the delayed reply. This bug is reproduced on ADL Chrome, which is using the 5.10 kernel (the TGL platform can't reproduce this bug, and ADL RVP platforms don't support s0ix well). As Ranjani said, her DMA patches may have an impact on this PR, so I have to port Ranjani's patches to the Chrome kernel. But there is a big gap between the Chrome kernel and SOF, and it is hard for me to backport them. So I'm waiting for the Chrome team to decide whether they will backport Ranjani's patches or not. @kv2019i has already ported Ranjani's DMA patches for Chrome.
>
> ---- update ---- After talking with Mengdong, we don't know whether Chrome will merge Ranjani's DMA patches or not. So tomorrow I will port Kai's patches to the Chrome kernel and verify whether Ranjani's patches fix this issue. If Ranjani's patches can't fix the issue, I will rebase this patch on top of Ranjani's patches and verify it.

@libinyang is there an issue to track the problem?
If yes, let's close this PR and submit a new one when we've verified that the solution fixes the problem.

@kv2019i
Collaborator

kv2019i commented Dec 9, 2021

@plbossart The issue is tracked in #3034, and we have a suspicion this is related to a newer issue, thesofproject/sof#5042.

@plbossart
Member

Thanks @kv2019i, so let's close this PR and submit a new one when we have a solution.

@plbossart plbossart closed this Dec 9, 2021
@libinyang
Author

I have confirmed that Ranjani's DMA patches do not fix bug #3034. We still need this patch. I will submit a new one after the run_all test.

@ujfalusi
Collaborator

Not even with the dtrace dma free IPC (PR #3197 and #3228)?

@kv2019i
Collaborator

kv2019i commented Dec 10, 2021

Ack @libinyang and @ujfalusi, I was able to confirm the same. With the stop-dma patches for both kernel and FW, including the dtrace dma free IPC, I can still reproduce the issue. It would seem the system does not enter S0ix in this case.

@libinyang
Author

> Not even with the dtrace dma free IPC (PR #3197 and #3228)?

I have applied all of @kv2019i's patches to the Chrome kernel and it doesn't work.
https://github.com/kv2019i/linux/tree/topic/cros/5.10-stop-dma-backport-draft

@ranj063
Collaborator

ranj063 commented Dec 11, 2021

@libinyang @kv2019i I think we all know none of the HDA-DMA patches will help with the issue. When we enter D0i3, we do not stop trace and hence won't free it. I only asked to pause this PR because the fix will be slightly different with the HDA-DMA patches with the trace_free IPC. @libinyang please resend what's needed to fix the issue.

@ujfalusi
Collaborator

But now we do not need to hack around to handle the host side only. We can stop and restart the firmware side as well.
I think the dtrace reconfiguration in sof is still blocked.
The only thing is that we might want to keep the dtrace up in firmware and skip the host reconfig to be able to retrieve logs later.

@libinyang
Author

@ranj063 Thanks for the clarification. I will resend the patch. @ujfalusi Sometimes we need the FW working and just want to disable dtrace, for example in D0i3.

@ujfalusi
Collaborator

> @ranj063 Thanks for the clarification. I will resend the patch. @ujfalusi Sometimes we need the FW working and just want to disable dtrace, for example in D0i3.

@libinyang, I know. I was saying that we now have a way to disable the dtrace in firmware, which allows reconfiguration. One of the issues you cited is that a subsequent dtrace enable fails and you needed to work around this.

@libinyang
Author

I found that this PR doesn't work with the latest SOF Linux kernel. After applying this PR, the SOF driver is blocked from entering pm-runtime suspend on the latest SOF Linux kernel. I'm still debugging it.

@libinyang
Author

A new patch is submitted: #3332

@libinyang libinyang mentioned this pull request Dec 14, 2021
5 participants