[BUG] multiple-pause-resume.sh sof-test fails to FW_GEN_MSG failed with err 3018 #8792
Comments
zephyrproject-rtos/zephyr#68101 didn't work. 5000us is already a REALLY long time to wait and we still had failures, so something else is going wrong. This is not seen on other platforms using chain-dma, which makes it quite curious. It seems to be specific to cAVS2.5.
The host DMA (controlled by BE ops) must be stopped before sending the PAUSE/STOP IPC (sent from FE ops) to chain DMA. Unless this is done, the DMA stop flow does not follow the programming sequence and the DMA engine may get stuck in busy state. Link: thesofproject/sof#8792 Signed-off-by: Kai Vehmanen <[email protected]>
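The ordering constraint described in this commit message can be illustrated with a small toy model. This is only a sketch: `struct stream_model`, `be_stop_host_dma()`, `fe_send_pause_ipc()` and both `pause_stream_*()` helpers are hypothetical names invented for the illustration, not SOF or Linux kernel APIs, and the "stuck busy" flag is a simplification of what the hardware does.

```c
#include <stdbool.h>

/* Hypothetical model of one chain-DMA stream; not the SOF or kernel data structures. */
struct stream_model {
	bool host_dma_running;  /* host DMA, stopped from the BE ops */
	bool chain_dma_running; /* chain DMA in firmware, paused via IPC from the FE ops */
	bool dma_stuck_busy;    /* set when the stop sequence was violated */
};

/* BE side: stop the host DMA. */
static void be_stop_host_dma(struct stream_model *s)
{
	s->host_dma_running = false;
}

/* FE side: send the PAUSE/STOP IPC to chain DMA. */
static void fe_send_pause_ipc(struct stream_model *s)
{
	/* Assumed failure model: pausing chain DMA while the host DMA
	 * is still running leaves the DMA engine in busy state.
	 */
	if (s->host_dma_running)
		s->dma_stuck_busy = true;
	s->chain_dma_running = false;
}

/* Order from the commit message: host DMA stop first, then the IPC. */
static int pause_stream_ordered(struct stream_model *s)
{
	be_stop_host_dma(s);
	fe_send_pause_ipc(s);
	return s->dma_stuck_busy ? -1 : 0;
}

/* Reversed order: IPC first, host DMA stop second. */
static int pause_stream_reversed(struct stream_model *s)
{
	fe_send_pause_ipc(s);
	be_stop_host_dma(s);
	return s->dma_stuck_busy ? -1 : 0;
}
```

In this model, `pause_stream_ordered()` returns 0 for a running stream, while `pause_stream_reversed()` returns -1 because the IPC arrives while the host DMA is still active.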
I now suspect that the DMA stop problem uncovered by the Zephyr update is actually a problem in our host DMA stop sequence. With current kernel and SOF, the order of stop seems wrong:
As far as I can tell, this is against the recommended programming flow. With local stress tests, thesofproject/linux#4798 seems to fix the issue (GBUSY no longer stuck), but more stress testing is needed.
Starting with Zephyr commit e021ccfc745221c6 ("drivers: dma: intel-adsp-hda: add delay to stop host dma"), the pause-resume sof-test cases started failing on Intel cAVS2.5 platforms. Add a delay loop around DMA stop code in chain DMA to work around the issue while a proper fix is under investigation. This allows resuming integration of newer Zephyr versions to SOF and ensures we detect any new regressions in time. Link: thesofproject#8792 Signed-off-by: Kai Vehmanen <[email protected]>
Starting with Zephyr commit e021ccfc745221c6 ("drivers: dma: intel-adsp-hda: add delay to stop host dma"), the pause-resume sof-test cases started failing on Intel cAVS2.5 platforms. Add a temporary workaround to ignore this error on the affected Intel platforms. This allows resuming integration of newer Zephyr versions to SOF. Link: thesofproject#8792 Signed-off-by: Kai Vehmanen <[email protected]>
Tried a combination of zephyrproject-rtos/zephyr#68304 and thesofproject/linux#4798, but it still fails.
This kernel PR thesofproject/linux#4801 seems to help. One test plan (#37589) has passed, and err 3018 was not seen at all.
A test PR with a Zephyr-side fix for the issue -> #8826
Issue closed with zephyrproject-rtos/zephyr#68415 merged via #8903 today.
After commit e021ccf ("drivers: dma: intel-adsp-hda: add delay to stop host dma"), SOF project tests for the "chain DMA" feature started failing at a high rate on Intel cAVS2.5 ADSP platforms. Debugging shows that the 1000us timeout is not enough to clear the GBUSY bit on these platforms in the chain DMA tests. Link: thesofproject/sof#8792 Signed-off-by: Kai Vehmanen <[email protected]>
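For readers unfamiliar with this kind of timeout, a bounded poll on a busy bit can be sketched as below. This is a simulation under stated assumptions, not the Zephyr intel-adsp-hda driver code: `struct sim_dma`, `sim_read_status()`, `sim_delay_us()` and `wait_gbusy_clear()` are hypothetical helpers, and the `GBUSY_BIT` position is chosen for illustration only.

```c
#include <stdint.h>

#define GBUSY_BIT (1u << 15) /* assumed bit position, for illustration only */

/* Simulated DMA engine: GBUSY clears once busy_us microseconds have elapsed. */
struct sim_dma {
	uint32_t elapsed_us; /* simulated time since the stop request */
	uint32_t busy_us;    /* time the engine needs before GBUSY clears */
};

/* Hypothetical status read: reports GBUSY while the engine is still busy. */
static uint32_t sim_read_status(const struct sim_dma *d)
{
	return d->elapsed_us < d->busy_us ? GBUSY_BIT : 0;
}

/* Hypothetical busy-wait: only advances the simulated clock. */
static void sim_delay_us(struct sim_dma *d, uint32_t us)
{
	d->elapsed_us += us;
}

/* Poll until GBUSY clears or timeout_us has been spent waiting.
 * Returns 0 when GBUSY cleared, -1 on timeout or invalid step.
 */
static int wait_gbusy_clear(struct sim_dma *d, uint32_t timeout_us, uint32_t step_us)
{
	uint32_t waited = 0;

	if (step_us == 0)
		return -1; /* a zero step would never make progress */

	while (sim_read_status(d) & GBUSY_BIT) {
		if (waited >= timeout_us)
			return -1;
		sim_delay_us(d, step_us);
		waited += step_us;
	}
	return 0;
}
```

With a simulated engine that needs 1500us, `wait_gbusy_clear(&d, 1000, 10)` times out and a 5000us budget succeeds. On real hardware the thread reports failures even at 5000us, so a longer timeout in this model says nothing about the actual root cause.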
Describe the bug
The multiple-pause-resume.sh sof-test started to fail with latest Zephyr on Intel cAVS2.5 based platforms (TGL, ADL). First seen on
#8764
To Reproduce
Build SOF with recent Zephyr, run sof-test.
Reproduction Rate
50+%
Expected behavior
Chain-dma host dma stop fails
Impact
SOF CI PR tests are failing at a high rate
Environment
See [DNM] Zephyr smp rework #8764
Screenshots or console output
See https://sof-ci.01.org/sofpr/PR8764/build2195/devicetest/index.html?model=TGLU_UP_HDA-ipc4&testcase=multiple-pause-resume-50