Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USB Device next cdc acm can get in a deadlock when the host doesn't pull the endpoint data #78160

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

SynchronicIT
Copy link

@SynchronicIT SynchronicIT commented Sep 9, 2024

Fixes #78104

Current situation before the fix:

  • The cdc_acm code starts claiming buffers to hand over to the lower layer driver for TX.
  • The lower layer drivers puts the data into the endpoint for the application / host
  • lower layer gives a request callback when the endpoint is read
  • cdc_acm layer frees the buffer.

When the endpoint is not read, due to no terminal application running, the last two steps doesn't occur and the cdc_acm simply occupies all available buffers, followed by rescheduling the same work task for tx_fifo directly, creating an endless loop.

By checking if a TX task is ongoing, the new work task scheduled as soon as the callback is given.

There is no timeout used in this case, as I couldn't find an interface to configure a timeout.

sample application which connects a physical uart to a usb cdc acm uart

The complete process in handled in a single interrupt handler without
interference of the application code.

Signed-off-by: Vincent van der Locht <[email protected]>
Copy link

github-actions bot commented Sep 9, 2024

Hello @SynchronicIT, and thank you very much for your first pull request to the Zephyr project!
Our Continuous Integration pipeline will execute a series of checks on your Pull Request commit messages and code, and you are expected to address any failures by updating the PR. Please take a look at our commit message guidelines to find out how to format your commit messages, and at our contribution workflow to understand how to update your Pull Request. If you haven't already, please make sure to review the project's Contributor Expectations and update (by amending and force-pushing the commits) your pull request if necessary.
If you are stuck or need help please join us on Discord and ask your question there. Additionally, you can escalate the review when applicable. 😊

In case the host doesn't pull the new data from the endpoint, the work task
would schedule itself again (at the max. priority). When there is no terminal
program or active applicaiton, this could result in an infinite loop blocking all
other communications.

By using a atomic flag for tx busy, the deadlock is solved.

Signed-off-by: Vincent van der Locht <[email protected]>
@tmon-nordic
Copy link
Contributor

Is this essentially #76642 that is fixed by #75439 or some other issue?

@SynchronicIT
Copy link
Author

@tmon-nordic : The fix does take in account that poll could also be used, but I haven't validated that this would fix this mixed usage.

@@ -0,0 +1,8 @@
# SPDX-License-Identifier: Apache-2.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but we do not need another CDC ACM sample.

@@ -48,6 +48,7 @@ UDC_BUF_POOL_DEFINE(cdc_acm_ep_pool,
#define CDC_ACM_IRQ_TX_ENABLED 3
#define CDC_ACM_RX_FIFO_BUSY 4
#define CDC_ACM_LOCK 5
#define CDC_ACM_TX_FIFO_BUSY 6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not want to introduce TX_FIFO_BUSY, we can handle it by using delayed work, which is needed anyway, see 741d8a7.

@@ -217,6 +218,11 @@ static int usbd_cdc_acm_request(struct usbd_class_data *const c_data,
atomic_clear_bit(&data->state, CDC_ACM_RX_FIFO_BUSY);
}

if (bi->ep == cdc_acm_get_bulk_in(c_data)) {
atomic_clear_bit(&data->state, CDC_ACM_TX_FIFO_BUSY);
LOG_ERR("TX ERROR !!");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to log an error here.

Comment on lines +248 to +255
if (ring_buf_is_empty(data->tx_fifo.rb)) {
/* Raise TX ready interrupt */
if (data->cb) {
cdc_acm_work_submit(&data->irq_cb_work);
}
} else {
/* Queue pending TX data on IN endpoint */
cdc_acm_work_submit(&data->tx_fifo_work);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to do anything other than trigger irq_cb_work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Samples Samples area: USB Universal Serial Bus
Projects
None yet
Development

Successfully merging this pull request may close these issues.

USB CDC_ACM locks system or at least slows it down heavily when no terminal is connected (USB DEVICE-NEXT)
5 participants