Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bluetooth: host: Run scan/slow test on nRF53 #78550

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jori-nordic
Copy link
Collaborator

@jori-nordic jori-nordic commented Sep 17, 2024

This enables the bluetooth/host/scan/slow test on nRF53 in the Zephyr
Continuous Integration pipeline.

Why waste more CI cycles?

The 5340 build will:

  1. stress-test the HCI IPC driver (controller + host sides)
  2. use a different buffer allocation model, where the allocator doesn't
    know what other events are in the queue

2. is especially important, as this is the model a lot of host users
will be running with. Any controller driver that is not directly on-chip
cannot know what is in the HCI queues, and cannot take the same
convenient shortcuts as the on-chip function-called drivers.

Stacked PR on top of:

Blocked on

jori-nordic and others added 3 commits September 17, 2024 10:51
Why is it ok to use the sync pool?

Because command complete/status is processed in prio: that means on the
same stack as the `bt_recv()` call from the driver.

Why does it fix the issue?

Because the complete/status event goes into a pool that is guaranteed to
have one free buffer any time `bt_recv()` is not executing.

Since the driver is the one calling bt_recv(), it (hopefully) will
finish one `bt_recv()` before starting another one.

Fixes zephyrproject-rtos#78223

Co-authored-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Jonathan Rico <[email protected]>
More information in the test description.

Why: At the time of writing, this test exposes a bug in the Bluetooth stack
where all the RX buffers are in use, and the controller driver fails to
synchronously allocate a command response buffer.

This is manifested in an assertion failure in `hci_common.c`.

Signed-off-by: Jonathan Rico <[email protected]>
This enables the bluetooth/host/scan/slow test on nRF53 in the Zephyr
Continuous Integration pipeline.

Why waste more CI cycles?

The 5340 build will:
1. stress-test the HCI IPC driver (controller + host sides)
2. use a different buffer allocation model, where the allocator doesn't
know what other events are in the queue

2. is especially important, as this is the model a lot of host users
will be running with. Any controller driver that is not directly on-chip
_cannot_ know what is in the HCI queues, and cannot take the same
convenient shortcuts as the on-chip function-called drivers.

Signed-off-by: Jonathan Rico <[email protected]>
Copy link
Member

@aescolar aescolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No complaints from me

@aescolar aescolar assigned jhedberg and unassigned aescolar Sep 17, 2024
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?

Do we need extended advertising for this test?
And if so, should we consider an non-ISO overlay for extended advertising in the sample itself (ping @cvinayak)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes extended is crucial, as without it, the host will discard reports thus hiding the issue.
Extended is used to repro #50786 basically.

Copy link
Collaborator Author

@jori-nordic jori-nordic Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah and we need extended (on the host) for the RPA to update while the scanner is running. Else the codepath nopes out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right :)

I don't see this file being added to the build explicitly; is it implicitly included when building the netcore when using sysbuild since it's in a sysbuild directory? Or is the name hci_ipc.conf the reason why it's in the build?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it's sysbuild magic. any ./sysbuild/*.conf is appended to the sysbuild image with the * name.
There are other ways of doing it, but imo this is the cleanest.
You're right that it's not obvious though, I will add a comment on L1 of that file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?

Do we need extended advertising for this test? And if so, should we consider an non-ISO overlay for extended advertising in the sample itself (ping @cvinayak)?

I am not well verse with sysbuild; maybe the current way in PR is common to Zephyr Controller and SoftDevice Controller under BabbleSIM ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - Learned something new today :) I guess there are some sysbuild documentation somewhere that I should read sometime soon to learn the magic tricks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup. It's actually more straightforward than reading the old child_image stuff we had in ncs: https://docs.zephyrproject.org/latest/build/sysbuild/index.html#zephyr-application-kconfig-fragment-and-devicetree-overlay

Copy link
Collaborator

@Thalley Thalley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will approve once the PR this PR depends on is merged (suggest to add DNM on this until then, to avoid accidentally merging this before the other)

@jori-nordic
Copy link
Collaborator Author

suggest to add DNM

no need, the CI does the needful :)
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants