async sensor API #56963
Conversation

yperess commented Apr 18, 2023 (edited)
- Introduce a new data flow:
  - Request - data is requested and returned by the sensor asynchronously, without a syscall, via RTIO
  - Raw - data is in a raw buffer format directly from the sensor driver
  - Processing - sensor drivers provide a decoder which decodes the data into a Q31 format compatible with the zDSP (Zephyr Digital Signal Processing) subsystem
- The new API is fully backwards compatible. Benchmarking showed it within 1% of the performance of the existing API with USERSPACE disabled (it may even beat the existing API when userspace is enabled, since no syscalls are required in the new API). Additionally, data is much more compact in the raw format while it is waiting for processing.
- Using RTIO enables the use of mempools, which allocate data as needed and allow several sensors to share the same context.
- Migrate the sensor shell to use the new API
@teburd I got things working (kinda). I still can't get the thermistor to work side by side with the IMU (@asemjonovs do you have any thoughts?). But overall this is working really well. Let me know what you think.
I'll take a closer look today. I guess at first glance I'm concerned about passing the rtio context to the sensor_read() call, but I need to understand better why that might be needed by reading this.
Ok just to kind of re-dump what we had brainstormed...

```c
RTIO_DEFINE_WITH_MEMPOOL(r, 4, 4, BUF_POOL_COUNT, BUF_SIZE);

int main(void)
{
	const struct device *icm42688 = DEVICE_DT_GET(icm42688);
	const struct device *akm09918 = DEVICE_DT_GET(akm09918);

	/* TODO Setup sensors including trigger source here (e.g. fifo en, watermark level, timers, whatever) */

	/* Setup a pool read for the sensors, using the sensor as the userdata */
	sensor_read_with_pool(icm42688, r);
	sensor_read_with_pool(akm09918, r);

	while (true) {
		const struct device *read_dev;
		void *buf;
		uint32_t len;

		/* Submit with wait count 1, immediately reads and deciphers the completion into the out params */
		int res = sensor_read_await(r, &read_dev, &buf, &len);

		/* We now have a buffer with the sensor data or an error, it's ours until we free it */
		if (res < 0) {
			LOG_ERR("failed to read sensor data %d", res);
			continue;
		}

		if (read_dev == icm42688) {
			process_icm42688(read_dev, buf, len);
		} else {
			process_akm09918(read_dev, buf, len);
		}
		sensor_read_free(r, buf, len);
		sensor_read_with_pool(read_dev, r);
	}
}
```

And honestly this doesn't seem too far off from that. I think the difference is that what my mind was going to with the sensor_read() call was a wrapper around the sensor acting like an I/O device that you can submit requests to, and sensor_read() does all the setup for that for you. So making sensor_read() a syscall is the unexpected part in this setup. I was expecting there to be a driver-level submit() type call (which should not be a syscall) that takes the submission (which has the rtio context as well!) to read and produces a completion.

I guess the difficulty is matching the idea that you'd like to read some N channels. This can be encoded in the iodev's data, and then the read call can take the iodev (configured with the channels and such). The read call doesn't need to be a syscall but can be an inline helper that creates the submission and submits it with the given context.

```c
SENSOR_DT_IODEV(icm42688_gyros, sensor_dt_node, somechannels, num_channels);
sensor_read(icm42688_gyros, r);
```

Does that make sense or maybe I'm overlooking something else? To make this backwards compatible then, a sensor_submit API is available that does kind of what you do here already...

```c
void sensor_iodev_submit(struct rtio_iodev_sqe *iodev_sqe)
{
	const struct sensor_iodev_data *iodev_data = iodev_sqe->sqe->iodev->data;
	const struct device *sensor_dev = iodev_data->sensor_dev;
	const struct sensor_driver_api *api = sensor_dev->api;

	/* Device supports async reads with rtio */
	if (api->submit != NULL) {
		return api->submit(sensor_dev, iodev_sqe);
	}

	/* submit not implemented, but presumably fetch and get are, so use those to mimic submit */
	enum sensor_channel *channels = iodev_data->channels;
	const uint32_t channel_num = iodev_data->channel_num;

	sensor_sample_fetch(sensor_dev, SENSOR ...);
	/* for channel in channels */
	for (int i = 0; i < channel_num; i++) {
		sensor_channel_get(sensor_dev, ....);
	}

	/* Return completion... */
	rtio_cqe_submit(iodev_sqe->r, cqe, ...);
}
```

I do kind of like that you've made the processing take a function pointer so cleanup and everything is automatic. This too should probably be a helper function, so rather than sensor_read_await you call sensor_read_process(r, process_func), or something like that. Anyways, I think this is really on the right track, but maybe some missed thoughts on my part weren't conveyed well. I hope that helps!
Yes, passing the context for every read was not my first choice, but it was the only way I could think of making this 100% backwards compatible. A workflow that does:
Will not work without changes from the driver, since we don't have anywhere to store the RTIO context. There are alternatives to create a metadata array of
Correct, and what I have here is basically the same. The sample we have assumes FIFO and that the sensor is generating the data (which will come soon after this PR). What I have currently in this PR basically is a trigger one-shot read (
I think the difficulty here was actually making this backwards compatible (though maybe I just wasn't creative enough). My current thinking was that with the existing approach
I was aiming for abstracting away the RTIO details though I'll admit it comes at the cost of granularity. If the developer wants to do a single read with a pre-allocated buffer to a mempool RTIO, they will not be able to with my API, though we could add a buffer/size argument and allow the developer to use
This is looking really good to me, some commentary on the mempool changes would be good
include/zephyr/rtio/rtio.h (Outdated)

```c
map_entry->state = RTIO_MEMPOOL_ENTRY_STATE_ALLOCATED;
map_entry->block_idx = __rtio_compute_mempool_block_index(r, *buf);
map_entry->block_count = num_blks;
mutable_sqe->buf = *buf;
```
I'm wondering what happened with the mempool situation here, seems like a complete change to what you had?
Yes, the other way wasn't working. We can chat about it offline, but basically we were keeping the sqe index and using it to recover the buffer. That works when the sqe and cqe are consumed at the same pace, but the sqe was released already and in my case was then assigned to another submit, which threw everything off.
Something isn't quite right; it is not working well at high data rates. I'll try to debug today.
Force-pushed from ca79839 to 5430bfb
@teburd, performance is abysmal currently (see output below). The fact that we're only getting 512 callbacks though makes me wonder if this is an issue with the CQE not being released properly?
Ok, correction, I do see what you are seeing now, but let's look at what's on the logic analyzer. Between SPI reads the chip-select time is 30us, in the polling loop. This is 50KHz polling over a SPI bus that itself uses a tight polling loop, meaning in effect zero computation time is available; that other 20us is the cost of doing business.

Then at a larger scale there's a large bank of time where no SPI work is done, nearly a second in fact. So something isn't quite right there. Your processing thread isn't being scheduled in between reads like I think you want it to be, is that right? That seems like a different issue. The throughput of queueing reads and getting completions seems quite good to me.
@teburd @MaureenHelm I believe this is ready for a first-pass review (the sample used here is a hacked sensor_shell; that final commit will be replaced tomorrow by actually updating the sensor shell implementation to use this API instead).

Some metrics from the TDK Robokit1 board, running the sensor at 32kHz and using a tight sampling loop, comparing the existing fetch/get APIs against the async API (precision is only theoretical while drivers don't implement this, because the Q values are generated from the original APIs).

What's next?
Force-pushed from f3a650c to 73ebfe5
@teburd @MaureenHelm Updates:
Sample:
Force-pushed from aafcb9c to 1dd06c2
west.yml (Outdated)

```diff
@@ -22,16 +22,9 @@ manifest:
   - name: upstream
     url-base: https://github.com/zephyrproject-rtos

 group-filter: [-babblesim]
```
wonder why this is needed :)
I haven't been able to run west or twister without this revert
Force-pushed from cd5c2b2 to 0cebe93
Force-pushed from c26fdc6 to 0a11578
Introduce a streaming API that uses the same data path as the async API. This includes features added to the decoder:
* Checking if triggers are present

Add streaming features built on top of existing triggers:
* Add 3 operations to be done on a trigger:
  * include - include the data with the trigger information
  * nop - do nothing
  * drop - drop the data (flush)
* Add a new sensor_stream() API to mirror sensor_read() but add an optional handler to be able to cancel the stream.

Signed-off-by: Yuval Peress <[email protected]>

topic#sensor_stream
Add streaming implementation for icm42688 using both threshold and full FIFO triggers.

Signed-off-by: Yuval Peress <[email protected]>

topic#sensor_stream
Signed-off-by: Yuval Peress <[email protected]>
This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed; otherwise this pull request will automatically be closed in 14 days. Note that you can always re-open a closed pull request at any time.