[RFC][DNM] DMM: Skeleton for a DMA Memory Management subsystem #57602
Conversation
In memory bus architectures, the CPU running software is one of the masters accessing the bus, while DMA engines are other masters accessing the same (or other, but connected) bus. In basic MCUs there is a simple bus architecture, where the CPU and DMA engines can access the whole memory range, no caches are present, and memory addressing is consistent between all bus masters. Memory access in these devices can be summarized with the following rules:

* the CPU and DMA have access to the whole memory space
* the memory address used by the CPU to point to any memory location A is equal to the memory address used by DMA to point to the memory location A
* any data stored in memory by DMA is immediately visible to the CPU
* any data stored in memory by the CPU is immediately visible to DMA

In systems with a more complex memory architecture the interaction between CPU and DMA, or between multiple CPUs, can be complicated:

* DMA can be restricted to access only part of the system memory (or only part of the memory can be accessed efficiently by DMA)
* the CPU may need to translate its buffer address (virtual or physical) to an address (or a set of addresses, if the buffer is not contiguous in the bus memory space) usable by DMA before passing it to a DMA engine (to take into account cache line alignment, memory region allocation, etc.)
* the CPU cache can contain stale data, so the CPU may need to invalidate the cache to read data that DMA or another CPU updated in memory
* data intended by a CPU to be stored in memory can get stuck in that CPU's write-back cache and be invisible to DMA or other CPUs, so the CPU has to flush its caches before informing DMA or other CPUs about data availability in memory

All of these challenges must be addressed by the software running in a system with a complex memory architecture.
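To make the rules above concrete, here is a minimal, portable C sketch of the discipline a driver has to follow on such a system: stage outbound data in a cache-line-aligned, line-padded bounce buffer and write it back before the DMA master reads it, then invalidate before the CPU reads what DMA wrote. The cache calls are no-op stand-ins (on Zephyr they would map to something like `sys_cache_data_flush_range()`/`sys_cache_data_invd_range()`), and the 32-byte line size is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_LINE 32 /* assumed cache-line size */

/* Stand-ins for real cache-maintenance operations (no-ops here). */
static void cache_flush(void *addr, size_t len)      { (void)addr; (void)len; }
static void cache_invalidate(void *addr, size_t len) { (void)addr; (void)len; }

/* CPU -> DMA: allocate a line-aligned, line-padded bounce buffer, copy the
 * user data in, and write back (flush) the lines so the DMA master sees it. */
static void *prepare_dma_out(const void *user_buf, size_t len)
{
	size_t padded = (len + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
	void *bounce = aligned_alloc(CACHE_LINE, padded);

	if (bounce != NULL) {
		memcpy(bounce, user_buf, len);
		cache_flush(bounce, padded); /* make data visible to DMA */
	}
	return bounce;
}

/* DMA -> CPU: invalidate the lines before the CPU reads what DMA wrote. */
static void finish_dma_in(void *bounce, void *user_buf, size_t len)
{
	cache_invalidate(bounce, len); /* discard stale cached copies */
	memcpy(user_buf, bounce, len);
	free(bounce);
}
```

On the "it's just memory" MCUs described above, both cache calls and the bounce copy collapse to nothing, which is exactly the asymmetry this thread is about.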
In this patch I'm proposing to add a new subsystem called DMM - DMA Memory Management - with the following responsibilities:

* allocating and freeing "bounce buffers" if the buffer provided by a driver user cannot be used by the DMA engine (memory accessible by DMA, aligned to DMA requirements, and, if cacheable, aligned and padded to cache lines)
* copying data to and from the bounce buffers
* translating CPU buffer addresses (virtual or physical) to DMA-usable addresses
* cache management if buffers are allocated in cacheable memory

Signed-off-by: Carlo Caione <[email protected]>
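A hypothetical sketch of how a driver could use the proposed calls. The function names mirror the PR (`dmm_get_buffer_for_dma_out()`/`dmm_release_buffer_for_dma_out()`), but these toy bodies are ours: they always bounce via `aligned_alloc()`, whereas a real implementation could also return the user address untouched when it is already DMA-usable. The `dev`/`flags` parameters from the real API are omitted here, and the 32-byte requirement is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DMA_ALIGN 32 /* assumed DMA/cache alignment requirement */

typedef uintptr_t dma_handle_t; /* opaque handle; here simply the bus address */

/* Stage a user buffer for an outbound DMA transfer. */
static int dmm_get_buffer_for_dma_out(dma_handle_t *handle,
				      const void *buffer, size_t size)
{
	size_t padded = (size + DMA_ALIGN - 1) & ~(size_t)(DMA_ALIGN - 1);
	void *bounce = aligned_alloc(DMA_ALIGN, padded);

	if (bounce == NULL) {
		return -1;
	}
	memcpy(bounce, buffer, size);   /* copy into the bounce buffer */
	/* real code: write back the cache lines covering `bounce` here */
	*handle = (dma_handle_t)bounce;
	return 0;
}

/* Release the staging buffer once the DMA transfer has completed. */
static int dmm_release_buffer_for_dma_out(dma_handle_t handle)
{
	free((void *)handle);           /* bounce buffer no longer needed */
	return 0;
}
```

A driver would call the get function, program the DMA engine with the returned handle, and call the release function from its transfer-complete path.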
* - copy the data from the user-provided buffer to the bounce buffers
* - translate CPU buffer addresses to DMA addresses if needed
* - do the proper cache management if buffers are allocated in cacheable
*   memory (for example flushing cache lines associated with the memory
nit: flushing is not necessary
- * memory (for example flushing cache lines associated with the memory
+ * memory (for example writing back cache lines associated with the memory
* memory (for example flushing cache lines associated with the memory
* backing the DMA accessible buffers)
*
* @param[in] dev Device pointer
Which device is this? The device that is to perform the DMA transfer from the buffer pointed to by dma_handle?
Correct, that is the idea.
Could you add this information somewhere to the docstring to make it clear?
Will do.
* backing the DMA accessible buffers)
*
* @param[in] dev Device pointer
* @param[in] flags Flags
Are these flags defined somewhere?
no, TBD.
Could you clarify here, with something like:

- * @param[in] flags Flags
+ * @param[in] flags Flags reserved for future use
* This function is used to translate the user-provided buffer into something
* that is usable for a DMA outbound data transfer and return that to the user
* as an opaque DMA handle (usually a physical/bus address).
*
I would add a paragraph that the user cannot access the buffer pointed to by buffer or dma_handle from the time this function is called until dmm_release_buffer_for_dma_out() is called. Access includes reading from this buffer.
Also I would clarify that dmm_release_buffer_for_dma_out() must be called after the DMA transfer is finished. Not calling this function can lead to memory leaks.
ack
* @param[in] dev Device pointer
* @param[in] flags Flags
Similar ambiguity like in the outbound transfer function
* This function is used to translate the user-provided buffer into something
* that is usable for a DMA inbound data transfer and return that to the user
* as an opaque DMA handle (usually a physical/bus address).
*
I would add a paragraph that the user cannot access the buffer pointed to by buffer or dma_handle from the time this function is called until dmm_release_buffer_for_dma_in() is called.
Also I would clarify that dmm_release_buffer_for_dma_in() must be called after the DMA transfer is finished, before the data is read by the CPU.
ack
The inherent problem seems to be someone wants to use a buffer that isn't aligned or in the right region for DMA. Shouldn't we make that easier to sort out statically rather than defining a new allocation interface for this task?
Can we use device tree to describe the requirement for a buffer with a particular peripheral + dma pair I wonder. Something like...
DT_SHARED_BUF(dma0, spi0) uint8_t some_spi_buf[128];
Where all the correct compiler attributes are supplied for alignment, placement, and size.
That idea I like very much. It puts a lot of preprocessor work onto the platform/controller layer, but cleanly extends to all sorts of nonsense like alternate memory spaces (even madness like IOMMUs or 24-bit ISA DMA!) that are otherwise really hard to abstract. And it falls back to obvious behavior for "it's just memory" systems that tend to be the common case.
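For illustration, a minimal sketch of what such a `DT_SHARED_BUF` macro could expand to. Everything here is hypothetical: in a real implementation the alignment and linker section would come from devicetree properties of the controller/peripheral pair, whereas this sketch hard-codes a 32-byte alignment and leaves the `section` attribute as a comment so it links anywhere:

```c
#include <assert.h>
#include <stdint.h>

#define BUF_ALIGN 32 /* assumed; would come from devicetree in a real macro */

/* Hypothetical expansion target: supply the compiler attributes that make a
 * plain definition DMA-usable. The arguments are unused in this sketch; the
 * real macro would look up properties of both nodes. */
#define DT_SHARED_BUF(bus_ctlr, bus_periph) \
	__attribute__((aligned(BUF_ALIGN))) /* , section("dma_region") */

/* Usage exactly as proposed above. */
DT_SHARED_BUF(dma0, spi0) static uint8_t some_spi_buf[128];
```

The appeal is that the definition site stays a one-liner and all platform knowledge lives behind the macro.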
@teburd, do I understand correctly that you suggest moving the responsibility for managing memory buffers from a device driver to its user? If that's the case, let's consider the following application code:
If the SPI device driver takes care of copying
Portability is an important software attribute, especially in hardware-agnostic projects like Zephyr. That's why I think the responsibility for managing memory buffers for DMA should be hidden from application code and implemented in device drivers. Applications should have the possibility to improve performance and use buffer allocation methods meeting buffer memory management requirements, to prevent device drivers from copying data to bounce buffers. But it should be an option, not a requirement, for an application.
@hubertmis the problem with that kind of scheme is that once we decide on a dynamic allocation API we can't go back. This isn't Linux, and we don't get to assume we're running on devices with copious virtual memory available for stuff. Existing Zephyr paradigms work very hard (to be fair, occasionally too hard) to enable static allocation for everything an application might want. @teburd's idea is very much in that vein: it's an API that allows an app to say "I want a buffer that will work for DMA with this device", and get it, without having to do much more than know that the device exists and call whatever acquire/release API we decide on around code that mutates it.
@teburd, could you draft your proposal? I would like to understand it better to be able to compare the properties of both solutions.
I think looking at the i.MX RT685 is useful here, as it has many bus masters including two DMAs, USB HS, SDHC, one Arm M33 core, and one Xtensa HiFi4 core. E.g. let's consider configuring dma0 with i2s0 (flexcomm) and now add in some buffer requirements to a device tree node...
Then build some macros:

/**
 * @brief Shared buffer
 */
#define DT_SHARED_BUF(bus_ctlr, bus_periph) \
	__attribute__((aligned(DT_PROP_OR(bus_periph, buf_align, DT_PROP_OR(bus_ctlr, buf_align, WORD_ALIGN))), \
		       section(DT_SRAM_SECTION_OR(bus_periph, buf_region, DT_SRAM_SECTION_OR(bus_ctlr, buf_region, BUF_DEFAULT_SECTION)))))
/* Later on can be used, without requiring bounce buffers... */
DT_BUF(dma0, i2s0) uint16_t audio_rx_buf[128];
/* Configure receive with above described buffer */
dma_config(dma_dev, ....);

I'm not being very precise here, and I very much doubt any of this works... I hope the idea is clear though. Describe what the hardware demands of a buffer in device tree, and use that to set up statically allocated buffers or block allocators. That's it. Caching could be a totally separate thing, or maybe it's considered an SRAM region thing (often you can configure uncached/cached regions). Or perhaps the buffer is placed in a struct container with some metadata about caching and state. It would be interesting to look at the SRAM bank properties, if provided in DT, as the source of truth on this. No allocation of bounce buffers needed, and tests that utilize DMA no longer have to have ifdefs for part specifics about this sort of thing as they do today.
Related PRs/issues/code where having a DT_BUF macro might be incredibly helpful for setting up buffers with alignment, placement, and cache properties: #57786. There are probably many more...
Thanks @teburd for these details. I like the idea of describing memory buffer requirements in the device tree and creating macros to easily create buffers meeting them. It looks like the main open question in this discussion is about the device driver model. I don't think Zephyr's device driver model currently puts any requirements on the memory buffers passed to the drivers' functions by their users: users can pass any buffers allocated in any memory (writable or not, cacheable or not, aligned or not, dynamically or statically allocated, ...) and they expect the drivers to correctly perform the requested operations using the passed memory buffers. It seems we are discussing several options on how to incorporate DMA requirements regarding used memory buffers in a unified way for all device drivers in Zephyr:
The proposed device tree properties and macros could be used in each of these solutions. The difference would be which layer in the system would be able and/or responsible to use the macros to allocate the buffers correctly. Is this summary correct? Do you happen to see any other options I didn't list? When we decide the list of options is close to complete, we should start listing the properties of these options that would help us find out which one is most aligned with Zephyr's needs.
I think this is mostly the right approach, the interfaces mostly remain the same, and with asserts enabled it should be considered a programming error to give an invalid buffer pointer.
This could be accomplished by supplying asserts and perhaps helper macros to validate buffer X
A bounce buffer is the wrong approach in my humble opinion. It should be a programming error if this is needed.
I would place the responsibility for correctly allocating buffers on the application. With helpers for setting up static buffers and block allocators. Assertions can validate that the buffer pointer is aligned and in the correct region.
I think those options are right. I believe pretty strongly that memory should be managed by the application in Zephyr. Any sort of automatic memory management should be entirely opt-in and not the default. It's unclear why a bounce buffer would ever be needed in Zephyr if the buffers are correctly set up, and that should be a goal.
@teburd, thanks for your comment.

User code with modified device driver model

I'm looking at selected Zephyr samples to assess the changes needed if we decided to modify the device driver model so that the application is responsible for correct buffer allocation.

spi_bitbang

zephyr/samples/drivers/spi_bitbang/src/main.c, lines 32 to 35 in c48422f

Currently

DT_BUF(dma0, dev) uint16_t buff[5] = { 0x0101, 0x00ff, 0x00a5, 0x0000, 0x0102};
int len = 5 * sizeof(buff[0]);
struct spi_buf tx_buf = { .buf = buff, .len = len };
dt_buf_acquire(buff); // To do cache management

I'm not sure if the

DT_BUF(dma0, dev) uint16_t buff[DT_BUF_SIZE(dma0, dev, 5)] = { ... };

Using uart/echo_bot

This sample includes both RX and TX paths of UART communication.

RX

It's pretty easy, because the following code

would need to be modified to

DT_BUF(dma0, uart_dev) static char rx_buf[DT_BUF_SIZE(dma0, uart_dev, MSG_SIZE)];

assuming these macros would work correctly for global static objects.

TX

We can no longer use just any

zephyr/samples/drivers/uart/echo_bot/src/main.c, lines 64 to 71 in c48422f
To use this kind of API the sample code would need to allocate a bounce buffer meeting the UART TX requirements and use it here:

DT_BUF(dma1, uart_dev) static char tx_buf[DT_BUF_SIZE(dma1, uart_dev, 1)];

void print_uart(char *buf)
{
	int msg_len = strlen(buf);

	for (int i = 0; i < msg_len; i++) {
		tx_buf[0] = buf[i];
		dt_buf_tx_acquire(tx_buf);
		uart_poll_out(uart_dev, tx_buf[0]);
		dt_buf_tx_release(tx_buf);
	}
}

This bounce buffer would be allocated by the application for any platform, even if the given UART does not have DMA, or includes a DMA capable of printing any application. This potentially unnecessary bounce buffer makes usage of driver APIs harder and is wasteful.

blinky_pwm

Some hardware peripherals like PWM, even though it's a rare case, can also include DMA. To be consistent with the device driver model and potentially handle DMA here, the API user would need to pass

uart/echo_bot again

Assume we have a UART peripheral that uses DMA to read data from RAM in the following structure: Byte 1: the size of the transfer

The solution discussed above with

Conclusion

It seems impossible to impose responsibility on an application to allocate correct buffers for each potential DMA usage in any system. It would require an application (or a subsystem) to structure data in a buffer in the way that is required by its target platform, rendering the application non-portable. This makes option 2 I described in #57602 (comment) invalid. Do you agree?
Comments inline, but no I don't agree, and I don't think we've fully explored this. I realized when re-reading this the dma node probably isn't needed in the macros, only the device tree node representing the peripheral. That should alleviate portability concerns. That and providing a define macro rather than macros to generate attributes.
Then change the macro:

/**
 * Defines a buffer with a given symbolic name such that it is aligned, placed, sized, and
 * cached appropriately. The symbol buf_name may not be the actual start address of the buffer
 * if the device requires a prefixed space for metadata about the operation.
 */
#define DT_BUF_DEFINE(node, buf_name, buf_len) \
	DT_BUF_ATTRS(node) uint8_t buf_name##_raw[DT_BUF_LEN(node, buf_len)]; \
	uint8_t *buf_name = &buf_name##_raw[DT_BUF_OFFSET(node)]

Just like that, I have a macro appropriately created. How could we verify the buffer is correctly set up for that device, you ask? A marker (a well understood sequence like a stack sentinel, e.g. 0xDEADBEAD) could be prefixed or suffixed and asserted on when CONFIG_ASSERT=y.
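The sentinel idea can be sketched in plain C. Everything below is hypothetical (the alignment constant, the macro name, and the marker value are ours, not from devicetree): the macro prefixes the payload with a marker word, and a driver-side check validates that the pointer it was handed came from the macro:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DMA_BUF_ALIGN  32          /* assumed alignment requirement */
#define DMA_BUF_MARKER 0xDEADBEADu /* sentinel, as suggested above */

/* Hypothetical DT_BUF_DEFINE: store a marker word, then an aligned payload.
 * The payload's alignment forces the marker+padding to occupy exactly
 * DMA_BUF_ALIGN bytes, so the marker sits at payload - DMA_BUF_ALIGN. */
#define DT_BUF_DEFINE(buf_name, buf_len)                                    \
	static struct {                                                     \
		uint32_t marker;                                            \
		__attribute__((aligned(DMA_BUF_ALIGN))) uint8_t payload[buf_len]; \
	} buf_name##_storage = { .marker = DMA_BUF_MARKER };                \
	static uint8_t *const buf_name = buf_name##_storage.payload

/* Driver-side validity check, e.g. inside __ASSERT() when CONFIG_ASSERT=y. */
static int dma_buf_is_valid(const uint8_t *p)
{
	uint32_t m;

	memcpy(&m, p - DMA_BUF_ALIGN, sizeof(m)); /* read the marker word */
	return m == DMA_BUF_MARKER;
}

DT_BUF_DEFINE(audio_rx_buf, 128);
```

This catches the "caller passed a random stack buffer" programming error cheaply, without any runtime allocation.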
That's the whole point of applying macros and device tree to the problem. To make it portable.
No, I do not agree; I don't think we've fully explored options to statically allocate buffers correctly.
Thanks, indeed it would cover many more cases. I'll reiterate some and discuss a few more examples to see if we have a common understanding of how this macro would change the Zephyr codebase.

blinky_pwm

I believe if a PWM peripheral uses DMA then the PWM API user could not create a buffer and fill it with

It would mean that the allocation of buffers by the API user would be available only for the device drivers that are streaming data (UART, SPI, I2C, USB, radios, ...), but for other drivers it would be the driver's responsibility to allocate a bounce buffer and fill in the necessary data like

UART (again) with a dynamic payload offset in DMA buffer

Some peripherals have a dynamic structure of data consumed by DMA. One example would be UART, which requires the following structure if the transfer is shorter than 255:
and following structure if the transfer is longer:
Do you think we could somehow capture it in a macro, or would we need to treat such peripherals as exceptions and let device drivers allocate bounce buffers?

UART transferring constant data

Let's discuss an application that transfers constant byte-strings using UART. Currently, the application code would define these byte-strings as something like:
Potentially placing data in NVM where some DMA have access, but others not necessarily. With the new macro definition, I understand the code would need to be refactored to something like
which is functionally correct. However, it would cause unnecessary RAM buffer allocation and

spi_flash

The SPI flash sample uses SPI (potentially with DMA) indirectly: I don't think it should be done by

An alternative would be to define a bounce buffer in
It seems like there's some special metadata header that needs to exist inline with the data in the cases you've laid out. Fair enough, lets work through it.
This seems like a pretty special case, and perhaps a specialized PWM set of buffer setup macros would be useful that understand the period and pulse. We have C and while that limits our ability to express some compile time things, I think this could be done with PWM specific macros. Is there some datasheet/ref manual or driver in the tree that does this today I could look at?
Why is a bounce buffer needed here vs multiple length prefixed transfers in a list? Heap allocations and memcpy aren't free. Is there some hardware datasheets or drivers I can look at?
That's true today too with bounce buffers, isn't it?
In this case wouldn't the flash device tree node define things appropriately for applications? I'm a little lost on this example in particular.
I would like to highlight here that every single peripheral in the system could be a special case like this. It is possible that any peripheral, including basic ones like PWM, GPIO, or timers, can use DMA internally to offload a CPU from accessing peripheral registers.
Do I understand correctly that it would require deprecating
Solutions like this would cause an overhead I described in "UART transferring constant data" for every single device driver.
An example of PWM using DMA to load pulse values is here: https://infocenter.nordicsemi.com/topic/ps_nrf5340/pwm.html?cp=4_0_0_6_22
I'm not sure if a bounce buffer is needed. However, if the transfer size is determined in run-time, I cannot imagine a macro that could set the size and the offsets in the static buffer correctly. Maybe it would be possible with more complicated macros.
I'm not aware of any publicly available datasheet or driver. Assuming an 802.15.4 radio's DMA expects PHR immediately followed by PSDU in RAM (like nRF5 radios), for 802.15.4 MSK PHY described in 17.1.1 of IEEE 802.15.4-2020, the length of PHR (encoding of the frame length) is one or two bytes long depending on the payload (PSDU) length. It's similar to the example I drafted in the previous comment.
Some SoCs require special handling of RAM buffers for DMAs, others do not. Currently, Zephyr works perfectly fine with devices that do not require any special handling. DMAs in devices like this can access

The RFC created by @carlocaione and this PR aim to design a systematic way to support SoCs requiring special handling of DMA RAM buffers. To answer your question: yes, in these more complicated devices the additional buffer overhead and

But my point is that we should avoid creating this overhead for SoCs which work fine without it. I'm curious if we could define a driver API that allows its user to avoid the discussed overhead on simple SoCs, but still statically allocate buffers to avoid bounce buffers on complex SoCs.
I'll try to show my concern with a little more detailed example.

Currently, the spi_flash sample contains code like:

const uint8_t expected[] = { 0x55, 0xaa, 0x66, 0x99 };
uint8_t buf[sizeof(expected)];
rc = flash_write(flash_dev, SPI_FLASH_TEST_REGION_OFFSET, expected, len);
rc = flash_read(flash_dev, SPI_FLASH_TEST_REGION_OFFSET, buf, len);

The

uint8_t buf[5] = { 0 };
struct spi_buf spi_buf[2] = {
{
.buf = buf,
.len = 1,
},
{
.buf = data,
.len = length
}
};

I understand the spi_nor.c code would need to be modified a little to use a static allocator, like:

DT_BUF_DEFINE(&driver_cfg->spi, buf, 5);
memset(buf, 0, 5);
struct spi_buf spi_buf[2] = {
{
.buf = buf,
.len = 1,
},
{
.buf = data,
.len = length
}
};

However, it's not clear to me how the

One option would be to allocate a buffer in the

const uint8_t expected[] = { 0x55, 0xaa, 0x66, 0x99 };
DT_BUF_DEFINE(/* how to get spi dev, if it's decoupled from flash API user? */, expected_buf, sizeof(expected));
memcpy(expected_buf, expected, sizeof(expected));
rc = flash_write(flash_dev, SPI_FLASH_TEST_REGION_OFFSET, expected_buf, len);

But to make it work we would need to increase coupling between software modules so that the flash API user gets access to the bus device to which the flash module is connected. I don't think increasing coupling like this is a good idea.

Another solution would be to define a bounce buffer in spi_nor.c:

DT_BUF_DEFINE(&driver_cfg->spi, bounce_buf, length);
memcpy(bounce_buf, data, length);

But this solution relies on a bounce buffer, which is exactly what we are discussing how to avoid.
My 2 cents. In short, I think that the problem here is that there are cases where the application layer has no idea about the requirements at driver/hardware level for buffer allocation/overhead/metadata/etc. Now, in many cases this information can indeed be obtained from DT, but there are cases where the information (for example the overhead, the internal buffer organization, etc.) is only known at run-time and not compile-time. For all these cases a static buffer allocation is probably not optimal (because we would end up introducing a different macro for each exception) or not achievable at all, so we must resort to a dynamic allocation of sorts (aka bounce buffers). Now, I understand @andyross's and @teburd's concerns but I don't see an easy way out. If the buffer cannot be statically defined at the app layer, then it must be allocated by drivers and filled up with a user buffer provided by the app (when needed, plus all the other needed DMA bits and pieces).
That's what should be solved, making that available to the application.
All the cases in Zephyr I've seen thus far can be solved with a macro that does just that. Please help me understand where this is coming from.
To repeat what @andyross said, this isn't Linux. To have a setup where the default would be to malloc+memcpy silently on a peripheral API call, just because the application author didn't have the tools to set up a buffer right, is massively wasteful and I'll -1 it all day. If you have special hardware with special needs because it has some unique logic as being described here, that to me seems like a vendor-specific problem, not a Zephyr problem. It should be solved in a vendor-specific way, not in a broad way that implies this is how Zephyr wishes to handle DMA buffers generally by default. I'm happy to be wrong if examples can be found and given, or some docs can be linked, so I can better understand what problem is trying to be solved here. I haven't seen any good examples in the tree after some grepping around. UART:
SPI:
I2C:
FLASH: (harder to grep for memcpy as many peripherals allow for mapping the flash into a bus address space...)
Happy to be shown something different, but I just don't see where this problem actually is. I do see the problem with setting up aligned and placed buffers. That is a well-trod problem; I think anyone who's written a test case with DMA involved can attest to it. But a macro could solve it. I've linked to the ones I know of first-hand already.
Agreed. But from the analysis I presented in previous comments, it seems forcing the application authors to use

I'm curious if we can modify any of these solutions, or find a new one, that is not wasteful both in systems where DMA can use any buffers and in systems where DMAs have special needs.
Driving by again: I don't have the DMA depth of expertise, but I'm still with @tomb here. This seems like a needlessly heavyweight solution for a problem that's well-constrained and solvable statically for almost everything we're talking about. Note also that there's a natural abstraction layering here that's being violated: it's possible (if not trivial) to take an API designed for static allocation and indirect it at runtime with a dynamic buffer. It's not possible to magically make a runtime allocation into a static one in the general case (cf. the undecidability of the Halting Problem, Marvin Minsky et al.). This is what I meant upthread about "not being able to go back". My general worry is that we'll continue to see use on devices and applications where static buffer allocation is a natural fit (SOF is exactly such an app, even though right now it has its own heap to allocate them), and they'll continue to use some kind of backdoor anyway to avoid the dynamic overhead. Let's not tempt them?
Maybe it will be easier to find common ground if we decide on the basic expected properties of the system we would like to get. I believe for simplicity we can discuss an application that sends two strings through UART:
Both buffers are configured statically by an application. Now, let's consider two types of SoCs for which we compile this application
For STM32 we would like to:
For nRF52 we would like to:
Do you agree these are the system properties we would like to get in a single portable application?
* @retval 0 on success
* @retval other errno codes depending on the implementation
*/
int dmm_release_buffer_for_dma_out(const struct device *dev, uint32_t flags,
If releasing a buffer, why do I need to know the flags?
* @retval 0 on success
* @retval other errno codes depending on the implementation
*/
int dmm_get_buffer_for_dma_out(const struct device *dev, uint32_t flags,
Would it be possible to define where the buffer comes from? Say a memory pool pointer, or just a fixed physical address?
* @param[out] dma_handle DMA handle.
* @param[in] buffer User-provided buffer.
* @param[in] buffer_size Size of the user-provided buffer.
*
Same as out: can we add a parameter for where the buffer is coming from?
I was thinking a little about the discussion we had during the last arch meeting. We would like to achieve multiple quality attributes of the system:
Changing priorities of these attributes leads to different solutions. I'm wondering if we could merge (make available) all proposed solutions, to make it possible for the application developer to choose which of the attributes is the highest priority for their application. Here is what came to my mind, looking at the example we discussed: an audio application getting audio data from SPI and sending it to an I2S output.

Proposal 1: drivers responsible for managing buffers

The application code would be something like this:

User-friendliness is perfect - the code is easy to write, understand, and maintain. Performance varies: in some SoCs

Proposal 2: application responsible for managing buffers

Code example:

User-friendliness is a little worse - using macros to allocate buffers is not intuitive. Performance is worse: we always have at least one

Merged proposal: the application may manage buffers using macros, but the responsibility stays in the drivers

With this approach, the application could be implemented in multiple ways. One example would be exactly like "Proposal 1", another exactly like "Proposal 2", sharing all of their attributes. One more, which I think is the most interesting, would be:

User-friendliness is similar to "Proposal 2" - using unintuitive macros to allocate buffers. Performance is perfect - the SoC-specific code in the drivers will ensure that we avoid

But what I think is most important in this solution is that we do not enforce "stability over performance" or "performance over user-friendliness". It's the application developer who decides which property is the most important for their application. Other benefits of the merged proposal are:
Required changes:
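The merged proposal's key property - drivers own the responsibility but skip all work when the caller already provides a conformant buffer - can be sketched as a driver-internal helper. This is our illustration, not code from the PR; the alignment check stands in for whatever per-SoC requirements (region, padding, metadata) a real driver would test:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DMA_ALIGN 32 /* assumed SoC requirement; effectively 1 on simple SoCs */

/* Hypothetical driver-internal mapping for an outbound transfer: if the
 * caller's buffer already meets the DMA requirements (e.g. it was created
 * with a DT_BUF-style macro), use it directly with zero copies; otherwise
 * fall back to an allocated bounce buffer. `bounced` tells the release path
 * whether a free is needed afterwards. */
static void *dma_map_out(const void *buf, size_t len, int *bounced)
{
	if (((uintptr_t)buf % DMA_ALIGN) == 0 && (len % DMA_ALIGN) == 0) {
		*bounced = 0;
		return (void *)buf;        /* zero-copy fast path */
	}

	size_t padded = (len + DMA_ALIGN - 1) & ~(size_t)(DMA_ALIGN - 1);
	void *bounce = aligned_alloc(DMA_ALIGN, padded);

	if (bounce != NULL) {
		memcpy(bounce, buf, len);  /* slow path: copy to bounce buffer */
		*bounced = 1;
	}
	return bounce;
}
```

With this shape, a portable application pays the memcpy only when it did not (or could not) allocate a conformant buffer, which is exactly the "developer chooses the trade-off" property argued for above.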
This is a hopefully-to-be-discussed proposal for an actual implementation of what was discussed in #57220 (please read that before keep reading).
TL;DR: Zephyr is currently missing a way to allocate and prepare buffers to be used by DMA engines and DMA-enabled peripherals on platforms with a complex memory organization. This PR is a proposal for a subsystem interface dealing with that.