Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: drivers: stm32: SPI: Check that SPI buffers are in a nocache region #57786

Merged
merged 1 commit into from
Jul 28, 2023

Conversation

dgastonochoa
Copy link
Contributor

@dgastonochoa dgastonochoa commented May 11, 2023

This is a proposal for adding a memory region to the STM32H7 SoCs to be used by DMA peripherals.

In H7, DMA cannot access certain memory regions. A peripheral using DMA will not work correctly if the provided read/write buffers are in the wrong memory regions. This can lead to confusing problems, e.g. if SPI is used with DMA but with buffers at the wrong addresses, SPI will generate its clock bu it will not send anything. Furthermore, if the buffers are in a valid region but the latter is not configured as nocache, SPI will still generate its clock but send all zeroes, or even worse, it will send the data that was already present in that memory region (for instance, if it was initialised at startup time) but it won't send new values written to such region.

This PR reclares a memory region in SRAM3, 2 or 1 (depending on which one is available) and sets them as nocache. This is equivalent to what is already done for Ethernet in H7.

This intents to be the first of 2 PRs: if this gets approved, the plan is to upload another PR to verify, within the stm32 SPI driver, that the given send/receive buffers are within the correct regions, or return an error otherwise, to warn the user.

How to test

Below there is a file which is a Zephyr application called 'dma-test' that can be placed in zephyr/samples/basic, then built and run to test the PR. It has been tested on a nucleo-h743zi board. It sends some data over SPI (with DMA enabled), then it modifies that data, and then it sends it again to verify DMA got the changes.

It also contains a test to see what happens if a buffer, which is placed in an invalid memory region, is sent. For now, SPI does not return an error. It tries to send it, but no data will be actually sent as explained above.

dma-test.tar.gz

You can build it with: west build -b nucleo_h743zi zephyr/samples/basic/dma-test

Here is a logic analyzer capture:

asd

Above you can see how the first 2 data chunks send the expected data but the last one sends garbage, as the buffer is in the wrong regions.

@zephyrbot zephyrbot added the platform: STM32 ST Micro STM32 label May 11, 2023
@dgastonochoa dgastonochoa changed the title soc: arm: stm32h7: Add RAM sections for DMA RFC: soc: arm: stm32h7: Add RAM sections for DMA May 11, 2023
@dgastonochoa dgastonochoa force-pushed the dma-mem-degion branch 5 times, most recently from 178c7ec to 2b12ac4 Compare May 11, 2023 12:36
@erwango
Copy link
Member

erwango commented May 11, 2023

Thanks for this proposal, this problem has been identified since some time, see #36471 and #57220.
Note that the solution you proposed was evoked in #36471 with following evaluation: "this is a significant constraint to map user buffers in this NoCache area, else memcopy are also needed."

I suggest to follow #57220 going forward.

This being said, I'll try to review the proposal for the sake of discussion

@erwango erwango added the RFC Request For Comments: want input from the community label May 11, 2023
@dgastonochoa
Copy link
Contributor Author

Hi @erwango, would this change make sense as short-term solution until #57220 is carried on?

@erwango
Copy link
Member

erwango commented May 11, 2023

Hi @erwango, would this change make sense as short-term solution until #57220 is carried on?

Did you check #52965 ?
For short term, I'd prefer we follow the same pattern (if possible, which I think it is)

@dgastonochoa
Copy link
Contributor Author

I see, please let me double check if I have understood it correctly:

I would need to enable DMA and declare a memory region as nocache in a DTS overlay file, as tests/drivers/adc/adc_dma/boards/nucleo_h743zi.overlay does, e.g.:

/* ADC driver expects a buffer in a non-cachable memory region */
&sram4 {
	zephyr,memory-region-mpu = "RAM_NOCACHE";
};

Then, in my app. source file, I would declare the buffer to be used with DMA to be placed in "SRAM4.dma" (SRAM4 or whatever region I declare as nocache). This would all be application-level, so it must not be in the PR.

The PR should do something similar to drivers/adc/adc_stm32.c::address_in_non_cacheable_sram but in SPI (which accomplishes to what my planned second PR was going to do). Is this correct?

If so, is there any good common place to put the address_in_non_cacheable_sram function, so it's accessible for all drivers? I would like to avoid copying it to the SPI driver.

@dgastonochoa dgastonochoa requested a review from erwango May 15, 2023 14:28
@erwango
Copy link
Member

erwango commented May 15, 2023

If so, is there any good common place to put the address_in_non_cacheable_sram function, so it's accessible for all drivers? I would like to avoid copying it to the SPI driver.

Maybe something like soc/arm/st_stm32/stm32h7/mpu_regions.c?

@dgastonochoa
Copy link
Contributor Author

I haven't found any place that seems right to put this function (mpu_regions.c has not a public header). Also, modifying the ADC driver to take it from a common place would require further testing. Thus, I have placed the new logic in the SPI driver directly.

@dgastonochoa
Copy link
Contributor Author

As per the comments above, this RFC will change its approach to something similar to what #52965 does, that is, it declares the nocache region in DT and then the driver in question checks whether or not the buffer/s received as a parameter are in a nocache memory region.

Therefore, this PR will now cover the second PR mentioned in the original descriptuion, that is, the stm32 SPI driver will return an error if the provided buffers are not in a valid memory region (only for H7, with DMA enabled).

Notice that the problem of detecting if the buffers are in a memory area accessible by DMA (regardless it is nocache or not) remains.

How to test

The changes in the PR have required changes in the tests. Below you can find the new test sample:

dma-test.tar.gz

The results are here:

spi

As you can see, now only the valid transactions are made. The invalid one is not carried on as the SPI driver detects the problem and returns an error.

@dgastonochoa dgastonochoa changed the title RFC: soc: arm: stm32h7: Add RAM sections for DMA RFC: drivers: stm32: SPI: Check that SPI buffers are in a nocache region May 16, 2023
@dgastonochoa dgastonochoa force-pushed the dma-mem-degion branch 6 times, most recently from 929f446 to 4c17688 Compare July 25, 2023 18:21
Comment on lines 7 to 19
#include "spi.h"

bool spi_buf_set_in_nocache(const struct spi_buf_set *bufs)
{
for (size_t i = 0; i < bufs->count; i++) {
const struct spi_buf *buf = &bufs->buffers[i];

if (!dt_buf_in_nocache((uintptr_t)buf->buf, buf->len)) {
return false;
}
}
return true;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this static inline and move this function into include/zephyr/drivers/spi.h so to not have at all spi.c and spi.h from this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make this static inline and move this function into include/zephyr/drivers/spi.h so to not have at all spi.c and spi.h from this PR?

It was static inline before and I moved it because it was requested in comments above, and I agreed as I prefer to have implementations in C files since tests are simpler.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, I disagree. Otherwise we should have i2c.c, i2s.c, etc ... for each bus with only one single function in it?

All the bus helpers are shipped in the header file for that bus.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with @carlocaione this should be in spi.h as a static inline

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong on many levels. See the other comments but the point is that we must try to keep all this work contained in the spi layer.

If we want to have generic helpers to work with memory attributes this can be done but this must support all the attributes in a generic way, we cannot stick to only one single attribute. It also a much complex and bigger work that can be addressed in a different PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong on many levels. See the other comments but the point is that we must try to keep all this work contained in the spi layer.

That is what I had in my first commit about 2 months ago, before getting several requests to change it and make it generic. This is explained in the first and second descriptions so I won't dive into it anymore.

Copy link
Collaborator

@carlocaione carlocaione Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the problem is that this is not generic at all, you are only dealing with the nocache attribute. If you want to add generic helpers for the memory attributes you must do that supporting any kind of attribute, not just one, otherwise at minimum this is not scalable at all.

Comment on lines 38 to 42
struct dt_mem_region {
const uintptr_t start;
const uintptr_t end;
bool isnocache;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dt_mem_region is a generic struct but carrying a isnocache attribute? What about the other attributes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mentioned before I was doubtful about leaving the attributes string, but I didn't have a strong opinion on it so I wanted to receive some feedback.

Comment on lines 66 to 72
#define DT_GET_MEM_REGION(node_id) \
{ \
.start = (const uintptr_t)DT_REG_ADDR(node_id), \
.end = \
(const uintptr_t)(DT_REG_ADDR(node_id) + DT_REG_SIZE(node_id)) - 1, \
.isnocache = _DT_IS_NOCACHE_MEM_REGION(node_id) \
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a generic helper DT_GET_MEM_REGION is only dealing with the RAM_NOCACHE attribute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same explanation as the above comment.

Comment on lines 83 to 85
const struct dt_mem_region mem_regions[] = {
DT_MEMORY_ATTR_FOREACH_NODE(DT_GET_MEM_REGION)
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hard no. You are actually allocating RODATA by only including an header file whether you use it or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this was in the SPI driver source file only until I got requested to move to a generic place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, an header is definitely not a proper place for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice work but (sorry) totally useless at the moment. We can have generic helpers to deal with memory attributes but that is a totally different work. If you want to deal with SPI it is fine as long as you keep everything contained in the SPI layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to deal with SPI it is fine as long as you keep everything contained in the SPI layer.

That is exactly what I did in my first commit. This is already going to much out of scope for me. Can you please @erwango , @teburd and @carlocaione agree whether you want this to be generic or self-contained in the SPI driver? I repeat, my only purpose for this PR is to protect the SPI driver transceive_dma function to return an error in case of a buffer is in a no-nocache memory area, which I think it's a good idea given that otherwise the SPI driver behaviour is very confusing. I never intended to make this a generic solution.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale here is simple:

  • If you want to introduce helpers in include/zephyr/devicetree/mem.h these helpers must be generic and deal with all kind of memory attributes, not just SPI and not just CACHE / NOCACHE. This is a pretty big and complex work.
  • If you are only interested in SPI + CACHE / NOCACHE, then everything must be self-contained in the SPI layer and possibly header file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are only interested in SPI + CACHE / NOCACHE, then everything must be self-contained in the SPI layer and possibly header file.

I am only interested in SPI + CACHE / NOCACHE. Moreover, I'm interested in STM32 SPI only, so please can we agree to go back to the scope I proposed initially and modify only spi_ll_stm32.c to perform this check? Not even the header file, as this will require to allocate the memory region array in it as you mentioned. I want to restrict it to solely STM32 SPI.This is the approach that was adopted in #52965.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgastonochoa If there's a solution fitting between @carlocaione and @teburd that will be fine with me.

Copy link
Collaborator

@teburd teburd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the complaints here @carlocaione, the dt macros and types can easily be extended and encode at this moment the most useful information, the start, end, and cached attributes. This seems like it could be extended further as needed?

Or is the issue that its working on a struct rather than using DT macro wrappers for getting each attribute of a memory region?

Unclear to me what the problem is.

Comment on lines 7 to 19
#include "spi.h"

bool spi_buf_set_in_nocache(const struct spi_buf_set *bufs)
{
for (size_t i = 0; i < bufs->count; i++) {
const struct spi_buf *buf = &bufs->buffers[i];

if (!dt_buf_in_nocache((uintptr_t)buf->buf, buf->len)) {
return false;
}
}
return true;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with @carlocaione this should be in spi.h as a static inline

@carlocaione
Copy link
Collaborator

@teburd

Unclear to me what the problem is.

There are several problems:

  1. This is not scalable, at all. The current solution is implying that you have one check function per attribute. This should be generic, with the attribute as parameter.
  2. Yes, this is incomplete. The problem is that as soon as you start thinking how to extend this and how to make it generic, you will soon realize that this approach is too simplistic (been there, done that).
  3. include/zephyr/devicetree/mem.h is not the correct place for functions like dt_buf_in_nocache. That function is not related to DT at all, in there you should only find DT macros. See for example the difference between include/zephyr/devicetree/mbox.h and include/zephyr/drivers/mbox.h.
  4. The header file is using a const struct array built at compile-time. If anything, that array should be passed as parameter and the function only checking that.

I'm fine with the general idea of this PR, but this is not the correct (and complete) solution to a complicated problem. This problem cannot be solved with DT macros only or DT helpers, unfortunately something resembling a subsystem must be introduced and I'm working on that. If this work is kept in the realm of SPI (or platform specific even) I have no complains and this is what I'm suggesting here.

@teburd
Copy link
Collaborator

teburd commented Jul 26, 2023

@teburd

Unclear to me what the problem is.

There are several problems:

  1. This is not scalable, at all. The current solution is implying that you have one check function per attribute. This should be generic, with the attribute as parameter.

e.g. DT_MEM_HAS_ATTR(DT_NODE(some_mem_region), NOCACHE) kind of thing?

  1. Yes, this is incomplete. The problem is that as soon as you start thinking how to extend this and how to make it generic, you will soon realize that this approach is too simplistic (been there, done that).

Broadly... simpler things tend to win out

  1. include/zephyr/devicetree/mem.h is not the correct place for functions like dt_buf_in_nocache. That function is not related to DT at all, in there you should only find DT macros. See for example the difference between include/zephyr/devicetree/mbox.h and include/zephyr/drivers/mbox.h.
  2. The header file is using a const struct array built at compile-time. If anything, that array should be passed as parameter and the function only checking that.

Agree, should be static const in a .c and used with an extern in the header, no argument there.

I'm fine with the general idea of this PR, but this is not the correct (and complete) solution to a complicated problem. This problem cannot be solved with DT macros only or DT helpers, unfortunately something resembling a subsystem must be introduced and I'm working on that. If this work is kept in the realm of SPI (or platform specific even) I have no complains and this is what I'm suggesting here.

Understood, I guess I'll be interested to see what that looks like. Until then I think we should give the PR author a clear answer. It sounds like you are working on a more general proposal to solve this which is great. Given that I don't think the requirement to generalize here in this PR makes sense. So @dgastonochoa it seems like the simplest path forward is to revert back to what you had initially. Sorry about that! I know this idea is more broadly useful to other platforms (Intel and NXP have similar things) and so I was hopeful we could get the start of something going here. I guess @carlocaione will have something for us to peruse soon in that regard. Looking forward to it!

@carlocaione
Copy link
Collaborator

e.g. DT_MEM_HAS_ATTR(DT_NODE(some_mem_region), NOCACHE) kind of thing?

Yeah, but the parameters that you want the function / macro to take are address, size and attribute of the region you want to check. So this is not really a DT_MEM_HAS_* kind of thing but possibly a real function mem_check_buf(void *addr, size_t size, int attr) where the address is possibly a virtual address while the DT addresses are physical addresses, so you have also some MMU complexity in the mix.

This, together with your correct other comment:

Agree, should be static const in a .c and used with an extern in the header, no argument there.

means (to me) that this is a problem that cannot be exclusively fixed at include/zephyr/devicetree/ level, but this requires a proper library / subsystem that is basically gathering all the info coming from DT about memory regions and act on this information using a proper API.

I'm currently working on that, I will have something ready in the next few days.

@dgastonochoa
Copy link
Contributor Author

dgastonochoa commented Jul 26, 2023

So @dgastonochoa it seems like the simplest path forward is to revert back to what you had initially.

This will involve to move all the logic to spi_ll_stm32.c, nothing will be in headers, not even in spi.h, as it is done in other drivers. Also, all logic will be restricted to be used only for stm32h7. @carlocaione @teburd could you please confirm you are both happy with this?

@carlocaione
Copy link
Collaborator

So @dgastonochoa it seems like the simplest path forward is to revert back to what you had initially.

This will involve to move all the logic to spi_ll_stm32.c, nothing will be in headers, not even in spi.h, as it is done in other drivers. @carlocaione @teburd could you please confirm you are both happy with this?

Yes, I will be happy with that. Eventually we can then move to the API I'm preparing if that is accepted upstream.

@teburd
Copy link
Collaborator

teburd commented Jul 26, 2023

So @dgastonochoa it seems like the simplest path forward is to revert back to what you had initially.

This will involve to move all the logic to spi_ll_stm32.c, nothing will be in headers, not even in spi.h, as it is done in other drivers. Also, all logic will be restricted to be used only for stm32h7. @carlocaione @teburd could you please confirm you are both happy with this?

Yes that's fine

@dgastonochoa
Copy link
Contributor Author

@carlocaione @teburd all changes applied. Could you review them please?

drivers/spi/spi_ll_stm32.c Outdated Show resolved Hide resolved
.start = (const uintptr_t)DT_REG_ADDR(node_id), \
.end = \
(const uintptr_t)(DT_REG_ADDR(node_id) + DT_REG_SIZE(node_id)) - 1, \
.isnocache = IS_NOCACHE_MEM_REGION(node_id) \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you are interested only in the NOCACHE regions, why not filtering out the other regions when constructing mem_regions[]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean using COND_CODE_1 in combination with IS_NOCACHE_MEM_REGION to not add "no nocache" entries to the array? That would be ideal, I will try.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

drivers/spi/spi_ll_stm32.c Show resolved Hide resolved
Copy link
Collaborator

@carlocaione carlocaione left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits but we are getting there

Comment on lines +57 to +60
#define IS_NOCACHE_MEM_REGION(node_id) \
COND_CODE_1(DT_ENUM_HAS_VALUE(node_id, zephyr_memory_attr, RAM_NOCACHE), \
(1), \
(0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better way to do this: #60850

const struct mem_region *mem_reg = &nocache_mem_regions[i];

const bool buf_within_bounds =
(buf >= mem_reg->start) && ((buf + len_bytes - 1) <= mem_reg->end);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why you do not remove the -1 from mem_reg->end and just:

(buf + len_bytes) < mem_reg->end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this on purpose. In general terms, this is checking if a number X is within a closed interval. If you would need to do this, you would naturally use <= and >=. I wanted to follow this pattern as it seems more intuitive. Otherwise, the bool variable is called "within_bounds" but it seems to not be doing exactly that. It seams more clear to me this way. In other contexts (most of them possibly), I would do as you say.


#define GET_MEM_REGION(node_id) \
{ \
.start = (const uintptr_t)DT_REG_ADDR(node_id), \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this cast is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

DMA only works with non-cached memory regions in H7. Check them
and return an error if they don't match this condition.

Signed-off-by: Daniel Gaston Ochoa <[email protected]>
@carlescufi carlescufi merged commit 1b3e2d9 into zephyrproject-rtos:main Jul 28, 2023
15 checks passed
@dgastonochoa dgastonochoa deleted the dma-mem-degion branch July 28, 2023 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Devicetree area: DMA Direct Memory Access area: SPI SPI bus platform: STM32 ST Micro STM32 RFC Request For Comments: want input from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants