Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MdeModulePkg/PciHostBridgeDxe: Add MemoryFence after write. #6031

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

apop5
Copy link
Contributor

@apop5 apop5 commented Aug 1, 2024

Description

On AARCH64, there is no ordering guarantee between configuration space (ECAM) writes and memory space reads (MMIO). ARM AMBA CHI only guarantees ordering for reads and writes within a single address region, however, on some systems MMIO and ECAM may be split into separateaddress regions.

A problem may arise when an ECAM write is issued a completion before a subsequent MMIO read is issued and receives a completion.

For example, a typical PCI software flow is the following:

  1. ECAM write to device command register to enable memory space
  2. MMIO read from device memory space for which access was enabled in step 1.

There is no guarantee that step 2. will not begin before the completion of step 1.
on systems where ECAM/MMIO are specified as separate address regions, even
if both spaces have the memory attributes device-nGnRnE.

  • Add a barries after the final PCI Configuration space write in RootBridgeIoPciAccess. Configuration space reads should not have side-efects.

  • When configuration space is strongly ordered, this ensures that program execution cannot continue until the completion is received for the previous Cfg-Write, which may have side-effects.

  • Risk of reading a "write-only" register and causing a CA which leaves the device unresponsive. The expectation based on the PCI Base Spec v6.1 section 7.4 is that all PCI Spec-defined registers will be readable, however, there may exist design-specific registers that fall into this category.

  • Breaking change?

    • Breaking change - Does this PR cause a break in build or boot behavior?
    • Examples: Does it add a new library class or move a module to a different repo.
  • Impacts security?

    • Security - Does this PR have a direct security impact?
    • Examples: Crypto algorithm change or buffer overflow fix.
  • Includes tests?

    • Tests - Does this PR include any explicit test code?
    • Examples: Unit tests or integration tests.

How This Was Tested

AARCH64 platform on real silicon

Integration Instructions

N/A

@github-actions github-actions bot requested review from lgao4 and niruiyu August 1, 2024 23:45
@makubacki
Copy link
Member

Bringing a prior discussion of this patch to attention https://edk2.groups.io/g/devel/topic/102435564.

@mdkinney based on his feedback there.

@lgao4
Copy link
Contributor

lgao4 commented Aug 13, 2024

@apop5 , this change is reviewed before soft feature freeze. Do you request it to catch this stable tag 202408?

Copy link
Member

@niruiyu niruiyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 11/8/23 02:06, Michael D Kinney wrote:
Hi Jose,

  1. This logic needs to move into an AARCH64 specific directory/file.
    Other architectures handle this in other ways.
  2. There are many places throughout edk2 sources that perform PCI
    config write operations. You are only fixing it in a single
    location. You may want to look at the MdePkg PciLibs to see if it
    can be addressed there with an AARCH64 specific dir/file, but that
    still may not address all possible PCI config write accesses. Fill
    analysis of the target platform sources may be required to make sure
    it is fixes for all locations.
    Hm.

Perhaps if this change is pushed down as much as possible, into a
special AARCH64 PciLib instance, so that the readback or the DSB (IIRC)
is performed after every PCI config space write, then that might
actually suffice. Because, ultimately, PciHostBridgeDxe too depends on
PciSegmentLib for the access sycles.

(Previously I suggested extending PciHostBridgeLib with a new API, but
that wasn't right, per your second point -- I didn't realize the same
issue must exist (for example) in the PEI phase, before the PCI-related
protocols even exist.)

Laszlo

@ardbiesheuvel
Copy link
Member

I don't think inventing a whole new library hierarchy for this (and the associated maintenance burden of keeping those mostly identical implementations in sync) is worth the hassle. Also, I don't see the problem of having a specific fix in this location, rather than creating new MMIO accessors for AARCH64 specifically that always issue barriers after MMIO writes.

The case where one MMIO region controls whether other MMIO regions are actively decoding their address ranges is highly specific to a PCI host bridge, and drivers that access the ECAM space directly instead of going through this abstraction are violating the layering already.

So what we'll need to address here is the fact that this readback, while most likely harmless in practice, is unnecessary on systems with strict MMIO ordering (i.e., X64) and therefore undesirable.

Could we perhaps address this by introducing a fixed/feature PCD with appropriate default values on all architectures? E.g., declare PcdArchHasWeaklyOrderedMmio as TRUE on AARCH64 and FALSE everywhere else, and guard the added code by testing for this PCD?

@mdkinney
Copy link
Member

Adding a PCD is equivalent to adding an internal API with arch specific implementation in an arch specific file. The advantage of the file one is to prevent the introduction of another PCD. Adding PCDs should be minimized because it just increases the combinations that need to be validated when there is maintenance in code that depends on the PCD setting.

In my feedback on the mailing list, I noted that the change in this one file does not fix all places that PCI writes occur.

I will even note that this one change seems like it could be incomplete because the API modified supports both single write and writing a buffer. What if a read is required for every write in the buffer.

If this is special behavior required for writing a specific PCI register or registers, then why can't the code that calls the API modified in this PR do the additional read after the write?

@ardbiesheuvel
Copy link
Member

ardbiesheuvel commented Aug 30, 2024

Adding a PCD is equivalent to adding an internal API with arch specific implementation in an arch specific file. The advantage of the file one is to prevent the introduction of another PCD. Adding PCDs should be minimized because it just increases the combinations that need to be validated when there is maintenance in code that depends on the PCD setting.

In my feedback on the mailing list, I noted that the change in this one file does not fix all places that PCI writes occur.

Fair.

I will even note that this one change seems like it could be incomplete because the API modified supports both single write and writing a buffer. What if a read is required for every write in the buffer.

Because we want the interconnect order to follow the program order, and the program cannot expect any side effects to be visible while the API call is in progress.

What happens here is that an ECAM write is followed by a MMIO read or write (in program order), but due to interconnect reordering, the accesses arrive at the PCIe host bridge in the opposite order. This may result in the MMIO window to be accessed before the ECAM write that enables this MMIO window has taken effect.

If this is special behavior required for writing a specific PCI register or registers, then why can't the code that calls the API modified in this PR do the additional read after the write?

The requirement is for CPU accesses to MMIO regions not to be reordered. On ARM, the architecture only guarantees this inside a single page, which is why this is usually not an issue: device drivers that manipulate MMIO registers don't typically manage disjoint MMIO regions like this, it is a case that is only common for PCIe host bridges that decode different, disjoint memory regions.

Updating the callers would involve updating all generic host controller drivers (USB, NVMe, SATA, etc) to issue a barrier after enabling MMIO decoding in the Start() handler, so that subsequent accesses to the control registers in the MMIO window will not be issued before the ECAM write that makes it accessible.

This is why the rootbridge IO API is an appropriate place to address this, although I have to concede that there may be other places that need to be fixed as well (for PCIe users in PEI, for instance) <-- the PEI abstractions appear to rely on platform drivers that hide the PCIe specifics, so no changes should be needed there

@leiflindholm
Copy link
Member

This isn't a workaround for an AArch64 quirk though. This is an operation that happens not to be mandatory on x86. And while the commit message talks about AArch64, likelihood is this will apply to any new architecture (including RiscV64, Loongarch) added.

I understand the argument that this isn't the kind of thing Pcds are for. And indeed, it would be at best ugly to duplicate Pcd definitions (with different defaults based on architecture).

From a side chat with Ard, we're thinking this readback could be replaced by a MemoryFence(), which is already going to be folded out by the toolchain on x86, right?

@ardbiesheuvel
Copy link
Member

This isn't a workaround for an AArch64 quirk though. This is an operation that happens not to be mandatory on x86. And while the commit message talks about AArch64, likelihood is this will apply to any new architecture (including RiscV64, Loongarch) added.

I understand the argument that this isn't the kind of thing Pcds are for. And indeed, it would be at best ugly to duplicate Pcd definitions (with different defaults based on architecture).

From a side chat with Ard, we're thinking this readback could be replaced by a MemoryFence(), which is already going to be folded out by the toolchain on x86, right?

More context here: https://edk2.groups.io/g/devel/message/110456

I went back and forth on this, but as I see it, a MemoryFence() should be sufficient to prevent reordering of MMIO accesses occurring later in program order with the ECAM write. Even if this does not strictly guarantee completion of the access when the instruction is retired, I'm not convinced it makes a meaningful difference in practice, given that all subsequent memory accesses (not just MMIO ones) will be deferred until after the MemoryFence().

TL; DR the following should be sufficient here

  if (!Read) {
    MemoryFence ();
  }

and as Leif points out, LTO will elide all of this on architectures where MemoryFence() does nothing.

@lgao4
Copy link
Contributor

lgao4 commented Sep 13, 2024

This isn't a workaround for an AArch64 quirk though. This is an operation that happens not to be mandatory on x86. And while the commit message talks about AArch64, likelihood is this will apply to any new architecture (including RiscV64, Loongarch) added.
I understand the argument that this isn't the kind of thing Pcds are for. And indeed, it would be at best ugly to duplicate Pcd definitions (with different defaults based on architecture).
From a side chat with Ard, we're thinking this readback could be replaced by a MemoryFence(), which is already going to be folded out by the toolchain on x86, right?

More context here: https://edk2.groups.io/g/devel/message/110456

I went back and forth on this, but as I see it, a MemoryFence() should be sufficient to prevent reordering of MMIO accesses occurring later in program order with the ECAM write. Even if this does not strictly guarantee completion of the access when the instruction is retired, I'm not convinced it makes a meaningful difference in practice, given that all subsequent memory accesses (not just MMIO ones) will be deferred until after the MemoryFence().

TL; DR the following should be sufficient here

  if (!Read) {
    MemoryFence ();
  }

and as Leif points out, LTO will elide all of this on architectures where MemoryFence() does nothing.

If MemoryFence is used, I think it is fine.

@ardbiesheuvel ardbiesheuvel force-pushed the personal/apop5/readafterlastwrite branch from d83bd36 to 914c758 Compare September 13, 2024 14:40
@ardbiesheuvel
Copy link
Member

If MemoryFence is used, I think it is fine.

Thanks. I've taken the liberty to update the PR accordingly.

@apop5
Copy link
Contributor Author

apop5 commented Sep 13, 2024

If MemoryFence is used, I think it is fine.

Thanks. I've taken the liberty to update the PR accordingly.

Thank you Ard, for keeping the discussion going, and for pushing an update

@mdkinney
Copy link
Member

There are 2 concepts that may apply to this issue. One is a HW mechanism such as MemoryFence(), and the other is a compiler mechanism such as _ReadWriteBarier() to prevent optimizing compilers from reordering operations. The link below shows an example were a compiler specific _ReadWriteBarrier() intrinsic had to be introduced to prevent a change in order of the I/O port
operations.

Is the root cause the need for a HW fence or a compiler reordering operations?

@ardbiesheuvel
Copy link
Member

There are 2 concepts that may apply to this issue. One is a HW mechanism such as MemoryFence(), and the other is a compiler mechanism such as _ReadWriteBarier() to prevent optimizing compilers from reordering operations. The link below shows an example were a compiler specific _ReadWriteBarrier() intrinsic had to be introduced to prevent a change in order of the I/O port operations.

Is the root cause the need for a HW fence or a compiler reordering operations?

The former. The order in which the memory accesses are issued by the CPU might be different from the order in which they arrive on the host side of the PCI host bridge. A memory fence will prevent this from happening.

@apop5 apop5 changed the title MdeModulePkg/PciHostBridgeDxe: Add readback after final Cfg-Write. MdeModulePkg/PciHostBridgeDxe: Add MemoryFence after write. Sep 19, 2024
@apop5 apop5 force-pushed the personal/apop5/readafterlastwrite branch from 914c758 to b04c372 Compare September 19, 2024 18:24
On AARCH64, there is no ordering guarantee between configuration
space (ECAM) writes and memory space reads (MMIO). ARM AMBA CHI
only guarantees ordering for reads and writes within a single address
region, however, on some systems MMIO and ECAM may be split into
separateaddress regions.

A problem may arise when an ECAM write is issued a completion before a
subsequent MMIO read is issued and receives a completion.

For example, a typical PCI software flow is the following:

1. ECAM write to device command register to enable memory space
2. MMIO read from device memory space for which access was enabled
   in step 1.

There is no guarantee that step 2. will not begin before the completion
of step 1.
on systems where ECAM/MMIO are specified as separate address regions,
even
if both spaces have the memory attributes device-nGnRnE.

- Add a barrier after the final PCI Configuration space write
in RootBridgeIoPciAccess. Configuration space reads should not have
side-efects.

- When configuration space is strongly ordered, this ensures
that program execution cannot continue until the completion
is received for the previous Cfg-Write, which may have side-effects.

- Risk of reading a "write-only" register and causing a CA which leaves
the device unresponsive. The expectation based on the PCI Base Spec
v6.1 section 7.4 is that all PCI Spec-defined registers will be readable,
however, there may exist design-specific registers that fall into
this category.

Signed-off-by: Aaron Pop <[email protected]>

Co-authored-by: Ard Biesheuvel <[email protected]>
@apop5 apop5 force-pushed the personal/apop5/readafterlastwrite branch from b04c372 to d2b199f Compare September 19, 2024 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants