Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interrupts on RISC-V #847

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Interrupts on RISC-V #847

wants to merge 7 commits into from

Conversation

thejpster
Copy link
Member

) Adds an Xh3irq controller driver
) Adds a default MachineExternal handler which uses the Xh3irq to run whichever interrupts are pending
) Adds a bunch of interrupt control stuff to hal::arch
) Fixes the powman_test example to use the new interrupt APIs - it now works on RISC-V and Arm.

Tested powman_test on a Pico 2 in both riscv32imac-unknown-none-elf and thumbv8m.main-none-eabihf.

1) Use the Machine Timer, not the cycle counter
2) Use the hal::arch module to control interrupts
3) Write a MachineExternal interrupt handler which asks Xh3irq which interrupt to run next, and then runs it.

Note: we lose the macros that do that static-mut hack for us. Maybe we can add that back in later.
rp235x-hal-examples/src/bin/i2c_async.rs Outdated Show resolved Hide resolved
rp235x-hal/src/xh3irq.rs Outdated Show resolved Hide resolved
@thejpster
Copy link
Member Author

We might want to block this until we've written a new interrupt macro that works like the cortex-m-rt macro and does the static mut transform and type-checks the function signature.

Because in RISC-V mode the interrupt controller is the Xh3irq. Unless its the vector-table example, which only works on Arm currently.
I switched to the ADC FIFO IRQ so that the index and the bit number would be different.
#[no_mangle]
#[allow(non_snake_case)]
unsafe fn DefaultIrqHandler() {
panic!();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cortex_m does not panic but loop {} in the default handler.
It did panic for a short while in the past, but that was reverted here: https://github.com/rust-embedded/cortex-m-rt/pull/289/files#diff-b1a35a68f14e696205874893c07fd24fdb88882b47c23cc0e0c80a30c7d53759L993

I'm not sure if the reasons for that decision also apply for RISC-V, but at least for the rp2040/rp235x case it would be nicer if both implementations did the same.

@jannic
Copy link
Member

jannic commented Sep 13, 2024

We might want to block this until we've written a new interrupt macro that works like the cortex-m-rt macro and does the static mut transform and type-checks the function signature.

As the rp235x is a multi-core processor, and given that rust-embedded/cortex-m#411 indicates that the transformation made by cortex-m-rt is unsound on multi-core systems, I don't think we should just copy it. Is there a good way to make the transform safe on rp235x? Taking a full critical section on each interrupt entry is probably too expensive?

fn IO_IRQ_BANK0() {
// The `#[interrupt]` attribute covertly converts this to `&'static mut Option<LedAndButton>`
static mut LED_AND_BUTTON: Option<LedAndButton> = None;
let led_and_button = unsafe { &mut *core::ptr::addr_of_mut!(LED_AND_BUTTON) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This &mut *core::ptr::addr_of_mut!( ... ) IMHO doesn't make sense.

The whole raison d'être of addr_of_mut! is avoiding the &mut, because having a &mut (even for a short time, and not derefencing it) can potentially cause UB if you are not careful.

But here, you convert it to a &mut anyway. So I don't see any advantage of this difficult to read incantation, you can as well just do unsafe { &mut LED_AND_BUTTON }; instead.

Now, that causes a warning. And this warning looks like it led you to writing the code above:

warning: creating a mutable reference to mutable static is discouraged
   --> src/bin/gpio_irq_example.rs:152:35
    |
152 |     let led_and_button = unsafe { &mut LED_AND_BUTTON };
    |                                   ^^^^^^^^^^^^^^^^^^^ mutable reference to mutable static
    |
    = note: for more information, see issue #114447 <https://github.com/rust-lang/rust/issues/114447>
    = note: this will be a hard error in the 2024 edition
    = note: this mutable reference has lifetime `'static`, but if the static gets accessed (read or written) by any other means, or any other reference is created, then any further use of this mutable reference is Undefined Behavior
    = note: `#[warn(static_mut_refs)]` on by default
help: use `addr_of_mut!` instead to create a raw pointer
    |
152 |     let led_and_button = unsafe { addr_of_mut!(LED_AND_BUTTON) };
    |                                   ~~~~~~~~~~~~~              +

I think the warning is valid, but the help is misleading. It even says "to create a raw pointer", and as written it would return a pointer, not a &mut, so the resulting code wouldn't build.

TL;DR:

Suggested change
let led_and_button = unsafe { &mut *core::ptr::addr_of_mut!(LED_AND_BUTTON) };
#[allow(static_mut_refs)]
// Safety: As long as this interrupt is only enabled on a single core,
// and the interrupt isn't somehow called reentrantly, there will not
// be a second reference to LED_AND_BUTTON at the same time.
let led_and_button = unsafe { &mut LED_AND_BUTTON };

@thejpster
Copy link
Member Author

So I guess the risk is that both CPUs enter the same interrupt handler at the same time. If one peripheral interrupt signal is unmasked on both CPUs, that could happen. So perhaps the fix is to design the interrupt API to make that impossible (like a bitmask in a static atomic or the watchdog scratch that records which cpu has which interrupt enabled). Or we could ensure that if you enter an interrupt on Core A, it is masked for the duration on Core B - which is nicer than turning all interrupts off or holding a spin lock.

But given 2040 has the same problem the question is whether we want to fix it here or make a plan and fix it later.

@9names
Copy link
Member

9names commented Sep 14, 2024

The user already unmasks the interrupt, and that operation is unsafe. It is almost certainly a bug if they do so on both cores, and we have had no reports of people doing this accidentally.

Until someone comes has a use-case where having both cores handle an interrupt is beneficial, I think trying to solve this is not a good use of our time.

My opinion is we should document "don't do this" and move on with our lives

@jannic
Copy link
Member

jannic commented Sep 14, 2024

But given 2040 has the same problem the question is whether we want to fix it here or make a plan and fix it later.

That's an option, I just wouldn't want to spend time on porting the macro only to remove it again later.

Until someone comes has a use-case where having both cores handle an interrupt is beneficial,

Are there situations where it's not avoidable? Any non maskable interrupts or exceptions?

@jannic
Copy link
Member

jannic commented Sep 14, 2024

But to be honest, there's a second thing I dislike about the interrupt macro, so I'm biased.

For me, it's totally non-obvious that the macro annotation changes the code inside the function.
So if I see this:

#[interrupt]
fn SOME_INTERRUPT() {
  static mut SOMETHING: Option<Whatever> = None;
  [...]
  *SOMETHING = foo(); 
}

I always feel like the usage of the variable doesn't match the declaration. This was very confusing when I was new to embedded rust, and it still sometimes surprises me.

Therefore I'd prefer something more obvious. Perhaps a macro directly on the static mut?

#[interrupt]
fn SOME_INTERRUPT() {
  let something: &mut Option<Whatever> = interrupt_singleton!( didn't think about what comes here yet :Option<Whatever> );
  [...]
  *something = foo(); 
}

Probably needs an unsafe declaration as well, as using this macro outside an interrupt (or in a re-entrant interrupt) would be unsound. So it's far from a finished proposal, of course. This is only meant as an explanation why I don't like the current interrupt macro.

Back to topic: Personally I'd merge the PR as it is, and not wait for an #[interrupt] macro implementation for RISC-V. The current situation is not perfect, but it's good enough, and it's not clear yet how an improvement would look like.

(That said, if someone ported #[interrupt] I wouldn't oppose that either. While I still don't like it, I do value consistency.)

@jannic
Copy link
Member

jannic commented Sep 14, 2024

So I guess the risk is that both CPUs enter the same interrupt handler at the same time. If one peripheral interrupt signal is unmasked on both CPUs, that could happen.

BTW, do I understand xh3irq correctly that it already solves this issue? xh3irq::get_next_interrupt() should only return a given interrupt to a single CPU core, even if the interrupt is enabled on both, right?

@jannic
Copy link
Member

jannic commented Sep 14, 2024

For the rp2040, we could use separate vector tables for core0 and core1, and let the macro create two distinct functions with their own statics, so even if an interrupt was actually running on both cores at the same time, it would not access the same data.
While this would be sound, I don't know if it would be useful: At least the common pattern of initializing the static on the first interrupt would break:

  #[interrupt]
  fn IO_IRQ_BANK0() {
      // The `#[interrupt]` attribute covertly converts this to `&'static mut Option<LedAndButton>`
      static mut LED_AND_BUTTON: Option<LedAndButton> = None;

      // This is one-time lazy initialisation. We steal the variables given to us
      // via `GLOBAL_PINS`.
      if LED_AND_BUTTON.is_none() {
          critical_section::with(|cs| {
              *LED_AND_BUTTON = GLOBAL_PINS.borrow(cs).take();
          });
      }
      [...]

The second core to run this interrupt would always get a None value, and therefore wouldn't be able to do any useful work.

@thejpster
Copy link
Member Author

So I guess the risk is that both CPUs enter the same interrupt handler at the same time. If one peripheral interrupt signal is unmasked on both CPUs, that could happen.

BTW, do I understand xh3irq correctly that it already solves this issue? xh3irq::get_next_interrupt() should only return a given interrupt to a single CPU core, even if the interrupt is enabled on both, right?

I think the UART interrupt is cleared by reading the FIFO which can happen before the ISR ends. So it could re-fire before the ISR ends. Not sure either Interrupt Controller can avoid that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants