
Problem with tag assignment for asynchronous events #33

Open · oowekyala opened this issue Oct 21, 2022 · 4 comments
@oowekyala
Collaborator

There was a bug in the C++ runtime, and it can theoretically also occur in Rust (I wasn't able to reproduce it with an unmodified runtime; it depends on thread interleaving).

Possible faulty execution

  • An async thread reads the current time and computes the tag T for its new event. The thread is parked before the event is put into the event queue.
  • The scheduler continues executing reactions (e.g. from a timer) until tag T is exceeded.
  • The async thread wakes up and pushes the event into the queue.
  • The scheduler then sees an event that was scheduled for the past. That was the bug in C++, and it would currently crash the Rust runtime with an assertion failure.
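The interleaving above can be simulated deterministically. This is a hedged sketch, not runtime code: `simulate_race` is a made-up name, and plain integers stand in for the real (time, microstep) tags.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Returns (tag of the async event, latest tag processed by the scheduler).
fn simulate_race() -> (u64, u64) {
    let (tx, rx) = mpsc::channel::<u64>();

    // "Async thread": reads the clock and computes tag T for its event...
    let producer = thread::spawn(move || {
        let tag = 5u64; // tag T, computed from the current physical time
        // ...but is parked before the event reaches the queue:
        thread::sleep(Duration::from_millis(50));
        tx.send(tag).unwrap(); // pushed only after the scheduler moved on
    });

    // Scheduler: keeps executing reactions (e.g. from a timer) past tag T.
    let mut latest_processed = 0u64;
    for timer_tag in [2u64, 4, 6, 8] {
        latest_processed = timer_tag;
    }

    producer.join().unwrap();
    let async_tag = rx.recv().unwrap();
    (async_tag, latest_processed)
}

fn main() {
    let (async_tag, latest) = simulate_race();
    // The async event now lies in the scheduler's past -- the bug described above.
    assert!(async_tag < latest);
    println!("event tag {async_tag} < latest processed tag {latest}: stale event");
}
```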

C++ fix

In C++ there is a global event queue and a global mutex protecting it. The fix is to put the time reading and the pushing of the event in the same critical section.

Rust

In Rust the event queue is split:

  • The scheduler owns the only reference to the global, sorted queue. This is where events are popped from for execution.
  • Each async thread uses a channel Sender to push events to the scheduler asynchronously. The Receiver end maintains an unsorted buffer of events that the scheduler thread periodically flushes into the main queue. Events pushed through the Sender have already been assigned a tag.

We can assume Sender/Receiver communicate atomically.
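The split described above might be sketched roughly as follows, using `std::sync::mpsc`; the `Scheduler` type and its methods are illustrative, not the actual runtime's API.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::mpsc::{channel, Receiver};

type Tag = u64; // stands in for a (time, microstep) pair

struct Scheduler {
    sorted: BinaryHeap<Reverse<Tag>>, // scheduler-owned global queue, min-first
    rx: Receiver<Tag>,                // unsorted buffer fed by async Senders
}

impl Scheduler {
    /// Drain every buffered async event into the sorted queue.
    fn flush(&mut self) {
        while let Ok(tag) = self.rx.try_recv() {
            self.sorted.push(Reverse(tag));
        }
    }

    /// Flush the buffer, then pop the earliest event.
    fn pop_next(&mut self) -> Option<Tag> {
        self.flush();
        self.sorted.pop().map(|Reverse(t)| t)
    }
}

fn demo() -> Vec<Tag> {
    let (tx, rx) = channel();
    let mut sched = Scheduler { sorted: BinaryHeap::new(), rx };
    tx.send(7).unwrap(); // async threads push pre-tagged events,
    tx.send(3).unwrap(); // without ever blocking on the scheduler
    sched.sorted.push(Reverse(5)); // e.g. a timer event already queued
    let mut order = Vec::new();
    while let Some(t) = sched.pop_next() {
        order.push(t);
    }
    order
}

fn main() {
    assert_eq!(demo(), vec![3, 5, 7]); // events come out in tag order
}
```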

Possible solutions for the Rust runtime

Global mutex

We could reproduce the C++ solution by introducing a mutex to guard the receiver and sender. However, this would defeat part of the purpose of using channels, which is that the async sender thread never has to block when sending.

Let the scheduler assign tags

Another solution would be to let the scheduler thread assign tags to asynchronous events. There are several possible problems with this:

  • This relies on the assumption that reaction execution times are negligible. A long-running reaction could significantly delay the tag assignment for an asynchronous event, which would compromise the runtime's real-time capabilities; the lag can, however, be measured and reported.
  • Async events would be "bucketed" into fewer tags than if they were assigned tags asynchronously. This could make more events simultaneous than necessary.

Mixed solution

We could use the asynchronously assigned tag as long as it is greater than the latest processed tag. If it isn't, we're in the problematic situation described above and can do one of the following:

  • crash,
  • drop the event and go on, or
  • assign the latest processed tag plus one microstep and go on,

reporting to the user in any case that something went wrong.

None of these looks very appealing in the general case. Maybe the behavior should be selectable:

  • globally, e.g. with a compile-time feature flag, or
  • per individual action, with an annotation in the source.
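The mixed approach could be sketched like this; `Tag`, `Policy`, and `admit` are hypothetical names for illustration, not the runtime's API.

```rust
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Tag {
    time: u64,
    microstep: u32, // derived Ord compares (time, microstep) lexicographically
}

enum Policy {
    Crash,
    Drop,
    Adjust,
}

/// Decide what happens when an asynchronously tagged event reaches the
/// scheduler; `latest` is the latest processed tag. Returns the tag to use,
/// or None if the event is dropped.
fn admit(event_tag: Tag, latest: Tag, policy: Policy) -> Option<Tag> {
    if event_tag > latest {
        return Some(event_tag); // common case: the tag is still in the future
    }
    // The tag is in the past: apply the configured policy (and report it).
    match policy {
        Policy::Crash => panic!("async event scheduled in the past"),
        Policy::Drop => None,
        Policy::Adjust => Some(Tag { time: latest.time, microstep: latest.microstep + 1 }),
    }
}

fn main() {
    let latest = Tag { time: 10, microstep: 0 };
    let stale = Tag { time: 8, microstep: 0 };
    assert_eq!(admit(stale, latest, Policy::Drop), None);
    assert_eq!(
        admit(stale, latest, Policy::Adjust),
        Some(Tag { time: 10, microstep: 1 })
    );
    // A tag that is still in the future is used as-is under any policy.
    assert_eq!(
        admit(Tag { time: 12, microstep: 0 }, latest, Policy::Crash),
        Some(Tag { time: 12, microstep: 0 })
    );
}
```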
@lhstrh
Member

lhstrh commented Oct 21, 2022

I think reassigning the tag of the new event is the only reasonable option. We should think of it as a transaction: if the race occurs and the tag of the new event is wrong, we roll back, get a new tag, and attempt to insert it again.
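A minimal sketch of this retry-as-transaction idea, with the clock, the latest-processed-tag check, and the commit abstracted as closures; all names are hypothetical.

```rust
/// Repeatedly take a fresh tag until it is ahead of the scheduler, then commit.
fn insert_with_retry(
    mut now: impl FnMut() -> u64,       // physical clock, yielding candidate tags
    latest_processed: impl Fn() -> u64, // the scheduler's latest processed tag
    mut push: impl FnMut(u64),          // commit: insert into the event queue
) -> u64 {
    loop {
        let tag = now();
        // Commit only if the tag is still ahead of the scheduler.
        if tag > latest_processed() {
            push(tag);
            return tag;
        }
        // Otherwise "roll back" (nothing was pushed) and retry with a new tag.
    }
}

fn demo() -> (u64, Vec<u64>) {
    let mut t = 3u64;
    let mut queue = Vec::new();
    // Clock yields 5, 7, 9, ...; the scheduler has already processed tag 6,
    // so the first attempt (tag 5) is stale and the retry commits tag 7.
    let tag = insert_with_retry(|| { t += 2; t }, || 6, |x| queue.push(x));
    (tag, queue)
}

fn main() {
    assert_eq!(demo(), (7, vec![7]));
}
```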

@lhstrh
Member

lhstrh commented Oct 21, 2022

Note that whatever tag is obtained for the scheduled physical action is uncertain, anyway.

@oowekyala
Collaborator Author

Ok, I'll implement this.

For the record, I could not reproduce the bug without adding a thread::sleep in the middle of the critical section, in the runtime's code (not the LF program's). I suspect this bug is mostly theoretical...

@lhstrh
Member

lhstrh commented Nov 7, 2022

These kinds of bugs are load-dependent and might surface only rarely, yet I wouldn't call them theoretical, because that wrongly suggests they cannot really happen in deployment.
