Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preliminary support for scheduling enclaves in the C target #2104

Open
wants to merge 140 commits into
base: master
Choose a base branch
from

Conversation

erlingrj
Copy link
Collaborator

@erlingrj erlingrj commented Nov 14, 2023

This PR adds preliminary support for scheduling enclaves in the C target. This a new attempt at a simpler AST transformation based on only replacing the connections between enclaves with generated EnclavedConnection reactors. This is modeled after the DelayedConnections and PhysicalConnections in the C target.

Limitations:

  1. No zero-delay cycles. Enclaves do not coordinate with PTAGs yet
  2. No banks of enclaves
  3. No enclaves with multiport
  4. No enslaved connections using either broadcast connections or connections with multiple ports on each side of the connection operator.

Corresponding reactor-c PR: lf-lang/reactor-c#308

Copy link
Collaborator Author

@erlingrj erlingrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback Edward, I have followed your suggestions and improved the code quality. Now we just need to verify that it didn't break CI. Also if you could have a look at the changes in reactor-c it would be great. Then we can finally (after more than a year...) merge this

core/src/main/java/org/lflang/generator/c/CGenerator.java Outdated Show resolved Hide resolved
core/src/main/java/org/lflang/generator/c/CGenerator.java Outdated Show resolved Hide resolved
core/src/main/java/org/lflang/ast/ASTUtils.java Outdated Show resolved Hide resolved
core/src/main/java/org/lflang/ast/ASTUtils.java Outdated Show resolved Hide resolved
@erlingrj
Copy link
Collaborator Author

Hmmmm, all the Enclave test programs crashes on Windows with a STATUS_ACCESS_VIOLATION which means "Reading or writing to an inaccessible memory location."

@edwardalee
Copy link
Collaborator

edwardalee commented Dec 16, 2023

Hmmmm, all the Enclave test programs crashes on Windows with a STATUS_ACCESS_VIOLATION which means "Reading or writing to an inaccessible memory location."

I have found a number of errors in the termination function w.r.t. enclaves. I am fixing these in the remove-absent-messages branch. Not sure whether they are related...

Actually, I take it back. These were not errors...

@erlingrj
Copy link
Collaborator Author

I have experienced that MSVC is more strict than GCC and that running C programs with undefined behaviour leads to crashes on Windows and not Linux. My hunch here is that there is some issue in rti_common.c introduced/exposed recently, I didn't see this error until I merged this branch with the latest changes on main. Before the enclave support, rti_common.c was never compiled for Windows since it was only part of the RTI which is only compiled for macOS and Linux.

Is it possible that the issues we are having on remove-absent-messages are caused by the same potential coding error which results in undefined behaviour?

@edwardalee
Copy link
Collaborator

Is it possible that the issues we are having on remove-absent-messages are caused by the same potential coding error which results in undefined behaviour?

Yes, this is very possible.

@edwardalee
Copy link
Collaborator

I double checked the code in the last-time merge and couldn't find anything suspicious. Unfortunately, ssh into CI doesn't work anymore, so have neither a way to see the output nor a way to replicate the errors occurring in remove-absent-messages (macOS only)... not sure how to proceed.

@erlingrj
Copy link
Collaborator Author

Tomorrow I will try to reproduce the error in this branch on a Windows VM. Hopefully I will get some info on where the memory error is, I will report back maybe these errors are related

@erlingrj
Copy link
Collaborator Author

@edwardalee this is a long shot. But I address a few memory issues in this commit: lf-lang/reactor-c@37a1249 it does affect the RTI. Malloc is used to create a new struct, one of the fields of the struct is a pointer. This field is not set explicitly to NULL and later, if it is non-NULL, it is assumed to have been allocated and is freed.

You could cherry-pick it to your branch. But, yeah, its a long shot.

@edwardalee
Copy link
Collaborator

Thanks. I've cherry picked this, but it's unlikely to make a difference because the deadlock appears to be occurring during shutdown. I do now have a lead, however. It turns out that a failure to write to a socket triggers a SIGPIPE signal which, by default, shuts down the program. However, shutting down the program causes a termination function to be invoked, which tries to acquire a mutex lock that is used to protect writes to a socket. This could lead to a deadlock. I'm trying two experiments now: First, see whether ignoring the SIGPIPE signal prevents the problem. If this fails (each experiment takes a while), then I will try avoiding acquiring the mutex lock in the terminate_execution function. This could, however, result in a corrupted resign message to the RTI, so I'm hoping the first solution will work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c Related to C target feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants