Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add %foreign_impl pragma for augmenting ffi functions #3303

Merged
merged 2 commits into from
Jun 11, 2024

Conversation

dunhamsteve
Copy link
Contributor

Description

Currently in Idris, we can't add a new backend to an existing %foreign definition without editing the library containing the definition. This issue came up for me when I wanted to use Control.App in node and needed to stub out prim__fork for the node codegen. But it also arises when working with a third party backend.

In a discord discussion @stefan-hoeck suggested the following syntax to add additional bindings to an existing %foreign definition:

%foreign_impl Prelude.IO.prim__putStr "javascript:lambda:x=>..."

This implementation of that syntax follows the pattern of %foreign and allows an indented block of expressions that evaluate to strings. The test case uses the pragma to make Control.App work with the node codegen.

The general idea is to update the definition of the %foreign function to add the new bindings. We store the additional information in Options, accumulate it as TTC files are read in, and then update the defs in Compiler.Common. We can't directly modify the definition when the pragma is evaluated because the context is dropped after compiling the module.

Should this change go in the CHANGELOG?

  • If this is a fix, user-facing change, a compiler change, or a new paper
    implementation, I have updated CHANGELOG_NEXT.md (and potentially also
    CONTRIBUTORS.md).

@stefan-hoeck
Copy link
Contributor

Very nice! Thanks for taking the time to implement this.

Copy link
Collaborator

@mattpolzin mattpolzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice to have this. I haven’t played with it yet, but a couple of additional questions I think would be good to have answered and probably mentioned in the docs:

  1. Is redefinition of an existing backend allowed (foreign directive and foreign impl directive specifying the same backend)?
  2. What happens when two foreign impls collide (not a foreign def and new impl per (1) but rather two foreign impl directives)?

@dunhamsteve
Copy link
Contributor Author

Currently, it just aggregates the directives. The most recently loaded module is first. I was on the fence on whether this should be an error as I can see a case for overriding the declaration in the library and possibly a case for a backend wanting more than one declaration for a given name. So I took the simpler path and the handling of duplicates is left to the backend.

From auditing the code (I haven't experimented to verify):

  • The scheme and RefC backends go through the list of supported backends in order and pick the first match in the list. So it may pick the first "scheme,chez" and if there is none, it will pick the first "scheme", etc. This happens in parseCC.
    • Chez looks for: ["scheme,chez", "scheme", "C__collect_safe", "C"]
    • Racket looks for: ["scheme,racket", "scheme", "C"]
    • RefC looks for: ["RefC","C"]
  • Javascript goes through the list and picks the first one that matches any supported backend (supported backends are either ["node","javascript"] or ["browser","javascript"]). This happens in searchForeign.

There is a little bit of ambiguity here, if two TTC are loaded that have an override for the same backend, the behavior depends on the load order. I do check that the name we override/augment is a %foreign, ignoring visibility, and give an error if it is not a foreign function.

Aside from what is already implemented, if we want to detect these conflicts, we could either give a warning or an error when there is a duplicate. We could check:

  1. When processing the pragma
    • This will miss the cases where two conflicting TTC are loaded independently
  2. When loading the TTC file and merging it's foreignImpl into the context
    • This approach gets us the name of the module with the conflicting definition
  3. When merging the foreign_impl into the context in Compiler.Common.addForeignImpl
  4. When a backend processes the list of bindings
    • The backends currently choose the first match, but they could be changed
    • This allows a backend to decide what to accept
    • We don't have FC on the individual bindings

@mattpolzin
Copy link
Collaborator

mattpolzin commented Jun 9, 2024

I think this capability may be too useful not to merge even with some shortcomings (though non-determinism is particularly unfortunate). Long term, I wonder if foreign functions would be more ergonomic if an error for ambiguity could be given and a hide directive could be used to address that error.

@dunhamsteve
Copy link
Contributor Author

For now, I have added documentation for how duplicate declarations are handled.

Note that it is also possible to introduce ambiguity with the existing compiler by doing:

%foreign "javascript:foo"
         "javascript:bar"
something : Nat -> Nat

@gallais gallais merged commit 9e84b15 into idris-lang:main Jun 11, 2024
22 checks passed
@dunhamsteve dunhamsteve deleted the foreignImpl branch June 11, 2024 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants