Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nominal (abstract) types #4

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

Nominal (abstract) types #4

wants to merge 11 commits into from

Conversation

garrigue
Copy link

@garrigue garrigue commented Nov 4, 2019

A proposal for a uniform handling of abstract types w.r.t. GADTs and injectivity.

Cf. #9042, #5985.

@ygrek
Copy link

ygrek commented Nov 21, 2019

I have no idea about typing part but using annotations to influence type-checker seems wrong? I would expect this to be part of language syntax proper.

@garrigue
Copy link
Author

I'm not sure what the best syntax would be for this one.
In practice, the impact is more on exhaustiveness warnings than on typing itself.
I.e., this is tracked by the type system, but [@@unboxed] is tracked that way too.
(However, [@@nominal] can also generate more equations, so this is not only about warnings.)

An alternative syntax could be type t : new for anonymous nominals, and type t : new M.t for named nominals.
It would be nicer to use paths rather than arbitrary strings.
In the future, if we can get some notion of namespace in the language, one could require than when defining a fresh nominal type in an implementation, the prefix should be the name of the library.

@lpw25
Copy link
Contributor

lpw25 commented Feb 12, 2020

Copying my response from the original thread, as it still represents my thoughts on this proposal:

All in all, I'm still not convinced by this proposal as compared to your original external-style proposal. Here you are adding a new component to OCaml's notion of type, whereas before you where just adding a new form of representation. The former seems quite invasive semantically for quite a small benefit.

In particular, I'm think your proposal creates a new question that must be answered for each new type the user creates: "What should its name be?". I don't see a simple systematic way for users to answer that question and that worries me.

In my experience, GADTs are either proper GADTs -- in which case the indices are ordinary types and their compatibility doesn't really matter -- or they are poor-mans indexed types -- in which case some fake types are created to encode some form of data and the external version would be sufficient to control their compatibility. I have not yet seen real examples that need the more invasive version of your proposal.

Additionally I would like to add that I am strongly against using paths instead of arbitrary strings. Using path syntax for things that are not semantically paths and do not behave as paths (e.g. in substitution) is a recipe for confusion.

@garrigue
Copy link
Author

Thanks for putting back the feedback here.
A short answer, because I'm on holidays.

  • I have got feedback from Nomadic labs, confirming that the inability to distinguish (non-external) abstract types in GADT indices is a problem for them, and they end up adding some assert false around. So this PR answers a real need, and the external proposal is not sufficient.
  • Feedback at Inria (@Drup among others) is very opposed to using strings. I personally have no strong opinion, but using paths with an intended meaning seems an intermediate option.

This problem has been left as is for more than 5 years, so I would like to eventually get to a solution. Of course, we still have time to refine things :)

@garrigue
Copy link
Author

Concerning the syntax, my feeling is that the attribute syntax is closer to what we need here, as this information strengthens the type checker, but does not change the semantics of types.
And we have now a number of interpreted attributes on type definitions in the compiler.
The drawback is however that attributes are weak to typos. But this is not specific to this case.

@lpw25
Copy link
Contributor

lpw25 commented Feb 14, 2020

Could we see the examples from Nomadic labs? I think the details of such examples are important to the discussion.

It would also help if someone who thinks that these labels are somehow paths instead of strings could give their reasoning.

Fundamentally, I think this proposal needs to answer this question: how do users answer the question "What should its name be?" every time they create a new time?

This problem has been left as is for more than 5 years, so I would like to eventually get to a solution.

Maybe we should consider more thorough solutions like tracking whether types are freshly created or not. It probably "just" requires using a bunched context for modules to track sharing.

@garrigue
Copy link
Author

garrigue commented Apr 6, 2020

Sorry, going to Japan and back and all that delayed me a lot.

I thought a bit about one of the main applications, which is allowing to define witnesses for parametric abstract types, but I could actually find a workaround in that case. Whether it should be described as practical is another question.

Here is the code. The idea is to define a new abstract type Vec.t, which cannot be guaranteed to be injective, and then find a way to make type witnesses that allow to unwrap dynamic values of type int Vec.t Vec.t. Note that if we attempt to define witnesses in the usual way, i.e. type _ ty = Vec : 'a ty -> 'a Vec.t ty, the type definition would not be accepted.

module Vec : sig
  type +'a t
  val make : int -> (int -> 'a) -> 'a t
  val get : 'a t -> int -> 'a
end = struct
  type 'a t = Obj.t array
  let make = Obj.magic Array.init
  let get = Obj.magic Array.get
end

type ('a,'b) vec = Vec of 'a * 'b

type (_,_) ty =
  | Int : (int,int) ty
  | Fun : ('a,'c) ty * ('b,'d) ty -> ('a -> 'b,'c -> 'd) ty
  | Vec : ('a,'b) ty -> ('a Vec.t, ('a,'b) vec) ty

type dyn = Dyn : ('a,_) ty * 'a -> dyn

type (_,_) eq = Refl : ('a,'a) eq

let rec eq_ty : type a b c d.
    (a,c) ty -> (b,d) ty -> ((a,b) eq * (c,d) eq) option =
  fun t1 t2 -> match t1, t2 with
  | Int, Int -> Some (Refl, Refl)
  | Fun (t11, t12), Fun (t21, t22) ->
      begin match eq_ty t11 t21, eq_ty t12 t22 with
      | Some (Refl, Refl), Some (Refl, Refl) -> Some (Refl, Refl)
      | _ -> None
      end
  | Vec t1, Vec t2 ->
      begin match eq_ty t1 t2 with
      | Some (Refl, Refl) -> Some (Refl, Refl)
      | None -> None
      end
  | _ -> None

let undyn : type a b. (a,b) ty -> dyn -> a option =
  fun t1 (Dyn (t2, v)) ->
    match eq_ty t1 t2 with
    | Some (Refl, _) -> Some v
    | None -> None

let v = Vec.make 3 (fun n -> Vec.make n (fun m -> (m*n)))

let d = Dyn (Vec (Vec Int), v)

let Some v' = undyn (Vec (Vec Int)) d

This seems to work for this example, but I'm curious whether this solution is generic enough in practice.

@garrigue
Copy link
Author

garrigue commented Apr 8, 2020

Here is another version using Oleg's workaround of reifying the equation.
I think this is weaker, but I must look for another example.

type (_,_) eq = Refl : ('a,'a) eq

type _ ty =
  | Int : int ty
  | Fun : 'a ty * 'b ty -> ('a -> 'b) ty
  | Vec : 'a ty * ('a Vec.t, 'b) eq -> 'b ty

type dyn = Dyn : 'a ty * 'a -> dyn

let rec eq_ty : type a b. a ty -> b ty -> (a,b) eq option =
  fun t1 t2 -> match t1, t2 with
  | Int, Int -> Some Refl
  | Fun (t11, t12), Fun (t21, t22) ->
      begin match eq_ty t11 t21, eq_ty t12 t22 with
      | Some Refl, Some Refl -> Some Refl
      | _ -> None
      end
  | Vec (t1, Refl), Vec (t2, Refl) ->
      begin match eq_ty t1 t2 with
      | Some Refl -> Some Refl
      | None -> None
      end
  | _ -> None

let undyn : type a. a ty -> dyn -> a option =
  fun t1 (Dyn (t2, v)) ->
    match eq_ty t1 t2 with
    | Some Refl -> Some v
    | None -> None

let v = Vec.make 3 (fun n -> Vec.make n (fun m -> (m*n)))

let int_vec_vec = Vec (Vec (Int,Refl), Refl)

let d = Dyn (int_vec_vec, v)

let Some v' = undyn int_vec_vec d

@garrigue
Copy link
Author

Following @gasche 's suggestion, I'm going to use ! for injective type parameters, in an independent prototype. Except if @Drup has a problem with this...

@garrigue
Copy link
Author

PR#9500 proposes an alternative solution to this RFC, through injectivity annotations on type parameters. Its implementation is simpler, and it is finer grain, but this does not allow to distinguish abstract types which cannot have the same implementation. Even for injectivity, they are in some way complementary:

  • injectivity annotations can handle the injectivity of non-nominal types
  • nominal types do not only say that a type constructor is injective in all its parameters, but also that one can unify parameters of two nominal types independently of their type constructor (i.e. they are matchable, in Haskell speak IIRC).

@lpw25
Copy link
Contributor

lpw25 commented Mar 8, 2021

I've read through the ml workshop abstract and the relevant parts of the related Haskell work. My current thoughts on the proposal are as follows.

Firstly, I think that contractiveness is useful to express for non-generative types, especially polymorphic variants and object types. It is also closer to injectivity than generativity, being a property of a type parameter rather than a whole type constructor. I think it would be worth having e.g.:

type ^'a t

for a contractive type definition.

Matchability in OCaml amounts to being a datatype and it is a property of the whole type constructor -- essentially a different arrow in the kind language as in the Haskell work. Personally, I'd propose making that very explicit and having a syntax like:

datatype 'a t

to represent a matchable type. Since all datatype are injective and contractive that would imply ! and ^. Matchability is sufficient for inferring higher-kinded types and does not really have to relate to separability, so I'd avoid making the connection to the "nominal" part of your proposal used for separability.

I remain somewhat unconvinced of the need for expressing the separability of datatypes that have a definition. I can see the need for creating fresh separate types, and I'd support adding syntax like:

type 'a t = external "file handle"

where external "file handle" is a datatype "kind" that is known to be separate from any other kind of data type.

I'm not strongly against having the ability to attach a name to a type in order to force it to be separate, but I still haven't seen convincing examples of the need for it, and remain a bit worried about how a library author is to decide whether to give their type a name or not. Perhaps using attribute syntax to add the feature as you propose and leaving it marked "experimental" in the "extensions" section of the manual for a while would allow people to try it out and find some good examples for it.

@garrigue
Copy link
Author

garrigue commented Mar 9, 2021

Thank you for your feedback. This is indeed the alternative approach I had in mind.
I am a bit concerned that providing each feature independently would be both painful to track and confusing to the user. On the other hand, it is correct that separability being optional in the proposal, it may make sense to view it as an independent attribute (it might even give rise to a stronger handling, such as saying that a type is not something, for instance not a float). Contractiveness is a bit more dubious, as it is a direct consequence of both matchability and separability, and handling it independently for each argument seems an overkill.

@lpw25
Copy link
Contributor

lpw25 commented Mar 9, 2021

separability being optional in the proposal, it may make sense to view it as an independent attribute (it might even give rise to a stronger handling, such as saying that a type is not something, for instance not a float)

Yeah, there seems to be quite a large potential design space for ways of controlling and exposing separability between types.

Contractiveness is a bit more dubious, as it is a direct consequence of both matchability and separability, and handling it independently for each argument seems an overkill.

It's true that matchability implies that a type is a datatype which implies that it is contractive. However, not all contractive types are matchable. In particular, polymorphic variants and objects are contractive but not matchable, and I think people using these types are much more likely to run into problems with contractiveness anyway. Certainly I have tried in the past to use functors to construct recursive polymorphic variant types, and found the contractiveness restrictions to be a real pain for that.

I think that as long as declaring a type matchable implicitly declares it contractive (and injective) in all its parameters then the contractiveness annotations (and indeed the injectivity annotations) will only be needed very rarely and will not confuse people too much.

@gasche
Copy link
Member

gasche commented Mar 9, 2021

Minor point: instead of having a zoo of weird non-variance signs that we always never see in code and are not self-discoverable (good luck googling for ^'a), I think that this is a case where some attribute/extension grammar with human words could make it much easier for the readers.

@lpw25
Copy link
Contributor

lpw25 commented Mar 9, 2021

I think that is a reasonable argument for using a (contextual?) keyword rather than a symbol, but I dislike using attributes and extension nodes for genuine language features. Half the point of attributes and extension nodes was to make it clear when you were using an extension or extra-linguistic feature.

@alainfrisch
Copy link
Contributor

I agree with @lpw25 , we should not abuse attributes for actual linguistic features. I think it's ok for backend-related stuff, but the use of attributes even for constructor unboxing is already in the gray-area (removing one annotation can break type checking).

We have used several times (cryptic) operator symbols to avoid introducing actual keywords (e.g. for method override, shadowing open). Adding reserved keywords is always tricky, but one could rely on identifiers with some custom syntactic delimiters (e.g. method(override) foo = ... instead of method! foo = ..). Such modifiers would be lexed as LIDENTs, and analysed in the parser (i.e. a clean representation in the Parsetree), yielding syntax errors for unknown modifiers. We could do something like that for type definitions as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants