Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tutorial on Polymorphic Variants #1531

Closed
wants to merge 48 commits into from
Closed

Tutorial on Polymorphic Variants #1531

wants to merge 48 commits into from

Conversation

cuihtlauac
Copy link
Collaborator

@cuihtlauac cuihtlauac commented Sep 20, 2023

Polymorphic variants deserve a tutorial supplementing documentation from the manual and gathering information spread in the mailing list, discuss and blog posts.

Here is the outline:

  1. Introduction and historical notes
  2. A First Example
  3. Exact, Closed and Open Variants
  4. Subtyping
  5. Named Polymorphic Variants
  6. Advanced: Alised, Parametrized and Recursive Polymorphic Variabnts
  7. Advanced: Combining Polymorphic Variants and Simple Variants
  8. Uses-Cases
  9. Drawbacks
  10. When to Use Polymorphic Variants
  11. Conclusions & References

@christinerose
Copy link
Collaborator

christinerose commented Oct 2, 2023

Apologies, this commit ^^ is through li. 65.


**Prerequisites**: This is an intermediate-level tutorial. It is required to have completed tutorials on [Functions and Values](/docs/functions-and-values), [Basic Data Types](/docs/basic-data-types), and [Lists](/docs/lists) to begin this one.

### Origin and Context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this bits of trivia <3 but I'd put them all the way at the end of the tutorial. Almost an appendix for those that are interested in deepening the context in which polyvars where created.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Your feedback on this is consistent with the discuss thread; it will go at the end of the document.

- Choose to use polymorphic variants when really needed
-->

### A Note on Simple Variants and Polymorphism
Copy link
Contributor

@leostera leostera Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the learning goals for this page, I'm not sure if this clarification helps me write/use/assess polyvars at all.

Feels like a "nice to know" but not a "must know" 🤔 – maybe it would fit better near ## When to Use Polymorphic Variant?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, if this section belongs to this tutorial, it is not at its place.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion makes sense too. I'll do that.


However, they are type-checked using different algorithms, which results in a different programming experience. The relationship between value and type (written with the colon symbol `:`) is changed with polymorphic variants. Usually, values are thought of as inhabitants of the type, which is regarded as a set-like thing. Rather, polymorphic variant values should be considered as pieces of data that several functions can accept. Polymorphic variants types are a way to express compatibility relationships between those functions. The approach in this tutorial is to build sense from experience using features of polymorphic variants.

**Prerequisites**: This is an intermediate-level tutorial. It is required to have completed tutorials on [Functions and Values](/docs/functions-and-values), [Basic Data Types](/docs/basic-data-types), and [Lists](/docs/lists) to begin this one.
Copy link
Contributor

@leostera leostera Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love prerequisites! ✨ I think I'd phrase it around skills/concepts rather than completed tutorials, and it could be placed right below the main title or in a dedicated section so the very first thing I notice is whether I need to learn something before digging into this tutorial.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hard to reach a consensus on this. Some want to have prerequisites as the very first thing. Others prefer to start with the learning goals. Ideally, both could be turned into metadata. That would allow more rendering options.


This tutorial teaches you how to use polymorphic variants. This includes starting to use them, maintaining a project already using them, deciding when to use them or not, and balancing their unique benefits against their drawbacks.

Product types and data types such as `option` and `list` are variants and polymorphic. In this tutorial, they are called _simple variants_ to distinguish them from the _polymorphic variants_ presented here. Simple variants and polymorphic variants are close siblings. Their values are both introduced using labels that may carry data. Both can be recursive and have type parameters. By the way, don't trust a [LLM](https://en.wikipedia.org/wiki/Large_language_model) chatbot if it tells you polymorphic variants are dynamically typed; it is a hallucination. Like simple variants, polymorphic variants are type-checked statically.
Copy link
Contributor

@leostera leostera Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rather than preambling around LLMs or whether other kinds of types are also polymorphic (but meaning something different), I'd just immediately start building an intuition:

  1. what is a polymorphic variant,
  2. why would I want to use them,
  3. and how to identify them.
A polymorphic variant is a more flexible kind of variant type that does not need
to be defined upfront, and is useful for working with groups of constructors
(we call them _tags_) that can grow or shrink.

When they are used to create values, you'll see them prefixed with a backtick:
`` `hello_world ``.

When they are used in type-signatures, you'll see them wrapped in square brackets:
 `` [ `hello_world ] ``.

The rest of the tutorial can then dig into these facets and make sure we go from an intuition to a proper understanding, including things like its origins or why the relationship between value and type may be different.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems consistent with the feedback in the discuss thread. A differently toned introduction is demanded. Let's write it.


## A First Example

### Creating Polymorphic Variant Types From Pattern Matching
Copy link
Contributor

@leostera leostera Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An even smaller example we could use here is a list. I think it could cement the intuition that polyvars are closer to the data itself than to defined types:

# let my_favorite_fruits = [ `banana ];;
val my_favorite_fruits : [> `banana ] list

# let my_favorite_fruits = [ `banana; `apple ];;
val my_favorite_fruits : [> `apple | `banana ] list

# let my_favorite_fruits = [ `banana; `apple; `kiwi ];;
val my_favorite_fruits : [> `apple | `banana | `kiwi ] list

You could easily go from there to using polyvars that carry data:

# let my_favorite_fruits = [ `banana; `apple `green; `orange `blood; ];;
val my_favorite_fruits :
  [> `apple of [> `green ] | `banana | `orange of [> `blood ] ] list

Copy link
Contributor

@leostera leostera Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the path I'd take when explaining them, starting from the smallest possible example and growing one step at a time:

  1. `banana is of type [> `banana ]

  2. if you put it together with other things that are their own type (`apple) then they form a larger type ([> `apple | `banana ])

  3. if you use an argument like a polyvar in a function, then the input type gets formed by its usage:

# let f `hello = `world ;;
val f : [< `hello ] -> [> `world ] = <fun>
  1. then we can explain > and <, the exact polyvar type, and what combinations like [> `name of string < `age of int ] mean

  2. then we can introduce sharing

# let say_hi (`name name) = "hello, " ^ name;;
val say_hi : [< `name of string ] -> string

# let say_bye (`name name) = "goodbye, " ^ name;;
val say_bye : [< `name of string ] -> string
  1. and upfront definitions to make the reuse cleaner
type name = [ `name of string ]
  1. and now that you have a polyvar defined upfront, you can introduce composition
type name = [ `name of string ]
type age = [ `age of int ]
type attributes = [ name | age | `other of (string, string) ]
  1. profit!


Note each of the above translations into simple variants is correct. However, entering them as-is into the environment would lead to constructor shadowing (unless type annotation is used at pattern or expression level, see section on [Shared Constructors](#Shared-Constructors)).

This also illustrates the other striking feature of polymorphic variants: values can be attached to several types. For instance, the tag `` `Broccoli`` inhabits ``[ `Broccoli ]`` and``[ `Broccoli | `Fruit of String ]``, but it also inhabits any type that contains it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"but it also inhabits any type that contains it."

This is going to raise eyebrows. Inhabits alone is a term that confused the hell out of me for a long time.

I think if we build the intuition that `Broccoli is a value that belongs to the type [ > `Broccoli ] then it would make more sense that anywhere that we see `Broccoli in the type, then `Broccoli is a valid value.

Of course there's the case where `Broccoli belongs to [ > `Apple ] because of > but I'd leave > until we talk about polyvars as function inputs/outputs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. The terminology I've used here is clumsy. This is needs to be rewritten.


This also illustrates the other striking feature of polymorphic variants: values can be attached to several types. For instance, the tag `` `Broccoli`` inhabits ``[ `Broccoli ]`` and``[ `Broccoli | `Fruit of String ]``, but it also inhabits any type that contains it.

What is displayed by the type-checker, for instance ``[< `Broccoli | `Fruit of string ]``, isn't a single type. It is a type expression that designates a constrained set of types. For instance, all the types that are defined by a group of tags containing either`` `Broccoli`` or `` `Fruit of string`` and nothing more. This is the meaning of the `<` sign in this type expression. This is a bit similar to what happens with `'a list`, which isn't a single type either, but a type expression that designates the set of `list` types of something, i.e. the set of types where `'a` has been replaced by some other type. It is also meant to indicate that the exact types are subsets of the indicated ones (this will be explained in the section on [Subtyping](#Subtyping-of-Polymorphic-Variants)).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"isn't a single type"

Not entirely sure if the idea that "in OCaml every value has a single type" (principality) has been hammered in by now in the tutorials, but if it has then this would be weird to read.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right again. This needs to be said differently.

# let g = function `Edible name -> name | `Broccoli -> "Brassica oleracea";;
val g : [< `Broccoli | `Edible of string ] -> string = <fun>

# f `Broccoli;;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider that if I jump into this section, I don't know where f came from or what it is. Keeping it self-contained should be okay since these examples are super small.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. The document does not state explicitly examples are meant to be played in order and jumping into a section does not work. This needs to be clarified.


Polymorphic variant tags are meant to be used as stand-alone values, wherever it makes sense. As long as used consistently, with a single implicit type per tag, the type checker will accept any combination of them in pattern-matching expressions.

### Static Type-Checking
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section seems unnecessary if the remarks of the LLM hallucinating are removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, it is just a remark.

Product types and data types such as `option` and `list` are variants and polymorphic. In this tutorial, they are called _simple variants_ to distinguish them from the _polymorphic variants_ presented here. Simple variants and polymorphic variants are close siblings. Their values are both introduced using labels that may carry data. Both can be recursive and have type parameters. By the way, don't trust a [LLM](https://en.wikipedia.org/wiki/Large_language_model) chatbot if it tells you polymorphic variants are dynamically typed; it is a hallucination. Like simple variants, polymorphic variants are type-checked statically.

However, they are type-checked using different algorithms, which results in a different programming experience. The relationship between value and type (written with the colon symbol `:`) is changed with polymorphic variants. Usually, values are thought of as inhabitants of the type, which is regarded as a set-like thing. Rather, polymorphic variant values should be considered as pieces of data that several functions can accept. Polymorphic variants types are a way to express compatibility relationships between those functions. The approach in this tutorial is to build sense from experience using features of polymorphic variants.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this paragraph is trying to explain.

First, since typecheckers are not the subject on the article, bringing type-checker algorithms only encourage the idea that you can only understand OCaml language if you are a type system expert .

Second, polymorphic variants are using the same type-checking algorithm that the rest of the language.

Third, None already belongs to multiple type, there is nothing special at this level with respect to the row type variable of objects or polymorphic variable.

I advocate to completely remove this paragraph.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of an introduction that would list the practical features of polymorphic variants, from a developer's perspective. I agree it would be friendlier. What do you think of this ?

  • Optional declaration
  • Tag sharing
  • Hash patterns

That creates a different, and probably better, learning track. It implies a large refactoring of the rest of the text, but I agree it's worth the effort.

I'm concerned about the relationship with simple variants. Some newcomers are puzzled by the resemblance-difference with polymorphic variants. Plenty of questions arise in relation to that.


### Origin and Context

Polymorphic variants originate from Jacques Garrigue's work on Objective Label, which was [first published in 1996](https://caml.inria.fr/pub/old_caml_site/caml-list-ar/0533.html). It became part of standard Objective Caml with [release 3.0](https://caml.inria.fr/distrib/ocaml-3.00/) in 2000, along with labelled and optional function arguments. They were introduced to give more precise types in [LablTk](https://garrigue.github.io/labltk/).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph should probably be a footnote except if the expected audience of the tutorial are student in a Programming Language class.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. This is consistent with feedback in the discuss thread.

1. If typing inconsistencies are found, an error is raised.

This is very similar to solving an equation in mathematics. Equations accept either zero, exactly one, several, or infinitely many numbers as solutions. Nominal type-checking finds that either zero, exactly one, or any type can be used in an expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misleading, consider None.

Copy link
Collaborator Author

@cuihtlauac cuihtlauac Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. If we downplay the type-checking algorithm story, this will likely disappear.

This is very similar to solving an equation in mathematics. Equations accept either zero, exactly one, several, or infinitely many numbers as solutions. Nominal type-checking finds that either zero, exactly one, or any type can be used in an expression.

In the structural approach of typing, type definitions are optional, so they can be omitted. Type-checking an expression constructs a data structure that represents the types that are compatible with it. These data structures are displayed as type expressions sharing a resemblance with simple variants.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplest value with a structural in OCaml is probably let id x = x which doesn't fit this paragraph.

I am also still confused why we are speaking of a data structure that represent types? Why are the type system implementation details the focus of this paragraph?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.


## When to Use Polymorphic Variant

Only using polymorphic variants is over-engineering, it should be avoided. Polymorphic variants aren't an extension of simple variants. From the data perspective, polymorphic variants and simple variants are equivalent. They both are sum types in the algebraic data types sense, both with parametric polymorphism and recursion. Back quote isn't a difference that matters. The only meaningful difference is the algorithm. Therefore, deciding when to use polymorphic variants boils down to another question:
Copy link
Contributor

@leostera leostera Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Only using polymorphic variants is over-engineering,"

I also don't recommend people to only use polyvars but I think we can do better and explain why.

For example, the type errors can get VERY out of hand.

Also not sure why reiterating the whole "not an extension, but equivalent" is relevant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example code triggering such type errors? I may be useful.

"not an extension, but equivalent": Yes, and that discussion will by downsized and what's left will move to the end of the document.


Function `f` accepts tags `` `Broccoli`` and `` `Fruit`` whilst `g` accepts `` `Broccoli`` and `` `Edible``. But if `f` and `g` are stored in a list, they must have the same type. That forces their domain to be restricted to a common subtype. Although `f` can handle the tag `` `Fruit``, it no longer accepts that parameter when `f` is extracted from the list.

### Weaken Type-Checking and Harder Debugging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you write "weaken type-checking" are you referring to having both a Gray and Grey tags? I assumed this was meant to be an example of an accident/typo, but I don't think its fair to call it "weakened type-checking".

Like the compiler isn't going to mistake Grey for Gray.

Copy link
Collaborator Author

@cuihtlauac cuihtlauac Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the compiler isn't going to mistake Grey for Gray.

But it would if it had been simple variants. That's the point. But I didn't say it explicitly, so that's on me.


To conclude, remember a simple variant that naturally encodes some data should remain a simple variant.

## Conclusion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a Conclusion i'd have a section that summarizes what was learned in this tutorial:

  • We learned how to create new polymorphic variant tags
  • We learned how to to constraint them
  • We learned how to reuse tags
  • We learned how to use polymorphic variants for flexible error handling
  • etc

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's simpler.

Copy link
Contributor

@reynir reynir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found some typos in the OCaml code.

data/tutorials/language/0lg_10_polymorphic_variants.md Outdated Show resolved Hide resolved
data/tutorials/language/0lg_10_polymorphic_variants.md Outdated Show resolved Hide resolved
val nth : 'a list -> int -> ('a, nth_error) result = <fun>

# let f_res m n =
let* u = init m Fun.id in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think let* is not defined in this snippet, and when defining it in the snippet below it may be worth linking to some text about the let* syntax(?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, the definition of let should move up. Links are also needed you are right

Cuihtlauac ALVARADO added 3 commits November 27, 2023 16:36
Addresses predefined types, non mutable records, non mutually recursive
and non generalized algebraic datatypes.
sabine added a commit that referenced this pull request Dec 21, 2023
sabine added a commit that referenced this pull request Dec 22, 2023
sabine added a commit that referenced this pull request Dec 27, 2023
sabine added a commit that referenced this pull request Jan 3, 2024
sabine added a commit that referenced this pull request Jan 10, 2024
sabine added a commit that referenced this pull request Jan 11, 2024
sabine added a commit that referenced this pull request Jan 11, 2024
sabine added a commit that referenced this pull request Jan 12, 2024
sabine added a commit that referenced this pull request Jan 12, 2024
sabine added a commit that referenced this pull request Jan 16, 2024
sabine added a commit that referenced this pull request Jan 18, 2024
sabine added a commit that referenced this pull request Jan 25, 2024
sabine added a commit that referenced this pull request Jan 30, 2024
@tmattio
Copy link
Collaborator

tmattio commented Oct 1, 2024

I'm closing this as stalled. I've heard that a rewrite was in progress somewhere, but in the meantime, I'll close this. Feel free to re-open or open a new PR if you resume work on this.

@tmattio tmattio closed this Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Polymorphic Variants
6 participants