From 941a205abd2c2e0c061fa4ee7a9bac3f321163d5 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Wed, 20 Dec 2023 14:43:35 +0100 Subject: [PATCH] Update Modules and Functors Tutorials #1778 --- data/tutorials/language/1ms_00_modules.md | 315 +++-------- data/tutorials/language/1ms_01_functors.md | 599 ++++++++++++++------- data/tutorials/language/1ms_02_dune.md | 268 +++++++++ 3 files changed, 757 insertions(+), 425 deletions(-) create mode 100644 data/tutorials/language/1ms_02_dune.md diff --git a/data/tutorials/language/1ms_00_modules.md b/data/tutorials/language/1ms_00_modules.md index dd4c2d7ffd..588cff13fa 100644 --- a/data/tutorials/language/1ms_00_modules.md +++ b/data/tutorials/language/1ms_00_modules.md @@ -1,63 +1,65 @@ --- -id : modules +id: modules title: Modules description: > - Learn about OCaml modules and how they can be used to cleanly separate distinct parts of your program + Modules are collections of definitions grouped together. This is the basic means to organise OCaml software. category: "Module System" --- # Modules +## Introduction + +In this tutorial, we look at how to use and define modules. + +Modules are collections of definitions grouped together. This is the basic means to organise OCaml software. Separate concerns can and should be isolated into separate modules. + +**Prerequisites**: [Values and Functions](/docs/values-and-functions) and [Basic Data Types and Pattern Matching](/docs/basic-data-types) + ## Basic Usage ### File-Based Modules In OCaml, every piece of code is wrapped into a module. Optionally, a module itself can be a submodule of another module, pretty much like directories in a -file system - but we don't do this very often. +file system. -When you write a program, let's say using two files `amodule.ml` and -`bmodule.ml`, each of these files automatically defines a module named -`Amodule` and a module named `Bmodule` that provide whatever you put into the +When you write a program, let's say using the two files `amodule.ml` and +`bmodule.ml`, each automatically defines a module named +`Amodule` and a module named `Bmodule`, which provides whatever you put into the files. Here is the code that we have in our file `amodule.ml`: - ```ocaml let hello () = print_endline "Hello" ``` -And here is what we have in `bmodule.ml`: - +This is what we have in `bmodule.ml`: ```ocaml let () = Amodule.hello () ``` -### Automatised Compilation - -In order to compile them using the [Dune](https://dune.build/) build system, -which is now the standard on OCaml, at least two configuration files are -required: +In order to compile them using the [Dune](https://dune.build/) build system, at least two configuration files are required: * The `dune-project` file, which contains project-wide configuration data. Here's a very minimal one: - ``` - (lang dune 3.4) + ```lisp + (lang dune 3.7) ``` * The `dune` file, which contains actual build directives. A project may have several of them, depending on the organisation of the sources. This is sufficient for our example: - ``` + ```lisp (executable (name bmodule)) ``` Here is how to create the configuration files, build the source, and run the -executable. +executable: ```bash -$ echo "(lang dune 3.4)" > dune-project +$ echo "(lang dune 3.7)" > dune-project $ echo "(executable (name bmodule))" > dune $ dune build $ dune exec ./bmodule.exe @@ -74,46 +76,16 @@ In a real-world project, it is preferable to start by creating the `dune` configuration files and directory structure using the `dune init project` command. -### Manual Compilation - -Alternatively, it is possible, but not recommended, to compile the files by -directly calling the compiler, either by using a single command: - - -```sh -$ ocamlopt -o hello amodule.ml bmodule.ml -``` - -Or, as a build system does, one by one: - - -```sh -$ ocamlopt -c amodule.ml -$ ocamlopt -c bmodule.ml -$ ocamlopt -o hello amodule.cmx bmodule.cmx -``` - -In both cases, a standalone executable is created - -```sh -$ ./hello -Hello -``` - -Note: It's necessary to place the source files in the correct order. The dependencies must come before -the dependent. In the first example above, putting `bmodule.ml` before `amodule.ml` -will result in an `Unbound module` error. - ### Naming and Scoping -Now we have an executable that prints `Hello`. As you can see, if you want to +Now we have an executable that prints `Hello`. If you want to access anything from a given module, use the name of the module (always -starting with a capital letter) followed by a dot and the thing that you want to use. +starting with a capital letter) followed by a dot and the thing you want to use. It may be a value, a type constructor, or anything else that a given module can provide. Libraries, starting with the standard library, provide collections of modules. -for example, `List.iter` designates the `iter` function from the `List` module. +For example, `List.iter` designates the `iter` function from the `List` module. If you are using a given module heavily, you may want to make its contents directly accessible. For this, we use the `open` directive. In our example, @@ -126,7 +98,7 @@ let () = hello () ``` Using `open` or not is a matter of personal taste. Some modules provide names -that are used in many other modules. This is the case of the `List` module for +that are used in many other modules. This is the case of the `List` module, for instance. Usually, we don't do `open List`. Other modules like `Printf` provide names that normally aren't subject to conflicts, such as `printf`. In order to avoid writing `Printf.printf` all over the place, it often makes sense to place @@ -141,29 +113,29 @@ let () = List.iter (printf "%s\n") data There are also local `open`s: ```ocaml -# let map_3d_matrix f m = - let open Array in - map (map (map f)) m;; -val map_3d_matrix : - ('a -> 'b) -> 'a array array array -> 'b array array array = -# let map_3d_matrix' f = - Array.(map (map (map f)));; -val map_3d_matrix' : - ('a -> 'b) -> 'a array array array -> 'b array array array = +# let sum_sq m = + let open List in + init m Fun.id |> map (fun i -> i * i) |> fold_left ( + ) 0;; +val sum_sq : int -> int = + +# let sym_sq' m = + Array.(init m Fun.id |> map (fun i -> i * i) |> fold_left ( + ) 0);; +val sum_sq' : int -> int = + ``` ## Interfaces and Signatures A module can provide a certain number of things (functions, types, submodules, -etc.) to the rest of the program that is using it. If nothing special is done, +etc.) to the rest of the program using it. If nothing special is done, everything that's defined in a module will be accessible from the outside. That's often fine in small personal programs, but there are many situations where it is better that a module only provides what it is meant to provide, not any of the auxiliary functions and types that are used internally. For this, we have to define a module interface, which will act as a mask over -the module's implementation. Just like a module derives from an `.ml` file, the -corresponding module interface or signature derives from an `.mli` file. It +the module's implementation. Just like a module derives from a `.ml` file, the +corresponding module interface or signature derives from a `.mli` file. It contains a list of values with their type. Let's rewrite our `amodule.ml` file to something called `amodule2.ml`: @@ -191,9 +163,9 @@ val hello : unit -> unit (** Displays a greeting message. *) ``` -(note the double asterisk at the beginning of the comment. It is a good habit +Note the double asterisk at the beginning of the comment. It is a good habit to document `.mli` files using the format supported by -[ocamldoc](/releases/4.14/htmlman/ocamldoc.html)) +[ocamldoc](/releases/4.14/htmlman/ocamldoc.html) The corresponding module `Bmodule2` is defined in file `bmodule2.ml`: @@ -204,7 +176,7 @@ let () = Amodule2.hello () The .`mli` files must be compiled before the matching `.ml` files. This is done automatically by Dune. We update the `dune` file to allow the compilation -of this example aside of the previous one. +of this example aside from the previous one. ```bash @@ -216,26 +188,10 @@ $ dune exec ./bmodule2.exe Hello 2 ``` -Here is how the same result can be achieved by calling the compiler manually. -Notice the `.mli` file is compiled using bytecode compiler `ocamlc`, while -`.ml` files are compiled to native code using `ocamlopt`: - - -```sh -$ ocamlc -c amodule2.mli -$ ocamlopt -c amodule2.ml -$ ocamlopt -c bmodule2.ml -$ ocamlopt -o hello2 amodule2.cmx bmodule2.cmx -$ ./hello -Hello -$ ./hello2 -Hello 2 -``` - ## Abstract Types What about type definitions? We saw that values such as functions can be -exported by placing their name and their type in an `.mli` file, e.g., +exported by placing their name and their type in an `.mli` file, e.g.: ```ocaml @@ -254,7 +210,7 @@ There are four options when it comes to writing the `.mli` file: 1. The type is completely omitted from the signature. 2. The type definition is copy-pasted into the signature. 3. The type is made abstract: only its name is given. -4. The record fields are made read-only: `type date = private { ... }` +4. The record fields are made read-only: `type date = private { ... }`. Case 3 would look like this: @@ -282,8 +238,8 @@ val years : date -> float The point is that only `create` and `sub` can be used to create `date` records. Therefore, it is not possible for the user to create ill-formed records. Actually, our implementation uses a record, but we could change it and -be sure that it will not break any code that relies on this module! This makes -a lot of sense in a library since subsequent versions of the same library can +be sure that it will not break any code relying on this module! This makes +a lot of sense in a library because subsequent versions of it can continue to expose the same interface while internally changing the implementation, including data structures. @@ -291,9 +247,9 @@ implementation, including data structures. ### Submodule Implementation -We saw that one `example.ml` file results automatically in one module +We saw that one `example.ml` file results automatically in the module implementation named `Example`. Its module signature is automatically derived -and is the broadest possible, or can be restricted by writing an `example.mli` +and is the broadest possible, or it can be restricted by writing an `example.mli` file. That said, a given module can also be defined explicitly from within a file. @@ -349,7 +305,7 @@ let hello_goodbye () = The definition of the `Hello` module above is the equivalent of a `hello.mli`/`hello.ml` pair of files. Writing all of that in one block of code -is not elegant so, in general, we prefer to define the module signature +is not elegant, so in general, we prefer to define the module signature separately: @@ -371,175 +327,46 @@ interfaces. ### Displaying the Interface of a Module You can use the OCaml toplevel to visualise the contents of an existing -module, such as `List`: +module, such as `Fun`: ```ocaml -# #show List;; -module List : +# #show Fun;; +module Fun : sig - type 'a t = 'a list = [] | (::) of 'a * 'a list - val length : 'a t -> int - val compare_lengths : 'a t -> 'b t -> int - val compare_length_with : 'a t -> int -> int - val cons : 'a -> 'a t -> 'a t - val hd : 'a t -> 'a - val tl : 'a t -> 'a t - val nth : 'a t -> int -> 'a - val nth_opt : 'a t -> int -> 'a option - val rev : 'a t -> 'a t - val init : int -> (int -> 'a) -> 'a t - val append : 'a t -> 'a t -> 'a t - val rev_append : 'a t -> 'a t -> 'a t - val concat : 'a t t -> 'a t - val flatten : 'a t t -> 'a t - val equal : ('a -> 'a -> bool) -> 'a t -> 'a t -> bool - val compare : ('a -> 'a -> int) -> 'a t -> 'a t -> int - val iter : ('a -> unit) -> 'a t -> unit - val iteri : (int -> 'a -> unit) -> 'a t -> unit - val map : ('a -> 'b) -> 'a t -> 'b t - val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t - val rev_map : ('a -> 'b) -> 'a t -> 'b t - val filter_map : ('a -> 'b option) -> 'a t -> 'b t - val concat_map : ('a -> 'b t) -> 'a t -> 'b t - val fold_left_map : ('a -> 'b -> 'a * 'c) -> 'a -> 'b t -> 'a * 'c t - val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b t -> 'a - val fold_right : ('a -> 'b -> 'b) -> 'a t -> 'b -> 'b - val iter2 : ('a -> 'b -> unit) -> 'a t -> 'b t -> unit - val map2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t - val rev_map2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t - val fold_left2 : ('a -> 'b -> 'c -> 'a) -> 'a -> 'b t -> 'c t -> 'a - val fold_right2 : ('a -> 'b -> 'c -> 'c) -> 'a t -> 'b t -> 'c -> 'c - val for_all : ('a -> bool) -> 'a t -> bool - val exists : ('a -> bool) -> 'a t -> bool - val for_all2 : ('a -> 'b -> bool) -> 'a t -> 'b t -> bool - val exists2 : ('a -> 'b -> bool) -> 'a t -> 'b t -> bool - val mem : 'a -> 'a t -> bool - val memq : 'a -> 'a t -> bool - val find : ('a -> bool) -> 'a t -> 'a - val find_opt : ('a -> bool) -> 'a t -> 'a option - val find_map : ('a -> 'b option) -> 'a t -> 'b option - val filter : ('a -> bool) -> 'a t -> 'a t - val find_all : ('a -> bool) -> 'a t -> 'a t - val filteri : (int -> 'a -> bool) -> 'a t -> 'a t - val partition : ('a -> bool) -> 'a t -> 'a t * 'a t - val partition_map : ('a -> ('b, 'c) Either.t) -> 'a t -> 'b t * 'c t - val assoc : 'a -> ('a * 'b) t -> 'b - val assoc_opt : 'a -> ('a * 'b) t -> 'b option - val assq : 'a -> ('a * 'b) t -> 'b - val assq_opt : 'a -> ('a * 'b) t -> 'b option - val mem_assoc : 'a -> ('a * 'b) t -> bool - val mem_assq : 'a -> ('a * 'b) t -> bool - val remove_assoc : 'a -> ('a * 'b) t -> ('a * 'b) t - val remove_assq : 'a -> ('a * 'b) t -> ('a * 'b) t - val split : ('a * 'b) t -> 'a t * 'b t - val combine : 'a t -> 'b t -> ('a * 'b) t - val sort : ('a -> 'a -> int) -> 'a t -> 'a t - val stable_sort : ('a -> 'a -> int) -> 'a t -> 'a t - val fast_sort : ('a -> 'a -> int) -> 'a t -> 'a t - val sort_uniq : ('a -> 'a -> int) -> 'a t -> 'a t - val merge : ('a -> 'a -> int) -> 'a t -> 'a t -> 'a t - val to_seq : 'a t -> 'a Seq.t - val of_seq : 'a Seq.t -> 'a t + external id : 'a -> 'a = "%identity" + val const : 'a -> 'b -> 'a + val flip : ('a -> 'b -> 'c) -> 'b -> 'a -> 'c + val negate : ('a -> bool) -> 'a -> bool + val protect : finally:(unit -> unit) -> (unit -> 'a) -> 'a + exception Finally_raised of exn end ``` -There is online documentation for each library. +There is online documentation for each library, for instance [`Fun`](/api/Fun.html) ### Module Inclusion Let's say we feel that a function is missing from the standard `List` module, -but we really want it as if it were part of it. In an `extensions.ml` file, we +but we really want it as if it were part of it. In an `extlib.ml` file, we can achieve this effect by using the `include` directive: ```ocaml -# module List = struct +module List = struct include List - let rec optmap f = function - | [] -> [] - | hd :: tl -> - match f hd with - | None -> optmap f tl - | Some x -> x :: optmap f tl - end;; -module List : - sig - type 'a t = 'a list = [] | (::) of 'a * 'a list - val length : 'a t -> int - val compare_lengths : 'a t -> 'b t -> int - val compare_length_with : 'a t -> int -> int - val cons : 'a -> 'a t -> 'a t - val hd : 'a t -> 'a - val tl : 'a t -> 'a t - val nth : 'a t -> int -> 'a - val nth_opt : 'a t -> int -> 'a option - val rev : 'a t -> 'a t - val init : int -> (int -> 'a) -> 'a t - val append : 'a t -> 'a t -> 'a t - val rev_append : 'a t -> 'a t -> 'a t - val concat : 'a t t -> 'a t - val flatten : 'a t t -> 'a t - val equal : ('a -> 'a -> bool) -> 'a t -> 'a t -> bool - val compare : ('a -> 'a -> int) -> 'a t -> 'a t -> int - val iter : ('a -> unit) -> 'a t -> unit - val iteri : (int -> 'a -> unit) -> 'a t -> unit - val map : ('a -> 'b) -> 'a t -> 'b t - val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t - val rev_map : ('a -> 'b) -> 'a t -> 'b t - val filter_map : ('a -> 'b option) -> 'a t -> 'b t - val concat_map : ('a -> 'b t) -> 'a t -> 'b t - val fold_left_map : ('a -> 'b -> 'a * 'c) -> 'a -> 'b t -> 'a * 'c t - val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b t -> 'a - val fold_right : ('a -> 'b -> 'b) -> 'a t -> 'b -> 'b - val iter2 : ('a -> 'b -> unit) -> 'a t -> 'b t -> unit - val map2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t - val rev_map2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t - val fold_left2 : ('a -> 'b -> 'c -> 'a) -> 'a -> 'b t -> 'c t -> 'a - val fold_right2 : ('a -> 'b -> 'c -> 'c) -> 'a t -> 'b t -> 'c -> 'c - val for_all : ('a -> bool) -> 'a t -> bool - val exists : ('a -> bool) -> 'a t -> bool - val for_all2 : ('a -> 'b -> bool) -> 'a t -> 'b t -> bool - val exists2 : ('a -> 'b -> bool) -> 'a t -> 'b t -> bool - val mem : 'a -> 'a t -> bool - val memq : 'a -> 'a t -> bool - val find : ('a -> bool) -> 'a t -> 'a - val find_opt : ('a -> bool) -> 'a t -> 'a option - val find_map : ('a -> 'b option) -> 'a t -> 'b option - val filter : ('a -> bool) -> 'a t -> 'a t - val find_all : ('a -> bool) -> 'a t -> 'a t - val filteri : (int -> 'a -> bool) -> 'a t -> 'a t - val partition : ('a -> bool) -> 'a t -> 'a t * 'a t - val partition_map : ('a -> ('b, 'c) Either.t) -> 'a t -> 'b t * 'c t - val assoc : 'a -> ('a * 'b) t -> 'b - val assoc_opt : 'a -> ('a * 'b) t -> 'b option - val assq : 'a -> ('a * 'b) t -> 'b - val assq_opt : 'a -> ('a * 'b) t -> 'b option - val mem_assoc : 'a -> ('a * 'b) t -> bool - val mem_assq : 'a -> ('a * 'b) t -> bool - val remove_assoc : 'a -> ('a * 'b) t -> ('a * 'b) t - val remove_assq : 'a -> ('a * 'b) t -> ('a * 'b) t - val split : ('a * 'b) t -> 'a t * 'b t - val combine : 'a t -> 'b t -> ('a * 'b) t - val sort : ('a -> 'a -> int) -> 'a t -> 'a t - val stable_sort : ('a -> 'a -> int) -> 'a t -> 'a t - val fast_sort : ('a -> 'a -> int) -> 'a t -> 'a t - val sort_uniq : ('a -> 'a -> int) -> 'a t -> 'a t - val merge : ('a -> 'a -> int) -> 'a t -> 'a t -> 'a t - val to_seq : 'a t -> 'a Seq.t - val of_seq : 'a Seq.t -> 'a t - val optmap : ('a -> 'b option) -> 'a t -> 'b t - end + let uncons = function + | [] -> None + | hd :: tl -> Some (hd, tl) +end ``` -It creates a module `Extensions.List` that has everything the standard `List` -module has, plus a new `optmap` function. From another file, all we have to do -to override the default `List` module is `open Extensions` at the beginning of -the `.ml` file: +It creates a module `Extlib.List` that has everything the standard `List` +module has, plus a new `uncons` function. In order to override the default `List` module from another `.ml` file, we merely need to add `open Extlib` at the beginning. - -```ocaml -open Extensions +## Conclusion -... +In OCaml, modules are the basic means of organising software. To sum up, a module is a collection of definitions wrapped under a name. These definitions can be submodules, which allows the creation of hierarchies of modules. Top-level modules must be files and are the units of compilation. Every module has an interface, which is the list of definitions a module exposes. By default, a module's interface exposes all its definitions, but this can be restricted using the interface syntax. -List.optmap ... -``` +Going further, here are the other means to handle OCaml software components: +- Functors, which act like functions from modules to modules +- Libraries, which are compiled modules bundled together +- Packages, which are installation and distribution units diff --git a/data/tutorials/language/1ms_01_functors.md b/data/tutorials/language/1ms_01_functors.md index 7a30813a75..2342a4fe58 100644 --- a/data/tutorials/language/1ms_01_functors.md +++ b/data/tutorials/language/1ms_01_functors.md @@ -2,189 +2,426 @@ id: functors title: Functors description: > - Learn about functors, modules parameterised by other modules + Functors essentially work the same way as functions. The difference is that we are passing modules instead of values. category: "Module System" --- # Functors -Functors are probably one of the most complex features of OCaml, but you don't -have to use them extensively to be a successful OCaml programmer. Actually, -you may never have to define a functor yourself, but you will surely encounter -them in the standard library. They are the only way of using the Set and Map -modules, but using them is not so difficult. - -## What Are Functors and Why Do We Need Them? - -A functor is a module that is parametrised by another module, just like a -function is a value which is parametrised by other values, the arguments. - -It allows one to parametrise a type by a value, which is not possible directly -in OCaml without functors. For example, we can define a functor that takes an -`int n` and returns a collection of array operations that work exclusively on -arrays of length `n`. If by mistake the programmer passes a regular array to one -of those functions, it will result in a compilation error. If we were not using -this functor but the standard array type, the compiler would not be able to -detect the error, and we would get a runtime error at some undetermined date in -the future, which is much worse. - -## Using an Existing Functor - -The standard library defines a `Set` module, which provides a `Make` functor. -This functor takes one argument, which is a module that provides (at least) two -things: the type of elements, given as `t` and the comparison function given as -`compare`. The point of the functor is to ensure that the same comparison -function will always be used, even if the programmer makes a mistake. - -For example, if we want to use sets of `ints`, we would do this: - -```ocaml -# module Int_set = - Set.Make (struct - type t = int - let compare = compare - end);; -module Int_set : - sig - type elt = int - type t - val empty : t - val is_empty : t -> bool - val mem : elt -> t -> bool - val add : elt -> t -> t - val singleton : elt -> t - val remove : elt -> t -> t - val union : t -> t -> t - val inter : t -> t -> t - val disjoint : t -> t -> bool - val diff : t -> t -> t - val compare : t -> t -> elt - val equal : t -> t -> bool - val subset : t -> t -> bool - val iter : (elt -> unit) -> t -> unit - val map : (elt -> elt) -> t -> t - val fold : (elt -> 'a -> 'a) -> t -> 'a -> 'a - val for_all : (elt -> bool) -> t -> bool - val exists : (elt -> bool) -> t -> bool - val filter : (elt -> bool) -> t -> t - val filter_map : (elt -> elt option) -> t -> t - val partition : (elt -> bool) -> t -> t * t - val cardinal : t -> elt - val elements : t -> elt list - val min_elt : t -> elt - val min_elt_opt : t -> elt option - val max_elt : t -> elt - val max_elt_opt : t -> elt option - val choose : t -> elt - val choose_opt : t -> elt option - val split : elt -> t -> t * bool * t - val find : elt -> t -> elt - val find_opt : elt -> t -> elt option - val find_first : (elt -> bool) -> t -> elt - val find_first_opt : (elt -> bool) -> t -> elt option - val find_last : (elt -> bool) -> t -> elt - val find_last_opt : (elt -> bool) -> t -> elt option - val of_list : elt list -> t - val to_seq_from : elt -> t -> elt Seq.t - val to_seq : t -> elt Seq.t - val to_rev_seq : t -> elt Seq.t - val add_seq : elt Seq.t -> t -> t - val of_seq : elt Seq.t -> t - end -``` - -For sets of strings, it is even easier because the standard library provides a -`String` module with a type `t` and a function `compare`. If you were following -carefully, by now you must have guessed how to create a module to -manipulate string sets: - -```ocaml -# module String_set = Set.Make (String);; -module String_set : - sig - type elt = string - type t = Set.Make(String).t - val empty : t - val is_empty : t -> bool - val mem : elt -> t -> bool - val add : elt -> t -> t - val singleton : elt -> t - val remove : elt -> t -> t - val union : t -> t -> t - val inter : t -> t -> t - val disjoint : t -> t -> bool - val diff : t -> t -> t - val compare : t -> t -> int - val equal : t -> t -> bool - val subset : t -> t -> bool - val iter : (elt -> unit) -> t -> unit - val map : (elt -> elt) -> t -> t - val fold : (elt -> 'a -> 'a) -> t -> 'a -> 'a - val for_all : (elt -> bool) -> t -> bool - val exists : (elt -> bool) -> t -> bool - val filter : (elt -> bool) -> t -> t - val filter_map : (elt -> elt option) -> t -> t - val partition : (elt -> bool) -> t -> t * t - val cardinal : t -> int - val elements : t -> elt list - val min_elt : t -> elt - val min_elt_opt : t -> elt option - val max_elt : t -> elt - val max_elt_opt : t -> elt option - val choose : t -> elt - val choose_opt : t -> elt option - val split : elt -> t -> t * bool * t - val find : elt -> t -> elt - val find_opt : elt -> t -> elt option - val find_first : (elt -> bool) -> t -> elt - val find_first_opt : (elt -> bool) -> t -> elt option - val find_last : (elt -> bool) -> t -> elt - val find_last_opt : (elt -> bool) -> t -> elt option - val of_list : elt list -> t - val to_seq_from : elt -> t -> elt Seq.t - val to_seq : t -> elt Seq.t - val to_rev_seq : t -> elt Seq.t - val add_seq : elt Seq.t -> t -> t - val of_seq : elt Seq.t -> t - end -``` - -(the parentheses are necessary) - -## Defining Functors - -A functor with one argument can be defined like this: - - -```ocaml -module F (X : X_type) = struct - ... -end -``` - -where `X` is the module that will be passed as argument, and `X_type` is its -signature, which is mandatory. - -The signature of the returned module itself can be constrained, using this -syntax: - - -```ocaml -module F (X : X_type) : Y_type = -struct - ... -end -``` - -or by specifying this in the `.mli` file: - - -```ocaml -module F (X : X_type) : Y_type -``` - -Overall, the syntax of functors is hard to grasp. The best may be to look at -the source files -[`set.ml`](https://github.com/ocaml/ocaml/blob/trunk/stdlib/set.ml) or -[`map.ml`](https://github.com/ocaml/ocaml/blob/trunk/stdlib/map.ml) of the -standard library. + +## Introduction + +In this tutorial, we look at how to use a functor, how to write a functor, and show a couple of use cases involving functors. + +As suggested by the name, a _functor_ is almost like a function. However, while functions are between values, functors are between modules. A functor takes a module as a parameter and returns a module as a result. A functor is a parametrised module. + +**Prerequisites**: [Modules](/docs/modules). + +## Project Setup + +This tutorial uses the [Dune](https://dune.build) build tool. Make sure you have installed version 3.7 or later. We start by creating a fresh project. We need a folder named `funkt` with files `dune-project`, `dune`, `funkt.opam`, and `funkt.ml`. The latter two are created empty. +```shell +$ mkdir funkt; cd funkt + +$ touch funkt.opam funkt.ml +``` + +**`dune-project`** +```lisp +(lang dune 3.7) +``` + +**`dune`** +```lisp +(executable + (name funkt) + (public_name funkt) + (libraries str)) +``` + +Check this works using the `dune exec funkt` command, it shouldn't do anything (the empty file is valid OCaml syntax) but it shouldn't fail either. The stanza `libraries str` will be used later. + +## Using an Existing Functor: `Set.Make` + +The standard library contains a [`Set`](/api/Set.html) module providing a data structure that allows operations like union and intersection. To use the provided type and its associated [functions](/api/Set.S.html), it's necessary to use the functor provided by `Set`. For reference only, here is a shortened version of the interface of `Set`: +```ocaml +module type OrderedType = sig + type t + val compare : t -> t -> int +end + +module type S = sig + (** This is the module's signature returned by applying `Make` *) +end + +module Make : functor (Ord : OrderedType) -> S +``` + +Here is how this reads (starting from the bottom-up, then going up): +* Like a function (indicated by the arrow `->`), the functor `Set.Make` + - takes a module having `Set.OrderedType` as signature and + - returns a module having `Set.S` as signature +* The module type `Set.S` is the signature of some sort of set +* The module type `Set.OrderedType` is the signature of elements of a + +**Note**: Most set operation implementations must use a comparison function. Using `Stdlib.compare` would make it impossible to use a user-defined comparison algorithm. Passing the comparison function as a higher-order parameter, as done in `Array.sort`, for example, would add a lot of boilerplate code. Providing set operations as a functor allows specifying the comparison function only once. + +Here is what it can look like in our project: + +**`funkt.ml`** + +```ocaml +module StringCompare = struct + type t = string + let compare = String.compare +end + +module StringSet = Set.Make(StringCompare) +``` + +This defines a module `Funkt.StringSet`. What `Set.Make` needs is: +- A type `t`, here `string` +- A function allowing to compare two values of type `t`, here `String.compare` + +However, since the module `String` defines +- A type name `t`, which is an alias for `string` +- A function `compare` of type `t -> t -> bool` that allows to compare two strings + +This can be simplified using an _anonymous module_ expression: +```ocaml +module StringSet = Set.Make(struct + type t = string + let compare = String.compare +end) +``` + +The module expression `struct ... end` is inlined in the call to `Set.Make`. + +The be simplified even further into this: +```ocaml +module StringSet = Set.Make(String) +``` + +In both versions, the result module from the functor application `Set.Make(String)` is bound to the name `StringSet`, and it has the signature `Set.S`. The module `StringSet` provides set operations and is parametrized by the module `String`. This means the function `String.compare` is used internally by `StringSet`, inside the implementation of the functions it provides. Making a group of functions (here those provided by `StringSet`) use another group of functions (here only `String.compare`) is the role of a functor. + +With this, the command `dune exec funkt` shouldn't do anything, but it shouldn't fail either. + +Add some code to the `funkt.ml` file to produce an executable that does something and checks the result. + +**`funkt.ml`** +```ocaml +module StringSet = Set.Make(String) + +let _ = + In_channel.input_lines stdin + |> List.concat_map Str.(split (regexp "[ \t.,;:()]+")) + |> StringSet.of_list + |> StringSet.iter print_endline +``` + +Here are the types of functions used throughout the pipe: +- `In_channel.input_lines : in_channel -> string list`, +- `Str.(split (regexp "[ \t.,;:()]+")) : string -> string list`, +- `List.concat_map : ('a -> 'b list) -> 'a list -> 'b list`, +- `StringSet.of_list : string list -> StringSet.t`, and +- `StringSet.iter : StringSet.t -> unit`. + +This reads the following way: +- Read lines of text from standard input, that produces a list of strings. +- Split each string using a regular expression and flatten the resulting list of lists into a list. +- Convert the list of strings into a set. +- Display each element of the set. + +The functions `StringSet.of_list` and `StringSet.iter` are available as the result of the functor application. + +```shell +$ dune exec funkt < dune +executable +libraries +name +public_name +str +funkt +``` + +There are no duplicates in a `Set`. Therefore, the string `"funkt"` is only displayed once, although it appears twice in the `dune` file. + +## Extending a Module with a Standard Library Functor + +Using the `include` statement, here is an alternate way to expose the module created by `Set.Make(String)`: + +**`funkt.ml`** +```ocaml +module String = struct + include String + module Set = Set.Make(String) +end + +let _ = + stdin + |> In_channel.input_lines + |> List.concat_map Str.(split (regexp "[ \t.,;:()]+")) + |> String.Set.of_list + |> String.Set.iter print_endline +``` + +This allows the user to seemingly extend the module `String` with a submodule `Set`. Check the behaviour using `dune exec funkt < dune`. + +## Functors are Parametrised Modules + +### Functors from the Standard Library + +Some ”modules” provide operations over an abstract type and need to be supplied with a parameter module used in their implementation. These “modules” are parametrised, in other words, functors. That's the case for the sets, maps, and hash tables provided by the standard library. It works like a contract between the functor and the developer: +* If you provide a module that implements what is expected (the parameter interface) +* The functor returns a module that implements what is promised (the result interface) + +Here is the module's signature that the functors `Set.Make` and `Map.Make` expect: +```ocaml +module type OrderedType = sig + type t + val compare : t -> t -> int +end +``` + +Here is the module's signature that the functor `Hashtbl.Make` expects: +```ocaml +module type HashedType = sig + type t + val equal : t -> t -> bool + val hash : t -> int +end +``` + +**Note**: `Ordered.t` is a type of set elements or map keys, `Set.S.t` is a type of set, and `Map.S.t` is a type of mapping. `HashedType.t` is a type of hash table keys, and `Hashtbl.S.t` is a type of hash table. + +The functors `Set.Make`, `Map.Make`, and `Hashtbl.Make` return modules satisfying the interfaces `Set.S`, `Map.S`, and `Hashtbl.S` (respectively), which all contain an abstract type `t` and associated functions. Refer to the documentation for the details about what they provide: +* [`Set.S`](/api/Set.S.html) +* [`Map.S`](/api/Map.S.html) +* [`Hashtbl.S`](/api/Hashtbl.S.html) + +### Writing Your Own Functors + +There are many kinds of [heap](https://en.wikipedia.org/wiki/Heap_(data_structure)) data structures. Example include binary heaps, leftist heaps, binomial heaps, or Fibonacci heaps. + +The kind of data structures and algorithms used to implement a heap is not discussed in this document. + +The common prerequisite to implement any heap is a means to compare the elements they contain. That's the same signature as the parameter of `Set.Make` and `Map.Make`: +```ocaml +module type OrderedType = sig + type t + val compare : t -> t -> int +end +``` + +Using such a parameter, a heap implementation must provide at least this interface: +```ocaml +module type HeapType = sig + type elt + type t + val empty : t + val is_empty : t -> bool + val insert : t -> elt -> t + val merge : t -> t -> t + val find : t -> elt + val delete : t -> t +end +``` + +Heap implementations can be represented as functors from `OrderedType` into `HeapType`. Each kind of heap would be a different functor. + +Here is the skeleton of a possible implementation: + +**heap.ml** +```ocaml +module type OrderedType = sig + type t + val compare : t -> t -> int +end + +module type S = sig + type elt + type t + val empty : t + val is_empty : t -> bool + val insert : t -> elt -> t + val merge : t -> t -> t + val find : t -> elt + val delete : t -> t +end + +module Binary(Elt: OrderedType) : S = struct + type elt = | (* Replace by your own *) + type t = | (* Replace by your own *) + (* Add private functions here *) + let is_empty h = failwith "Not yet implemented" + let insert h e = failwith "Not yet implemented" + let merge h1 h2 = failwith "Not yet implemented" + let find h = failwith "Not yet implemented" + let delete h = failwith "Not yet implemented" +end +``` + +Here, binary heaps is the only implementation suggested. This can be extended to other implementations by adding one functor per each (e.g., `Heap.Leftist`, `Heap.Binomial`, `Heap.Fibonacci`, etc.). + +## Injecting Dependencies Using Functors + +**Dependencies Between Modules** + +Here is a new version of the `funkt` program: + +**`funkt.ml`** +```ocaml +module StringSet = Set.Make(String) + +module IterPrint : sig + val f : string list -> unit +end = struct + let f = List.iter (fun s -> Out_channel.output_string stdout (s ^ "\n")) +end + +let _ = + stdin + |> In_channel.input_lines + |> List.concat_map Str.(split (regexp "[ \t.,;:()]+")) + |> StringSet.of_list + |> StringSet.elements + |> IterPrint.f +``` + +It embeds an additional `IterPrint` module that exposes a single function `f` of type `string list -> unit` and has two dependencies: + - Module `List` through `List.iter` and the type of its `f` function + - Module `Out_channel` through `Out_channel.output_string` + +Check the behaviour of the program using `dune exec funkt < dune`. + +**Dependency Injection** + +[Dependency injection](https://en.wikipedia.org/wiki/Dependency_injection) is a way to parametrise over a dependency. + +Here is a refactoring of the module `IterPrint` to make of this technique: + +**`iterPrint.ml`** +```ocaml +module type Iterable = sig + type 'a t + val iter : ('a -> unit) -> 'a t -> unit +end + +module type S = sig + type 'a t + val f : string t -> unit +end + +module Make(Dep: Iterable) : S with type 'a t := 'a Dep.t = struct + let f = Dep.iter (fun s -> Out_channel.output_string stdout (s ^ "\n")) +end +``` + +The module `IterPrint` is refactored into a functor that takes as a parameter a module providing the function `iter`. The `with type 'a t := 'a Dep.t` constraint means the type `t` from the parameter `Dep` replaces the type `t` in the result module. This allows the type of `f` to use the type `t` from the parameter module `Dep`. With this refactoring, `IterPrint` only has one dependency; at the time it is compiled, no implementation of function `iter` is available yet. + +**Note**: An OCaml interface file must be a module, not a functor. Functors must be embedded inside modules. Therefore, it is customary to call them `Make`. + +**`funkt.ml`** + +```ocaml +module StringSet = Set.Make(String) +module IterPrint = IterPrint.Make(List) + +let _ = + stdin + |> In_channel.input_lines + |> List.concat_map Str.(split (regexp "[ \t.,;:()]+")) + |> StringSet.of_list + |> StringSet.elements + |> IterPrint.f +``` + +The dependency `List` is _injected_ when compiling the module `Funkt`. Observe that the code using `IterPrint` is unchanged. Check the program's behaviour using `dune exec funkt < dune`. + +**Replacing a Dependency** + +Now, replacing the implementation of `iter` inside `IterListPrint` is no longer a refactoring; it is another functor application with another dependency. Here, `Array` replaces `List`: + +**`funkt.ml`** +```ocaml +module StringSet = Set.Make(String) +module IterPrint = IterPrint.Make(Array) + +let _ = + stdin + |> In_channel.input_lines + |> List.concat_map Str.(split (regexp "[ \t.,;:()]+")) + |> StringSet.of_list + |> StringSet.elements + |> Array.of_list + |> IterPrint.f +``` + +Check the program's behaviour using `dune exec funkt < dune`. + +**Note**: The functor `IterPrint.Make` returns a module that exposes the type from the injected dependency (here first `List.t` then `Array.t`). That's why a `with type` constraint is needed. If the dependency was an _implementation detail_ that wasn't exposed in the signature of `IterMake`'s initial version (i.e., in the type of `IterMake.f`), that constraint wouldn't be needed. Plus, the call site of `IterPrint.f` would be unchanged when injecting another dependency. + +## Write a Functor to Extend Modules + +In this section, we define a functor to extend several modules in the same way. This is the same idea as in the [Extending a Module with a Standard Library Functor](#extending-a-module-with-a-standard-library-functor), except we write the functor ourselves. + +Create a fresh directory with the following files: + +**`dune-project`** +```lisp +(lang dune 3.7) +``` +**`dune`** +```lisp +(library (name scanLeft)) +``` + +**`scanLeft.ml`** +```ocaml +module type LeftFoldable = sig + type 'a t + val fold_left : ('b -> 'a -> 'b) -> 'b -> 'a t -> 'b + val of_list : 'a list -> 'a t +end + +module type S = sig + type 'a t + val scan_left : ('b -> 'a -> 'b) -> 'b -> 'a t -> 'b t +end + +module Make(F: LeftFoldable) : S with type 'a t := 'a F.t = struct + let scan_left f b u = + let f (b, u) a = let b' = f b a in (b', b' :: u) in + u |> F.fold_left f (b, []) |> snd |> List.rev |> F.of_list +end +``` + +Run the `dune utop` command. Once inside the toplevel, enter the following commands. +```ocaml +# module Array = struct + include Stdlib.Array + include ScanLeft.Make(Stdlib.Array) + end;; + +# module List = struct + include List + include ScanLeft.Make(struct + include List + let of_list = Fun.id + end) + end;; + +# Array.init 10 Fun.id |> Array.scan_left ( + ) 0;; +- : int array = [|0; 1; 3; 6; 10; 15; 21; 28; 36; 45|] + +# List.init 10 Fun.id |> List.scan_left ( + ) 0;; +- : int list = [0; 1; 3; 6; 10; 15; 21; 28; 36; 45] +``` + +Modules `Array` and `List` appear augmented with `Array.scan_left` and `List.scan_left`. For brevity, the output of the first two toplevel commands is not shown here. + +## Conclusion + +Functors are pretty unique to the ML family of programming languages. They provide a means to pass definitions inside a module. The same behaviour can be achieved with high-order parameters. However, functors allow passing several definitions at once, which is more convenient. + +Functor application essentially works the same way as function application: passing parameters and getting results. The difference is that we are passing modules instead of values. Beyond comfort, it enables a design approach where concerns are not only separated in silos, which is enabled by modules, but also in stages stacked upon each other. diff --git a/data/tutorials/language/1ms_02_dune.md b/data/tutorials/language/1ms_02_dune.md new file mode 100644 index 0000000000..20dbb6a697 --- /dev/null +++ b/data/tutorials/language/1ms_02_dune.md @@ -0,0 +1,268 @@ +--- +id: libraries-dune +title: Libraries With Dune +description: > + Dune provides several means to arrange modules into libraries. We look at Dune's mechanisms for structuring projects with libraries that contain modules. +category: "Module System" +--- + +# Libraries With Dune + +## Introduction + +Dune provides several means to arrange modules into libraries. We look at Dune's mechanisms for structuring projects with libraries that contain modules. + +This tutorial uses the [Dune](https://dune.build) build tool. Make sure you have version 3.7 or later installed. + +**Requirements**: [Modules](/docs/modules) and [Functors](/docs/modules). + +## Minimum Project Setup + +This section details the structure of an almost-minimum Dune project setup. Check [Your First OCaml Program](/docs/your-first-program) for automatic setup using the `dune init proj` command. +```shell +$ mkdir mixtli; cd mixtli + +$ touch mixtli.opam +``` + +In this directory, create four more files: `dune-project`, `dune`, `cloud.ml`, and `wmo.ml`: + +**`dune-project`** +```lisp +(lang dune 3.7) +``` + +This file contains the global project configuration. It's kept to the bare minimum, including the `lang dune` stanza that specifies the required Dune version. + +**`dune`** +```lisp +(executable + (name cloud) + (public_name nube)) +``` + +Each folder that requires some sort of build must contain a `dune` file. The `executable` stanza means an executable program is built. +- The `name cloud` stanza means the file `cloud.ml` contains the executable. +- The `public_name nube` stanza means the executable is made available using the name `nube`. + +**`wmo.ml`** +```ocaml +module Stratus = struct + let nimbus = "Nimbostratus (Ns)" +end + +module Cumulus = struct + let nimbus = "Cumulonimbus (Cb)" +end +``` + +**`cloud.ml`** +```ocaml +let () = + Wmo.Stratus.nimbus |> print_endline; + Wmo.Cumulus.nimbus |> print_endline +``` + +Here is the resulting output: +```shell +$ dune exec nube +Nimbostratus (Ns) +Cumulonimbus (Cb) +``` + + +Here is the folder contents: +```shell +$ tree +. +├── mixtli.opam +├── dune +├── dune-project +├── cloud.ml +└── wmo.ml +``` + +Dune stores the files it creates in a folder named `_build`. In a project managed using Git, the `_build` folder should be ignored +```shell +$ echo _build >> .gitignore +``` + +In OCaml, each source file is compiled into a module. In the `mixtli` project, the file `cloud.ml` creates a module named `Cloud`, the file `wmo.ml` creates a module `Wmo` that contains two submodules: `Stratus` and `Cumulus`. + +Here are the different names: +* `mixtli` is the project's name (it means *cloud* in Nahuatl). +* `cloud.ml` is the OCaml source file's name, referred as `cloud` in the `dune` file. +* `nube` is the executable command's name (it means *cloud* in Spanish). +* `Cloud` is the name of the module associated with the file `cloud.ml`. +* `Wmo` is the name of the module associated with the file `wmo.ml`. + +The `dune describe` command allows having a look at the project's module structure. Here is its output: +```lisp +((root /home/cuihtlauac/caml/mixtli-dune) + (build_context _build/default) + (executables + ((names (cloud)) + (requires ()) + (modules + (((name Wmo) + (impl (_build/default/wmo.ml)) + (intf ()) + (cmt (_build/default/.cloud.eobjs/byte/wmo.cmt)) + (cmti ())) + ((name Cloud) + (impl (_build/default/cloud.ml)) + (intf ()) + (cmt (_build/default/.cloud.eobjs/byte/cloud.cmt)) + (cmti ())))) + (include_dirs (_build/default/.cloud.eobjs/byte))))) +``` + + +## Libraries + + +In OCaml, a library is a collection of modules. By default, when Dune builds a library, it wraps the bundled modules into a module. Dune creates libraries from folders. Let's look at an example, here the folder is `lib`: +```shell +$ mkdir lib +``` + +The `lib` folder is populated with the following files. + +**`lib/dune`** +```lisp +(library (name wmo)) +``` + +**`lib/cumulus.mli`** +```ocaml +val stratus : string +``` + +**`lib/cumulus.ml`** +```ocaml +let nimbus = "Cumulonimbus (Cb)" +``` + +**`lib/stratus.mli`** +```ocaml +val cumulus : string +``` + +**`lib/stratus.ml`** +```ocaml +let nimbus = "Nimbostratus (Ns)" +``` + +All the modules found in the `lib` folder are bundled into the `Wmo` module. This module is the same as what we had in the `wmo.ml` file. To avoid redundancy, we delete it: +```shell +$ rm wmo.ml +``` + +We update the `dune` file building the executable to use the library as a dependency. + +**`dune`** +```lisp +(executable + (name cloud) + (public_name nube) + (libraries wmo)) +``` + +**Observations**: +* Dune creates a module `Wmo` from the contents of folder `lib`. +* The folder's name (here `lib`) is irrelevant. +* The library name appears uncapitalised (`wmo`) in `dune` files: + - In its definition, in `lib/dune` + - When used as a dependency in `dune` + +## Library Wrapper Modules + +By default, when Dune bundles modules into a library, they are automatically wrapped into a module. It is possible to manually write the wrapper file. The wrapper file must have the same name as the library. + +Here, we are creating a wrapper file for the `wmo` library from the previous section. + +**`lib/wmo.ml`** +```ocaml +module Cumulus = Cumulus +module Stratus = Stratus +``` + +Here is how to make sense of these module definitions: +- On the left-hand side, `module Cumulus` means module `Wmo` contains a submodule named `Cumulus`. +- On the right-hand side, `Cumulus` refers to the module defined in the file `lib/cumulus.ml`. + + +Run `dune exec nube` to see that the behaviour of the program is the same as in the previous section. + +When a library folder contains a wrapper module (here `wmo.ml`), it is the only one exposed. All other file-based modules from that folder that do not appear in the wrapper module are private. + +Using a wrapper file makes several things possible: +- Have different public and internal names, `module CumulusCloud = Cumulus` +- Define values in the wrapper module, `let ... = ` +- Expose module resulting from functor application, `module StringSet = Set.Make(String)` +- Apply the same interface type to several modules without duplicating files +- Hide modules by not listing them + +## Include Subdirectories + +By default, Dune builds a library from the modules found in the same folder as the `dune` file, but it doesn't look into subfolders. It is possible to change this behaviour. + +In this example, we create subdirectories and move files there. +```shell +$ mkdir lib/cumulus lib/stratus +$ mv lib/cumulus.ml lib/cumulus/m.ml +$ mv lib/cumulus.mli lib/cumulus/m.mli +$ mv lib/stratus.ml lib/stratus/m.ml +$ mv lib/stratus.mli lib/stratus/m.mli +``` + +Change from the default behaviour with the `include_subdirs` stanza. + +**`lib/dune`** +```lisp +(include_subdirs qualified) +(library (name wmo)) +``` + +Update the library wrapper to expose the modules created from the subdirectories. + +**`wmo.ml`** +```ocaml +module Cumulus = Cumulus.M +module Stratus = Stratus.M +``` + +Run `dune exec nube` to see that the behaviour of the program is the same as in the two previous sections. + +The `include_subdirs qualified` stanza works recursively, except on subfolders containing a `dune` file. See the [Dune](https://dune.readthedocs.io/en/stable/dune-files.html#include-subdirs) [documentation](https://github.com/ocaml/dune/issues/1084) for [more](https://discuss.ocaml.org/t/upcoming-dune-feature-include-subdirs-qualified) on this [topic](https://github.com/ocaml/dune/tree/main/test/blackbox-tests/test-cases/include-qualified). + + + +## Conclusion + +The OCaml module system allows organizing a project in many ways. Dune provides several means to arrange modules into libraries. +