Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Little bit faster BatList.cartesian_product #998

Closed
wants to merge 3 commits into from

Conversation

fccm
Copy link
Member

@fccm fccm commented Dec 29, 2020

More details in this issue: #997

Also by avoiding the call to List.concat seems to make this function tail-rec, while the doc says that List.concat is not tail-rec.

You can test this PR with:

$ ocamlfind ocamlopt -linkpkg -package "unix,num,bigarray,str,bytes" \
        -I _build/src/ batteries.cmxa -o cp.opt cp.ml

$ cat cp.ml

let cartesian_product_orig l1 l2 =
  List.concat (List.map (fun a -> List.map (fun b -> (a, b)) l2) l1)

let cartesian_product = BatList.cartesian_product
let cartesian_product_rev = BatList.cartesian_product_rev

let () =
  let l1 = List.init 2_000 (fun i -> i) in
  let l2 = List.init 4_000 (fun i -> i) in
  let p =
    match Sys.argv.(1) with
    | "concat" -> cartesian_product_orig l1 l2
    | "fold" -> cartesian_product l1 l2
    | "fold_rev" -> cartesian_product_rev l1 l2
    | _ -> raise Exit
  in
  Printf.printf "len: %d\n%!" (List.length p);
;;

$ time ./cp.opt concat
len: 8000000

real	0m2,566s
user	0m2,392s
sys	0m0,172s

$ time ./cp.opt fold
len: 8000000

real	0m2,355s
user	0m2,215s
sys	0m0,139s

$ time ./cp.opt fold_rev
len: 8000000

real	0m2,176s
user	0m2,046s
sys	0m0,128s

this implementation avoids the call to List.concat

as explained in issue ocaml-batteries-team#997
same than BatList.cartesian_product but avoids the calls to List.rev
@UnixJunkie
Copy link
Member

Can you send a PR with just your version of BatList.cartesian_product?
Please forgive my terseness and lag on github; I am busy with work and several projects.

@fccm
Copy link
Member Author

fccm commented Jan 8, 2021

Can you send a PR with just your version of BatList.cartesian_product?

I think this is what I did. The *_iter and *_fold versions are in a different PR #999.
Or maybe you mean you would like that I split the PR with the *_rev one too?

Please forgive my terseness and lag on github; I am busy with work and several projects.

I also take quite a while between each steps, I can understand :)

@fccm
Copy link
Member Author

fccm commented Jan 8, 2021

For the base cartesian_product maybe we could use the local fold_right instead of fold_left.
Then we would don't need to reverse the lists to preserve the output order of the previous version.
(The drawback would be then that it would be more difficult to copy/past this function for someone who don't want to add a dependency to batteries.)

@fccm
Copy link
Member Author

fccm commented Jan 8, 2021

Performance of using fold_right seems to be in the middle between BatList.cartesian_product and BatList.cartesian_product_rev

let cartesian_product_right l1 l2 =
  List.fold_right
    (fun a acc ->
       List.fold_right
         (fun b acc -> (a, b)::acc)
         l2 acc)
    l1 []
$ time ./cp.opt fold
len: 8000000

real	0m2,396s
user	0m2,269s
sys	0m0,118s

$ time ./cp.opt fold_rev
len: 8000000

real	0m2,215s
user	0m2,100s
sys	0m0,114s

$ time ./cp.opt right
len: 8000000

real	0m2,264s
user	0m2,120s
sys	0m0,115s

Using 2 fold_left without List.rev is still the fastest, but doesn't preserve the same order of the output than the current version.

Using 2 fold_right seems faster than 2 fold_left with the List.rev calls that preserves the same order of the output.

@UnixJunkie
Copy link
Member

Can you compare with this one:

let cart_prod l1 l2 =                                                           
  let a1 = A.of_list l1 in                                                      
  let a2 = A.of_list l2 in                                                      
  let res = ref [] in                                                           
  let n = A.length a1 in                                                        
  let m = A.length a2 in                                                        
  for i = n - 1 downto 0 do                                                     
    let a = A.unsafe_get a1 i in                                                
    for j = m - 1 downto 0 do                                                   
      res := (a, A.unsafe_get a2 j) :: !res                                     
    done                                                                        
  done;                                                                         
  !res

@UnixJunkie
Copy link
Member

You could use one of the several bench libraries for ocaml.
Also, you could try for various sizes like: 1k,2k,5k,10k

@gasche
Copy link
Member

gasche commented Jan 12, 2021

Note: we have a benchsuite/ directory with benchmarks for various versions of some functions. I think it's a good practice to always include a benchmark there when discussing rewrites for performance.

@gasche
Copy link
Member

gasche commented Jan 15, 2021

I would use the following implementation, relying on Batteries' support for destination-passing-style.

let cartesian_product l1 l2 =
  match l1, l2 with
  | [], _ | _, [] -> []
  | _, _ ->
    let dst = Acc.dummy () in
    let _ =
      List.fold_left (fun dst x1 ->
        List.fold_left (fun dst x2 ->
          Acc.accum dst (x1, x2)
        ) dst l2
     ) dst l1
    in dst.tl

I have not checked, but it should have performance comparable to @fccm's proposal, and preserve the output order.

@fccm
Copy link
Member Author

fccm commented Jan 15, 2021

$ time ./cp.opt orig
len: 8000000

real    0m2,631s
user    0m2,469s
sys     0m0,161s

$ time ./cp.opt fold
len: 8000000

real    0m2,385s
user    0m2,259s
sys     0m0,125s

$ time ./cp.opt fold_rev
len: 8000000

real    0m2,328s
user    0m2,197s
sys     0m0,127s

$ time ./cp.opt dst
len: 8000000

real    0m2,517s
user    0m2,399s
sys     0m0,116s

$ time ./cp.opt a
len: 8000000

real    0m2,235s
user    0m2,105s
sys     0m0,128s
(* ocamlfind ocamlopt -linkpkg -package "unix,num,bigarray,str,bytes"
         -I _build/src/ batteries.cmxa -o cp.opt cp.ml
*)
(* Thanks to Jacques Garrigue for suggesting the following structure *)
type 'a mut_list =  {
  hd: 'a;
  mutable tl: 'a list
}

external inj : 'a mut_list -> 'a list = "%identity"

module Acc = struct
  let dummy () =
    { hd = Obj.magic (); tl = [] }
  let create x =
    { hd = x; tl = [] }
  let accum acc x =
    let cell = create x in
    acc.tl <- inj cell;
    cell
end

let cartesian_product_orig l1 l2 =
  List.concat (List.map (fun a -> List.map (fun b -> (a, b)) l2) l1)

let cartesian_product_fold l1 l2 =
  let l1 = List.rev l1 in
  let l2 = List.rev l2 in
  List.fold_left
    (fun acc a ->
       List.fold_left
         (fun acc b -> (a, b)::acc)
         acc l2)
    [] l1

let cartesian_product_fold_rev l1 l2 =
  List.fold_left
    (fun acc a ->
       List.fold_left
         (fun acc b -> (a, b)::acc)
         acc l2)
    [] l1

let cartesian_product_dst l1 l2 =
  match l1, l2 with
  | [], _ | _, [] -> []
  | _, _ ->
      let dst = Acc.dummy () in
      let _ =
        List.fold_left (fun dst x1 ->
          List.fold_left (fun dst x2 ->
            Acc.accum dst (x1, x2)
          ) dst l2
       ) dst l1
      in dst.tl

module A = BatArray

let cartesian_product_a_unsafe_get l1 l2 =                                                           
  let a1 = A.of_list l1 in                                                      
  let a2 = A.of_list l2 in                                                      
  let res = ref [] in                                                           
  let n = A.length a1 in                                                        
  let m = A.length a2 in                                                        
  for i = n - 1 downto 0 do                                                     
    let a = A.unsafe_get a1 i in                                                
    for j = m - 1 downto 0 do                                                   
      res := (a, A.unsafe_get a2 j) :: !res                                     
    done                                                                        
  done;                                                                         
  !res

let () =
  let l1 = List.init 2_000 (fun i -> i) in
  let l2 = List.init 4_000 (fun i -> i) in
  let p =
    match Sys.argv.(1) with
    | "orig" -> cartesian_product_orig l1 l2
    | "fold" -> cartesian_product_fold l1 l2
    | "fold_rev" -> cartesian_product_fold_rev l1 l2
    | "dst" -> cartesian_product_dst l1 l2
    | "a" -> cartesian_product_a_unsafe_get l1 l2
    | _ -> raise Exit
  in
  Printf.printf "len: %d\n%!" (List.length p);
;;

@fccm
Copy link
Member Author

fccm commented Jan 15, 2021

When I run make bench I get exception Invalid_argument("Array.make") or Assertion failed for most benches.

@UnixJunkie
Copy link
Member

@fccm if you use a benchmarking library, you can have statistical results like variance and statistical significance of the observed change

@fccm
Copy link
Member Author

fccm commented Jan 18, 2021

I agree with you.
Just that benchmarks are currently broken, see issue #1003

@UnixJunkie
Copy link
Member

@fccm you should really use a benchmark library and try the one suggested by @gasche

@UnixJunkie
Copy link
Member

Under benchsuite/, there are several bench examples.
If this benchmark is contributed in the same style, this PR could be merged and we could finally choose which
implementation we switch to.
It would be interesting to test with lists of different sizes (like: 100, 200, 500, 1000)
to see if one implementation consistently dominate the others.

@UnixJunkie
Copy link
Member

closing because too old

@UnixJunkie UnixJunkie closed this Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants