Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a test case for custom preprocessor #334

Merged
merged 2 commits into from
Jun 24, 2024
Merged

Add a test case for custom preprocessor #334

merged 2 commits into from
Jun 24, 2024

Conversation

hhugo
Copy link
Collaborator

@hhugo hhugo commented May 30, 2024

This use case was found in camlp4 and is likely broken on windows. Let's try to fix.

This PR adds a special handling for A '-pp'; Quote q. The command q will be executed by cmd.exe in ocamlc/ocamldep/.. and needs to be escaped as such.

Based on #338 and #339

Should fix #89

@hhugo hhugo changed the title Add a est case for custom preprocessor Add a test case for custom preprocessor May 30, 2024
@gasche
Copy link
Member

gasche commented May 30, 2024

CI results are in: this works on Windows under OCaml 4.x, and fails on Windows under OCaml 5.x. So this may be an upstream-OCaml issue, but I haven't tried to investigate and will happily leave this to @hhugo.

@hhugo
Copy link
Collaborator Author

hhugo commented May 30, 2024

Yet I can't install camlp4 on my machine on ocaml 4.14.1. I need to investigate, The CI failure on OCaml 5 could be due to missing parches in the sunset repo..

@hhugo
Copy link
Collaborator Author

hhugo commented May 30, 2024

For the record, here is the error I get locally trying to install camlp4.

/config/Camlp4_config.ml
+ '''C:\Users\hugoh\AppData\Local\opam\default\bin/ocamlc.opt' -I +dynlink dynlink.cma -g -I camlp4/config -I camlp4/boot camlp4/config/Camlp4_import.cmo camlp4/config/Camlp4_config.cmo camlp4/boot/Camlp4.cmo camlp4/boot/camlp4boot.cmo -o camlp4/boot/camlp4boot.byte
+ '''C:\Users\hugoh\AppData\Local\opam\default\bin/ocamldep.opt' -pp ''\'''\''camlp4\\\\boot\\\\camlp4boot.byte -D OPT' -modules camlp4/Camlp4/Debug.mli > camlp4/Camlp4/Debug.mli.depends
+ bash --norc -c "'''C:\Users\hugoh\AppData\Local\opam\default\bin/ocamldep.opt' -pp ''\'''\''camlp4\\\\boot\\\\camlp4boot.byte -D OPT' -modules camlp4/Camlp4/Debug.mli > camlp4/Camlp4/Debug.mli.depends"
The system cannot find the path specified.
File "camlp4/Camlp4/Debug.mli", line 1:
Error: Error while running external preprocessor
Command line: ''camlp4\\boot\\camlp4boot.byte -D OPT "camlp4/Camlp4/Debug.mli" > C:\cygwin64\tmp\ocamlpp68ba5a

I found the following piece of code in camlp4 myocamlbuil that I don't really understand but I'd be nice to address it. Maybe @dra27 can help ?

    let hot_camlp4boot_dep, hot_camlp4boot_cmd =
      if use_external_camlp4boot then
        (None, "camlp4boot")
      else
        let exe =
          "camlp4boot" ^
          if C.ocamlnat then
            (* If we are using a native plugin, we might as well use a native
               preprocessor. *)
            ".native"
          else
            ".byte"
        in
        let dep = "camlp4"/"boot"/exe in
        let cmd =
          let ( / ) = Filename.concat in
          (*
           * Workaround ocamlbuild problems on Windows by double-escaping.
           * On systems using forward-slash, the calls to String.escaped will be
           * no-ops anyway and the code will continue to work even once ocamlbuild
           * correctly escapes output (the issue is trying to escape output for both cmd
           * and bash)
           *)
          String.escaped (String.escaped ("camlp4"/"boot"/exe))
        in
        (Some dep, cmd)
    in
    let hot_camlp4boot_cmd =
      if not windows then
        hot_camlp4boot_cmd
      else
        let buf = Buffer.create (String.length hot_camlp4boot_cmd + 5) in
        for i = 0 to String.length hot_camlp4boot_cmd - 1 do
          match hot_camlp4boot_cmd.[i] with
          | '/' ->
            Buffer.add_string buf "\\";
          | x -> Buffer.add_char buf x;
        done;
        Buffer.contents buf          
    in

@hhugo
Copy link
Collaborator Author

hhugo commented May 30, 2024

The patch against the compiler gives some information about what's going on.
https://gist.githubusercontent.com/fdopen/2def708d5eb7c8764c7b0804587a57dd/raw/2897d6a477dce97b88cdf8633355a99400d0e083/ocaml-4.14.0.patch

(* external commands are unfortunately called called with cmd.exe
 * (via Sys.command).
 * Cmd.exe has strange quoting rules. The most notorious quirk is, that
 * you can't use forward slashes as path separators at the first position,
 * unless you quote the expression explicitly.
 * cmd.exe will interpret the slash and everything thereafter as first
 * parameter. Eg. 'bin/foo -x' is treated like 'bin /foo -x'.
 * Because the most used build tools are unix-centric (ocamlbuild, gmake)
 * and are not aware of it, such errors are quite common, especially when
 * calling the preprocessor. ( ocamlc -pp 'subdir/exe' ... )
 *
 * Therefore, I replace every slash inside the first subexpression with a
 * backslash.
 * I think this replacement is safe. No OCaml developer will write 'bin/foo',
 * if he wants to call a executable 'bin' with parameter /foo.
 *)

@gasche
Copy link
Member

gasche commented May 30, 2024

This makes sense, the failing command in the testsuite does contain a forward slash:

''ocamldep.opt -pp '''./preprocessor.byte 2' -modules main.ml > main.ml.depends

On the other hand, the comment you quote mentions "unless you quote it explicitly", and it looks like an attempt at quoting was made here, there are in fact three quotes right after the -pp. I don't understand the quoting rules at play, it looks like ''foo.exe is acceptable in Windows-land, but I would expect -pp "''foo.exe" rather than `-pp '''foo.exe. Is the latter also correct? If you have a Windows machine, can you find out which quoting would work correctly here?

@hhugo
Copy link
Collaborator Author

hhugo commented Jun 14, 2024

This makes sense, the failing command in the testsuite does contain a forward slash:

''ocamldep.opt -pp '''./preprocessor.byte 2' -modules main.ml > main.ml.depends

On the other hand, the comment you quote mentions "unless you quote it explicitly", and it looks like an attempt at quoting was made here, there are in fact three quotes right after the -pp. I don't understand the quoting rules at play, it looks like ''foo.exe is acceptable in Windows-land, but I would expect -pp "''foo.exe" rather than `-pp '''foo.exe. Is the latter also correct? If you have a Windows machine, can you find out which quoting would work correctly here?

After digging a bit, I wish I never did !!!

  • We call bash -norc -c "ocamlc -pp 'preprocessor..'" which goes through cmd.exe \c
  • bash execute what's left from ocamlc -pp 'preprocessor..' with bash escaping/quoting
  • ocamlc execute what's left of preprocessor.. with cmd.exe

going the other way

  • we want to execute "./preprocessor.exe" args in cmd
  • we have to call Sys.command with ^"./preprocessor.exe^" args
  • from bash we can call ocamlc.opt -verbose -dsource -pp '^"./preprocessor.exe^" args' main.ml
  • finally, calling bash from cmd, we get bash --norc --verbose -c ^"ocamlc.opt -dsource -pp '^^^\^"./preprocessor.exe^^^\^" args' main.ml"

I still need to figure out some steps..

@hhugo hhugo force-pushed the camlp4-bug branch 5 times, most recently from 5382e0f to b0dc7f6 Compare June 20, 2024 15:51
@hhugo hhugo marked this pull request as ready for review June 20, 2024 15:56
@hhugo
Copy link
Collaborator Author

hhugo commented Jun 20, 2024

I believe this PR is now ready for review.

We're now checking in the CI that camlp4 can be installed with the current ocamlbuild and we're running the external.ml tests that depend on camlp4 and menhir.

@hhugo hhugo force-pushed the camlp4-bug branch 2 times, most recently from 7a29cc4 to 6f64501 Compare June 20, 2024 22:22
@hhugo hhugo mentioned this pull request Jun 20, 2024
@hhugo hhugo requested review from gasche and kit-ty-kate June 20, 2024 23:37
@hhugo hhugo force-pushed the camlp4-bug branch 2 times, most recently from f920726 to 4fa6a78 Compare June 23, 2024 07:57
@hhugo
Copy link
Collaborator Author

hhugo commented Jun 23, 2024

Rebased

src/command.ml Outdated
| Quote s -> put_space (); put_filename (aux encode s)
and do_specs = function
| [] -> ()
| (A "-pp" as pp) :: Quote q :: rest ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels very fishy to me to special-case -pp for special quoting rule. Why not apply this logic to all Quote arguments?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the definition of Quote ?
It's currently

| Quote of spec  (** A string that should be quoted like a
                      filename but isn't really one. *)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that Quote is in fact used for arguments that are themselves commands or prefixes of command, meant to be passed to the shell at some point. We could indeed change the (fairly unclear) documentation to reflect this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

~targets:("main.native", [])
~post_cmd:(
if Sys.win32
then "bash --norc -c ./main.native"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is odd that we have to do this here. Should the function that runs post_cmd be calling My_std.sys_command instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The testsuite doesn't currently have access to ocamlbuild the lib. Should I make the change ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with keeping this file separate from the ocamlbuild logic, as an external driver, but I would rather have the Windows-testing logic in ocamlbuild_test.ml rather than in each caller of the test function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

src/command.ml Outdated
aux (fun x ->
if ((!first && (first := false; String.contains x '/')) || not (Shell.is_simple_filename x))
then (My_std.quote_cmd ("\"" ^ x ^ "\""))
else x) q
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here is fairly complex and also not clearly correct: if q is to be interpreted as a command, then we need different quoting rules for the first argument of q, which is not necessarily an argument on which the encode function will be called. So encode will run on a follow-up argument and the logic does not match the intent.

An alternative proposal would be as follows: on Windows, ensure that is_simple_filename fails on filenames that contain a /. Does that suffice to prevent the issue that you are guarding against here? (I would be happy to keep your nice, descriptive comment that explains why we need to protect certain occurrences of /, which could move to Shell if we do it this way.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like moving this to Shell because it should only apply inside cmd.
I've change the logic to apply no only on the first encoded element.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that windows has more constraint on command line size.

src/command.ml Outdated Show resolved Hide resolved
@hhugo hhugo merged commit 42450af into master Jun 24, 2024
15 checks passed
@hhugo hhugo deleted the camlp4-bug branch June 24, 2024 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Escaping problems on windows
2 participants