Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid calling external processes through the system shell #10393

Closed

Conversation

justin-espedal
Copy link
Contributor

This change allows Haxe to be used even if the system shell is unavailable.

Certain ocaml functions which call external processes do so through the system
shell. On systems where the shell is unavailable, this makes Haxe unusable.

These problematic functions are Unix.system, the Unix.open_process_* family
of functions, and Sys.command.

Places where the shell was previously accessed:

  1. When using the -lib argument to the compiler.
  • Haxe.add_libs (calls haxelib path {libs})
  1. When using the -cmd argument to the compiler.
  • Haxe.run_command (command is normally passed to the shell)
  1. When generating code for certain targets.
  • Gencpp.write_build_options (calls haxelib path hxcpp)
  • Gencpp.generate_source (calls haxelib run hxcpp ...)
  • Gencs.generate (calls haxelib run hxcs ...)
  • Genhl.generate (calls haxelib run hashlink ...)
  • Genjava.generate (calls haxelib run hxjava ...)
  • Genneko.generate (calls nekoc ...)
  1. By hl interpreter and eval, to check the system name.
  • Hlinterp.load_native, for sys_string (calls uname on unix)
  • EvalStdLib.StdSys for system name (calls uname on unix)

Out of all of these, most don't need to be interpreted by the shell. All the
calls to haxelib, nekoc, and uname are known in advance, and those
processes can be started directly instead. The only one that actually needs to
be interpreted by the shell is (2), using -cmd.

As a replacement for the Unix.open_process_* family of functions, the Unix
module provides Unix.open_process_args_* as well. This set of functions does
not pass through the shell, but uses fork/exec/posix_spawn on unix and
CreateProcess on Windows. Arguments are passed as an array of strings, and
given that the shell is not involved, the strings should not be escaped like
they would be if passed to the shell. The first argument in the array should
be the name of the program.

Unfortunately, Unix.open_process_args_* is only available beginning with
ocaml 4.08, so a backport that wraps around Unix.create_process_env has
been added in the new process_helper.ml file.

For the purpose of finding the full path, a simple path-searching function has also
been added in process_helper.ml. This path searcher does not function exactly as
the built-in path-searching capabilities of system shells, but it should work similarly
to the built-in path searching functionality of the functions as of ocaml 4.12.

Testing that these changes work is pretty simple on Windows. Open a command prompt first, then in the registry add a DWORD with the value 1 at Computer\HKEY_CURRENT_USER\SOFTWARE\Policies\Microsoft\Windows\System\DisableCMD. If a new command prompt is opened now, it will warn that you don't have access, but the existing command prompt will continue to function. In the existing command prompt, call haxe -lib myhaxelib and confirm that the output is something expected like "myhaxelib isn't installed."

This change allows Haxe to be used even if the system shell is unavailable.

Certain ocaml functions which call external processes do so through the system
shell. On systems where the shell is unavailable, this makes Haxe unusable.

These problematic functions are `Unix.system`, the `Unix.open_process_*` family
of functions, and `Sys.command`.

Places where the shell was previously accessed:

1. When using the `-lib` argument to the compiler.
- `Haxe.add_libs` (calls `haxelib path {libs}`)

2. When using the `-cmd` argument to the compiler.
- `Haxe.run_command` (command is normally passed to the shell)

3. When generating code for certain targets.
- `Gencpp.write_build_options` (calls `haxelib path hxcpp`)
- `Gencpp.generate_source` (calls `haxelib run hxcpp ...`)
- `Gencs.generate` (calls `haxelib run hxcs ...`)
- `Genhl.generate` (calls `haxelib run hashlink ...`)
- `Genjava.generate` (calls `haxelib run hxjava ...`)
- `Genneko.generate` (calls `nekoc ...`)

4. By hl interpreter and eval, to check the system name.
- `Hlinterp.load_native`, for `sys_string` (calls `uname` on unix)
- `EvalStdLib.StdSys` for system name (calls `uname` on unix)

Out of all of these, most don't need to be interpreted by the shell. All the
calls to `haxelib`, `nekoc`, and `uname` are known in advance, and those
processes can be started directly instead. The only one that actually needs to
be interpreted by the shell is (2), using `-cmd`.

As a replacement for the `Unix.open_process_*` family of functions, the Unix
module provides `Unix.open_process_args_*` as well. This set of functions does
not pass through the shell, but uses fork/exec/posix_spawn on unix and
CreateProcess on Windows. Arguments are passed as an array of strings, and
given that the shell is not involved, the strings should not be escaped like
they would be if passed to the shell. The first argument in the array should
be the name of the program.

For the purpose of finding the full path, a simple path-searching function has
been added in the new process_helper.ml. This path searcher does not function
exactly as the built-in path-searching capabilities of system shells.

Unfortunately, `Unix.open_process_args_*` is only available beginning with
ocaml 4.08, so a backport that wraps around `Unix.create_process_env` has also
been added in process_helper.ml.
@justin-espedal
Copy link
Contributor Author

In order to use haxelib run to execute the run.n neko scripts of haxelibs, a change needs to be made on Haxelib's side as well. A PR for that change is here: HaxeFoundation/haxelib#523

@justin-espedal
Copy link
Contributor Author

I wonder whether this change would also solve the issue in #10297. Once the mac build is available, that would be good to test.

@Simn
Copy link
Member

Simn commented Oct 18, 2021

Could you please resolve the conflict? I'll review this afterwards.

@Simn Simn self-assigned this Oct 18, 2021
@Simn
Copy link
Member

Simn commented Oct 18, 2021

I'm getting some warnings here:

File "src/generators/hlinterp.ml", line 1541, characters 6-51:
1541 |                                          Process_helper.close_process_in_pid (ic, pid);
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 10: this expression should have type unit.
File "src/generators/gencpp.ml", line 8562, characters 33-36:
8562 |       if (common_ctx.debug) then cmd := cmd @ ["-Ddebug"];
                                        ^^^
Error: This expression has type string list
       but an expression was expected of type 'a ref
File "src/macro/eval/evalStdLib.ml", line 2687, characters 6-51:
2687 |                                          Process_helper.close_process_in_pid (ic, pid);
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 10: this expression should have type unit.
File "src/generators/hlinterp.ml", line 1541, characters 6-51:
1541 |                                          Process_helper.close_process_in_pid (ic, pid);
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 10: this expression should have type unit.
File "src/macro/eval/evalStdLib.ml", line 2687, characters 6-51:
2687 |                                          Process_helper.close_process_in_pid (ic, pid);
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Warning 10: this expression should have type unit.

The good news is that I can confirm that your change works with DisableCmd being set.

@justin-espedal
Copy link
Contributor Author

I don't have the ocaml environment set up that I used to make this change originally, so I'm using the CI to quickly see if it builds or not. On top of that, I have no previous experience with ocaml, so... please bear with me.

@Simn
Copy link
Member

Simn commented Oct 18, 2021

No worries, if you create just one function that does something meaningful with the OCaml process API then you're already more successful in that regard than I've ever been. :P

@justin-espedal
Copy link
Contributor Author

It looks like re-running the tests made the windows-test (js) error go away. Is it safe to just ignore that, then?

I'll need to set some time aside to get set back up with a Haxe dev environment on Linux to figure out what's going on with the others.

@Simn
Copy link
Member

Simn commented Oct 19, 2021

It looks like re-running the tests made the windows-test (js) error go away. Is it safe to just ignore that, then?

Yes these tests are sometimes are bit fickle...

@RealyUniqueName RealyUniqueName added this to the Backlog milestone Nov 29, 2021
Simn added a commit that referenced this pull request Mar 20, 2022
Simn added a commit that referenced this pull request Mar 30, 2022
Simn added a commit that referenced this pull request Mar 30, 2022
@Simn
Copy link
Member

Simn commented Mar 30, 2022

I've pulled most of your changes in the linked PR. Currently it still just folds run_command_args into run_command and thus also keeps the escaping. If we can sort out the problems related to haxelib, we can change that later.

Thank you very much for this contribution!

@Simn Simn closed this Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants