Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script passed as pipe, fix for locks, support for python #153

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

fabio-brugnara
Copy link

This PR proposes a solution (at least in my experience), for a couple of questions appearing among the Issues.

  1. Remove the limitations of command line length limits and exposability by using pipes instead of command line to pass the script to the interpreter. A -P option is added to enable this.
  2. Remove occasional locks introduced by a race condition in the untraceable() function: the possibility that the CONT signal arrives before the process has actually entered the stopped state, being therefore discarded.
  3. Add support for python script.

A drawback of the "pipe" feature, is the fact that the script may lose the possibility to retrieve the original name by accessing $0. Often this can be worked around referring to the $_ environment variable, but not always. In exchange for this annoyance, the limit on the script size and the possibility of seeing the source in clear with ps or /proc/*/cmdline both disappear.

These modifications have been applied to the version found in Francisco Rosales' page and have been working reliably for years under linux.

src/shc.c Show resolved Hide resolved
src/shc.c Outdated
@@ -149,6 +150,9 @@ static int MMAP2_flag = 0;
static const char BUSYBOXON_line[] =
"#define BUSYBOXON %d /* Define as 1 to enable work with busybox */\n";
static int BUSYBOXON_flag = 0;
static const char PIPESCRIPT_line[] =
"#define PIPESCRIPT %d /* Define as 1 to submit script as a pipe */\n";
static int PIPESCRIPT_flag;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please initialize to 0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but static variables are initialized to 0 anyway

Copy link
Author

@fabio-brugnara fabio-brugnara Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do, but if you feel it is necessary to modify something, you should have push access to my fork.
However, if you prefer to ask me to operate on it, no problem at all.

Copy link
Contributor

@suntong suntong Aug 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, good job @fabio-brugnara, the PR fix the most important limitations of command line length limits. I'm chipping in to say that it's common practice for the PR raiser to fix things that the author reviewed, but I'm sure you know that as well. I know you're enjoying the fix yourself, but hope your contribution can serve the world as well. (I'm the DM of the shc Debian package, thanks)

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @suntong, actually my intention was to follow author's indication and I immediately pushed a commit addressing the issue as requested, even if I felt it was not necessary. Maybe this is not the correct way to proceed? Is there something I overlooked?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fabio-brugnara. I saw "New changes since you last viewed". Hope @neurobin can be satisfied to take the PR.

Hi @ashamedbit, would you take a look at this as well please?

@mdeweerd
Copy link

I've integrated this in #168 as well.

What is missing:

  • Actual test case for the Pipe option;
  • Actual test case for python (test.py is not used in the tests).

@fabio-brugnara
Copy link
Author

Hi @mdeweerd , thanks for your work. I pushed a commit modifying test/ttest.sh in order to address your comments, including both test cases. As for the source files in the test dir, as far as I see none of them is actually used, the test script generates and compiles a simple test program on the fly.

@mdeweerd
Copy link

mdeweerd commented Aug 12, 2024

I've been working with it since a while now.
The test result was not meaningful because the errors were not really checked (was also true before).

Now I've updated the flow and the script so that more errors are flagged and an artifact is available for analysis.

Check this out : https://github.com/mdeweerd/shc/actions/runs/10357725849 . The artefact can be download from summary.

I'll let you have a look - I now now that it is in the "pipe" code - a multiplication exceeds the size of int (which is 32bit here even on a 64bit platform).

Maybe it's easier that you start from my branch and make a PR to my repository (and enable actions in your fork).

I also have some failures on WsL with csh, but I do think that is not related to this PR:

====================================================
=== /bin/csh [with shc -v]: FAILED
====================================================
  Files kept in '/tmp/shc.vbR.tst'
  *** Output:
/bin/csh: Event not found.
Hello World fp:first sp:second
 *** End of output

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 12, 2024

Hi @mdeweerd , I've seen the error report, very clear, thank you. The integer overflow is irrelevant: the multiplication is only used to generate pseudo-random numbers in order to create a unique file name for a fifo, it does not matter if the result "wraps around" the 32 bit limit.

@mdeweerd
Copy link

I am adding an annotation to skip the check there.

@fabio-brugnara
Copy link
Author

Thank you @mdeweerd

@mdeweerd
Copy link

I had the annotation, but it's more lumpy than making i locally an unsigned.

However, now the output is not the expected output: https://github.com/mdeweerd/shc/actions/runs/10359338552/job/28675539171 .

@fabio-brugnara
Copy link
Author

Hi @mdeweerd , i cannot be local to that block, because there is a goto a the end, that goes outside the block and uses i, that was declared above at line 659. This is why it stops working with that declaration. In any case, making it unsigned does not help against the overflow, so you can leave it as an int.

@mdeweerd
Copy link

Hi @mdeweerd , i cannot be local to that block, because there is a goto a the end, that goes outside the block and uses i, that was declared above at line 659. This is why it stops working with that declaration.
Ok, I supposed that one the value was used in the filename, i was no longer needed.

In any case, making it unsigned does not help against the overflow, so you can leave it as an int.
The sanitizer only warns for unsigned to signed overflows, so making it unsigned stops this detection?

@fabio-brugnara
Copy link
Author

Ah, OK. So you could simply use an unsigned local variable just for that loop. Given the minimal amount of change, please allow me to propose the change right here without going through the fork and the PR:

        "               unsigned r;",                                                     
        "               for(r=getpid(); l--; ) {",                                        
        "                       r = (r*1436856257)%1436856259;",                          
        "                       sprintf(tnm, \"/tmp/%08x\", r);",                         
        "                       if(!mkfifo(tnm, S_IWUSR|S_IRUSR)) break;",                
        "               }",                                                               

@mdeweerd
Copy link

You can update your PR to have it up-to-date.
I'ld suggest to keep the same name for the variable and add the braces around break;.

I do not know how this is going to go - ideally #168 would get merged into the main branch and possibly automatically close several PRs.
Once you add your update here, I'll merge it in my integration branch (should be almost a "null merge", but helps to close the PR automatically).

@fabio-brugnara
Copy link
Author

Hi @mdeweerd , I updated my PR hopefully according to your suggestions.

@suntong
Copy link
Contributor

suntong commented Aug 15, 2024

Hi @fabio-brugnara, just realized something --

How shc will handle $0 after using pipes instead of command line to pass the script to the interpreter? I.e., what'd the output be for the following?

echo "command line: $0"

Hopefully not /proc/self/fd/nn, as I'm using $0 to located related resources for all my scripts all the time.

I know it has been an issue for all tools using pipes, like chenzhch/shellc#10, which is unacceptable / unusable for my cases.

@mdeweerd
Copy link

mdeweerd commented Aug 15, 2024

@suntong

$ cat test.cmd
#!/bin/bash
echo "command line: $0"
$ ./shc -P -f test.cmd
$ ./test.cmd.x
command line: /tmp/1a09ab52
$ ./shc -f test.cmd
$ ./test.cmd.x
command line: ./test.cmd.x

The result without -P is preferable.
However, it's not blocking to add the function but would need to be documented.

And hopefully there is a solution for it.

@suntong
Copy link
Contributor

suntong commented Aug 15, 2024

Thanks for the report, mdeweerd.

It's fixable, please see:

https://github.com/liberize/ssc/blob/24e131be8cfc1799ca654440ad2e8c960962f69d/src/main.cpp#L230-L236

i.e., from it readme:

  • $0 / $ARGV[0] / sys.argv[0] is replaced by /dev/fd/xxx. Try -0 flag or use $SSC_ARGV0 instead.

thx

@mdeweerd
Copy link

The non-piping version can "fake" $0, the "-D" option shows the script name is also provided as an argument in the final call (when not using -P) so I guess that is used somehow.

The following would only help for shell scripts, so it's not a generic solution, but executing it shows the filename of the script, showing that sourcing the 'pipe' does not change $0.

#!/bin/bash
rm ./tmpfifo >& /dev/null
mkfifo ./tmpfifo
echo 'echo "command line: $0"' > ./tmpfifo &
. ./tmpfifo

@mdeweerd
Copy link

It's fixable, please see:

I think that is wat's done when the -P option is not used as far as the debug output goes, with with the pipe option some argv settings are skipped.

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 15, 2024 via email

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 15, 2024

Hi @suntong, @mdeweerd . I just pushed a commit (one line change) that allow to retrieve the original name through SHC_ARGV0.
You can test it with:

#!/bin/bash
me=$_
echo Hello from $me, aka $SHC_ARGV0, but really $0
unset SHC_ARGV0 #don't want to leave it in the environment

@mdeweerd
Copy link

mdeweerd commented Aug 16, 2024

Thanks for that @fabio-brugnara .
I think there is a solution though to keep $0 for most shells alon the lines suggested in siberize/scc but already implemented in shc in fact.

I tested bash and csh. csh did not yield the desired result, but that's already documented in shc's code.

        { "csh", "-c", "-b", "exec '%s' $argv" },  /* AIX: No file for $0 */

The following sample launches bash from a csh, but the core solution is the command to bash and the fact that the first argument is what you want to see in $0.

Therefore, a trick for most shells is to launch them with a -c option which argument is source PIPE or . PIPE and then the original script name.
Maybe shellsDB needs an extra colomn to indicate the command to inject in the script for each shell (especially for python).
Possibly the original code could be reused a bit more as it already injects -c and only its argument would need to be updated (along with the fifo approach).

#!/bin/csh
rm ./tmpfifo >& /dev/null
mkfifo ./tmpfifo
echo 'echo "command line: $0"' > ./tmpfifo &
echo "START SHELL"
bash -c "source ./tmpfifo" hello
echo "END SHELL"
rm ./tmpfifo >& /dev/null

The output from the bash script is command line: hello demonstrating that $0 has been forged.

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 16, 2024

HI @mdeweerd , I created a branch argv0 in my fork of the repo. There you can find shc.c that implements the the . PIPE filename args trick, and adds a column pfmt to the shellsDB, as you suggested, ideally for describing how to obtain this behavior for other interpreters beside *sh. I don't know if it can be achieved this way for those. So far it works only for *sh, breaks things e.g. for python, rc. ssc modifies the input in order to fix argv0 for non-sh interpreters.
Just to let you have a look, maybe you find something interesting.

@suntong
Copy link
Contributor

suntong commented Aug 16, 2024

Are you compiling with -P option? Here it doesn't seem so.

I tested the current shc behavior,

I was testing with the current shc, v4.0.3-1, and I realized that my approach

still depends on $0 being provided correctly.

@mdeweerd
Copy link

@fabio-brugnara I'll have a look into your fork and check if I can find an approch for python.

@mdeweerd
Copy link

This seems an acceptable approach for python:

cat > the_script.py <<EOF
import sys;
print("# Python's ARGV[0] is " + sys.argv[0] + "\n")
EOF
python -c 'import sys; sys.argv[0] = "forged_script_name.py"; exec(open("the_script.py").read())'
# Python's ARGV[0] is forged_script_name.py

@fabio-brugnara
Copy link
Author

Hi @mdeweerd , thank you for the idea. I'll see if I find time today to put it inside the code.

@mdeweerd
Copy link

FYI, I've updated ttest.sh to also verify the '$0' value, and implemented more special cases (such as not verifying it for csh).

I noticed that while perl is supported in the script, it's also not showing the expected arguments.

Result of a test run:
https://github.com/mdeweerd/shc/actions/runs/10431612595/job/28891457265

Test script:
https://github.com/mdeweerd/shc/blob/argv0_test/test/ttest.sh

@mdeweerd
Copy link

For certen exec versions, the -a option also exists. Not that we have to use it, but for information:

( exec -a forgedname /bin/bash -c '/bin/printf $0' )

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 17, 2024

Hi @mdeweerd , please have a look at the last commit on branch argv0.

  • your proposal for python (with minimum change) has been implemented and is working. In fact, it works only with the -P option. Python treats the order of parameters differently. -P becomes therefore forced for python. I think python was never used without -P, and the same holds for perl, that was actually not mentioned.
  • the fix of argv parameter has been put under option -0, and disabled by default, but we reverse the behavior if you prefer
  • there is still the possibility to rely on SHC_ARGV0 for something that works evreywhere, with the caveat of doing
    argv0=${SHC_ARGV0:-$0} SHC_ARGV0= ot the analogous in python.

Hope it is useful. In this week I had some spare time to work on this, in the near future I don't think I will be able to do the same.
I added you as a collaborator to my fork of the repo.

@mdeweerd
Copy link

@fabio-brugnara Thanks. I've added the ci flow to argv0 (which can be removed if needed later) where I disable the run with the sanitizer (as those issues still remain in this branch).

I think I had better results on an earlier run (locally).

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 17, 2024

Hi @mdeweerd , whit my last commit, I added support for perl (as with python, -P option is forced, there is nothing to lose).
As for the sanitizer, the current version, also that in argv0, has the fix to unisgned that should not trigger the sanitizer.
Unless there is another one.

@mdeweerd
Copy link

mdeweerd commented Aug 17, 2024

@fabio-brugnara Thank you for these updates.
Did you notice that ci is blocking for csh with the -P option (after the -S test that shows PASSED)? This also occurs on my WSL/bookworm environment.

When running 'a.out' for that test, I get:

$ ./a.out
Ambiguous output redirect.

I also have a multitule of 'a.out' processes in my process list.

@mdeweerd
Copy link

By replacing the '-c' option with '-xvc', I found that the binary file is sourced with tcsh with '-P' active. I suspect that this is the same for other "shells":

====================================================
=== /usr/bin/tcsh [with shc -P]: FAILED
====================================================
  Files kept in '/tmp/shc.tcsh-P.DbE.tst'
*** Expected Output:
/usr/bin/tcsh: Hello World sn:/tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P fp:first sp:second
*** Output:
exec '/tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P' $argv
exec /tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P /tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P first second
/tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P /tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P first second
source '/tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P'
source /tmp/shc.tcsh-P.DbE.tst/a.out.tcsh-P
^P^A^E^P^P^P�^P�^P^P^A^D000�^B�^B^P^A^F�=�M�M�^E8^G^P^B^F�=�M�M�^A�^A^H^D^D8^C8^C8^C ^H^D^DX^CX^CX^CDD^DS�td^D8^C8^C8^C ^HP�td^Dx0x0x0ll^DQ�td^F^PR�td^D�=�M�M0^B0^B^A/lib64/ld-linux-x86-64.so.2^D^P^EGNU^B��^D^A^D^T^CGNU�6L�^N"|^Q����'^S��<^^�&^D^P^AGNU^C^B^C$^A^F^P^A��@^B$&�ݣk�e�mĹ�@9�^\�^RU^R�^R^]^RN^A C^R
Illegal variable name.
*** End of output

@mdeweerd
Copy link

I extended the test to verify the behavior with arguments containing spaces and quotes.
I fixed tcsh and csh for that.

There is still the sourcing of the binary to fix with -P.

@fabio-brugnara
Copy link
Author

Hi @mdeweerd , I fixed sourcing of binaries for csh/tcsh. I still see problems with rc and zsh. This happen because they set $0 also inside sourced scripts, so the trick does not work

@mdeweerd
Copy link

@fabio-brugnara That's great - I managed to forge $0 for zsh with the pipe option as well.
I update the test so that it passes for that rc case as well..

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 18, 2024

Hi @mdeweerd , thank you. So you think the state of the software is acceptable now? (btw, I fixed a minor oversight in ttest.sh were the shells list was left with zsh only.) If we are finished, is it the case to merge the argv0 branch into master, or will you simply pull things from where they are now.?

@mdeweerd
Copy link

@fabio-brugnara @suntong

It's been an nice cooperation on this 😄 .

Thanks for removing the line to limit to zsh which was the focus of my tests.

There are just two things that remain IMHO:

  • As far as I understood, the new -0 option is required to forge $0 - I activated it in the test case. I believe it was suggested, I think the meaning should be inverted. My expectation would be that by default $0 is the same as running the script normaly;
  • I think there's also some documentation to add.

I'll merge the updates in my integration branch #168 .

While it's vacation time, I think @neurobin has not been responding on this project for about a year, so we need to consider having a new "official" fork.

@suntong
Copy link
Contributor

suntong commented Aug 18, 2024

Yes, nice cooperation AND great achievement indeed!!! I checked the following first thing in the morning then come back here to congratulate you two.

https://github.com/mdeweerd/shc/actions/runs/10440900647/job/28911372960

I think the meaning should be inverted. My expectation would be that by default $0 is the same as running the script normaly;

Hmm... yes I agree with you theoretically, but just to not confuse people coming from ssc, let's use another flag instead, what was you choice before @fabio-brugnara, before you switched to -0? BTW @fabio-brugnara, if you think the feature should be opt-in by choice, I'm fine with it as well, in which case we can keep -0.

While it's vacation time, I think @neurobin has not been responding on this project for about a year, so we need to consider having a new "official" fork.

TOTALLY agree.

Let's give a few days for @neurobin / @ashamedbit to review, and see what they're going to say...

I was hoping that either @neurobin or @ashamedbit would participate in the work of the new shc release, but since neither of them responded, and @neurobin has not been responding on this project for about a year, I too think it is about time.

@mdeweerd, I know that you has expressed that you don't want to be the maintainer of shc,
but since @fabio-brugnara has mentioned that he might not be have such free time in the (near) future, and since you're the one who has done the most heavy lifting besides this feature, I think you're the natural candidate for it, at least temporarily. (Personally I've forked and maintained a Go project before, then the author came back in about a year and I passed the maintainer back to him).

We can also create a shc GH organization, to share the maintenance responsibilities, but just an ideal, totally up to you two to decide.

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 18, 2024

Hi @mdeweerd , @suntong I'm also grateful to you for this nice and effective collaboration.

As for the option, I made it opt-out through -p , to avoid confusion with ssc, as @suntong rightly notes.
That is, when -P is selected, forging of argv0, as we have added in this work, is active, unless disabled with -p.

I've also forced -P for *csh, because of that error of Event non found, coming from the fact that csh unconditionally tries to apply history substitution also to the hashbang when contained in the command passed with -c, and I couldn't find a way to disable this behavior.

I also think that there is little chance that this PR will be taken into consideration given the lack of reaction of the current maintainer.

As for the candidate for a new mantainer, I agree in seeing @mdeweerd a perfect match for what I've seen, but of course the role has to accepted willingly.
As for me, I have very little spare time in general, it was a coincidence that this issue emerged this week and I could respond. Beside, this is my first experience on a PR to a github project. So I don't feel adequate for a mantainer role. I'm of course available to cooperate on specific topics.

@mdeweerd
Copy link

Ok, I'll manage the (interrim) repo to merge requests (and interact for those), keep the code up-to-date, but not as such to fix/solve/respond to issues.

I am busy myself, but a long time ago somebody told me that this was the reason: busy people can always do more he said (implying that the others don't and because that's there's a skill to know how to "be busy").

I checked the GH Organisation option, shc is already taken. Maybe another option?

The evolution for '-p' is great, but this leaves the '-0' option open. We can rename it, but I feel that the feature should be 'on' by default.

I'll merge the latest changes once we settled on the -0 option (I need to go to a few steps to merge: apply the code formatting on a separate branch, then merge that branch).

@fabio-brugnara and myself both lifted some weight - there is a net improvement to shc with the changes made whilst working on this PR (the testcases prove it).

@fabio-brugnara
Copy link
Author

Hi @mdeweerd , thank you for your availability to take care of the repo. I'll try to be of support, even without an "official" role. Consider that my involvement in shc was related only to this pipe thing, mainly motivated by the reason that exposing the code through /proc/*/cmdline seemed too risky to me. (Well, and the other question of the occasional lockups that I met).

As for the -p option, maybe there is a misunderstanding. The -0 option is not needed anymore, you can see that it no longer appears in ttest.sh. The "new" feature of argv forging is 'on' by default once you set -P. You have the option to add -p to disable it. If everything goes well, the latter may never be used. It should be used only in rare cases.

Now that I think of it, we could make the two options alternatives:

  • -P for pipe with argv forging
  • -p for pipe without argv forging

Does this make more sense in your opinion (and @suntong's)?

@mdeweerd
Copy link

@fabio-brugnara Thanks for trying to help out.
Personnally, I checked out the project which did not really meet my needs eventually or I did not have a use for it in the end (I forgot what exactly), and I've kind of a habit to try to improve the flow/documentation of projects that I put some basic effort in.

Regarding the options, this looks better to me:

-P for pipe with argv forging
-p for pipe without argv forging

@suntong
Copy link
Contributor

suntong commented Aug 18, 2024

Regarding the options, this looks better to me:

OK, then, since you two agree with each other, and yes, I agree, it's better than -p amending -P.

I checked the GH Organisation option, shc is already taken.

Oh, yeah, what a shame, someone took it but abandoned it. Anyway,

So, maybe we make a little bit change, like go project tends to be named as go-xx, i.e., we can call the org c-shc, but I actually don't like that. Maybe theshc, the-shc, shc-proj, proj-shc ..? Worse case, we spell it out --

shell-script-compiler

:-) :-)

@mdeweerd
Copy link

Well, the tool is historically SHC, not well defined, but supposedly ShCompiler.
It's not compiling the shell, it's rather a ShebangConcealer/ShebangCloaker/... .

I think ShcTeam, OpenSHC, NextSHC are more appropriate choices.

@suntong
Copy link
Contributor

suntong commented Aug 19, 2024

LGTM, all good, go ahead with your top choice first then, :)

@fabio-brugnara
Copy link
Author

fabio-brugnara commented Aug 19, 2024

Hi @mdeweerd , I pushed a commit that gives the meaning we agreed upon to the options -P, -p.
Besides, it adds the assignment of __file__ variable in the python preamble, as I saw that the lack if it could break scripts.

@suntong
Copy link
Contributor

suntong commented Aug 19, 2024

all good, go ahead with your top choice first then

Or, we can wait and see what the outcome would be, from
https://github.com/orgs/community/discussions/136303

cheers

@mdeweerd
Copy link

mdeweerd commented Aug 19, 2024

@fabio-brugnara Your changes have been integreted.
@suntong Yes, let's wait for the outcome. I suppose that you meant 'winning' not 'wining' ;-).

I've also added tests for the new option (-p) and several older ones updating the script to cope with exceptions (incompatibility/forgibility) by generalising the test script a bit further.

And a "milestone" has been reached - 100 commits ahead of master in the integration branch.

@suntong
Copy link
Contributor

suntong commented Aug 19, 2024

Oh, it was promptly declined. Well, at least I tried, and asked very nicely and convincingly. No regret anymore. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants