parser confusions with braces inside ${ cmd; } form of command substitution #691

Open
stephane-chazelas opened this issue Oct 2, 2023 · 1 comment
Labels
1.1 Issue relevant to dev branch (v1.1.*) backburner Low priority (but feel free to fix it and do a PR) bug Something is not working

@stephane-chazelas

Using braces inside the ${ ...; } form of command substitution sometimes triggers spurious `{' unmatched syntax errors:

$ ksh -c 'echo ${ echo {a,b}; }'
ksh: syntax error at line 1: `{' unmatched
$ ksh -c 'echo ${ echo "{a,b}c; }'
ksh: syntax error at line 1: `"' unmatched
$ ksh -c 'echo ${ echo {fd[0]}< /dev/null; }'
ksh: syntax error at line 1: `{' unmatched

These are OK for some reason:

$ ksh -c 'echo ${ echo {acb}; }'
{acb}
$ ksh -c 'echo ${ echo {a b}; }'
{a b}
$ ksh -c 'echo ${ echo x{a,b} ;}'
xa xb
$ ksh -c 'echo ${ echo ${0}; }'
ksh
$ ksh -c 'echo ${ a={a,b}c; echo $a }'
ac bc

Quoting the cmdsubst doesn't help:

$ ksh -c 'echo "${ echo {a,b}; }"'
ksh: syntax error at line 1: `"' unmatched

Escaping/quoting the { avoids the error:

$ ksh -c 'echo ${ echo \{a,b}c; }'
ac bc
$ ksh -c 'echo "{"a,b}c'
{a,b}c
$ ksh -c 'echo ${ echo "{"a,b}c; }'
ac bc

I guess brace expansion is performed on the output of an unquoted command substitution, since with -o posix (which turns brace expansion off) the output is left as is:

$ ksh -o posix -c 'echo ${ echo \{a,b} ;}'
{a,b}
$ ksh --version
  version         sh (AT&T Research) 93u+m/1.0.4 2022-10-22

@McDutchie

This is the reason for bugs like these:

ksh/src/cmd/ksh93/sh/lex.c

Lines 1497 to 1523 in 4dacec2

/*
* read to end of command substitution
* of the form $(...) or ${ ...;}
* or arithmetic expansion $((...))
*
* Ugly hack alert: At parse time, command substitutions and arithmetic expansions are read
* without parsing, using lexical analysis only. This is only to determine their length, so
* that their literal source text can be stored in the parse tree. They are then actually
* parsed at runtime (!) each time they are executed (!) via comsubst() in macro.c.
*
* This approach is okay for arithmetic expansions, but for command substitutions it is an
* unreliable hack. The lexer does not have real shell grammar knowledge; that's what the
* parser is for. However, a clean separation between lexical analysis and parsing is not
* possible, because the design of the shell language is fundamentally messy. So we need the
* parser to set the some flags in the lexer at the appropriate times to avoid spurious
* syntax errors (these are the non-private Lex_t struct members). But the parser obviously
* cannot do this if we're not using it.
*
* The comsub() hack below, along with all the dolparen checks in the lexer, tries to work
* around this fundamental problem as best we can to make it work in all but corner cases.
* It sets the lexd.dolparen, lexd.dolparen_eqparen and lexd.dolparen_arithexp flags for the
* rest of the lexer code to execute lots of workarounds.
*
* TODO: to achieve correctness, actually parse command substitutions at parse time.
*/
static int comsub(Lex_t *lp, int endtok)
{

So far, my experiments in that direction have not been successful.
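
To make the general problem in that comment concrete, here is a toy sketch in C of a scanner that finds the end of a `$(...)` using lexical rules only. It is an illustration under stated assumptions, not ksh's actual comsub() code: the names `naive_comsub_end` and `naive_comsub_selfcheck` are invented here, and the scanner only knows about quotes, backslashes and parentheses. It gets a nested substitution right, but it is fooled by a `case` pattern, whose `)` has no matching `(`. Only a parser knows it is inside a `case`. The brace cases reported in this issue are a different corner of the same kind of problem and are not modelled by this sketch.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model only, NOT ksh's comsub(): find the end of a "$(...)" by
 * counting parentheses, with no knowledge of shell grammar.
 * `src` points just past the opening "$(".
 * Returns the index of the ')' the scan believes closes it, or -1.
 */
int naive_comsub_end(const char *src)
{
    int depth = 1;      /* already inside the opening "$(" */
    char quote = 0;     /* active quote character, or 0 if none */

    for (int i = 0; src[i] != '\0'; i++) {
        char c = src[i];

        if (quote) {                    /* inside quotes: wait for the closer */
            if (c == quote)
                quote = 0;
        } else if (c == '\\' && src[i + 1] != '\0') {
            i++;                        /* skip the escaped character */
        } else if (c == '\'' || c == '"') {
            quote = c;
        } else if (c == '(') {
            depth++;
        } else if (c == ')' && --depth == 0) {
            return i;
        }
    }
    return -1;
}

/* Hypothetical self-check: one input that works and one that does not. */
void naive_comsub_selfcheck(void)
{
    /* Nested substitution: parentheses balance, so the scan is right. */
    const char *ok = "echo $(date))";
    assert(naive_comsub_end(ok) == (int)strlen(ok) - 1);

    /* A case pattern ends in an unpaired ')': the scan stops too early. */
    const char *bad = "case x in x) echo hi;; esac)";
    assert(naive_comsub_end(bad) == 11);
    assert(naive_comsub_end(bad) < (int)strlen(bad) - 1);
}
```

Calling `naive_comsub_selfcheck()` from a `main` exercises both inputs. In the second one the scan stops at the `)` that ends the pattern `x`, so everything after it would be treated as lying outside the substitution. Patching the scanner with a special case for `case` would be one more lexer-side workaround of the sort the comment describes, whereas running the real parser over the substitution would not need it.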

@McDutchie McDutchie added bug Something is not working backburner Low priority (but feel free to fix it and do a PR) 1.1 Issue relevant to dev branch (v1.1.*) labels Dec 28, 2023