Fix infinite loop on arithmetic for syntzx error on interactive

Marc Wilson (@posguy99) reports: > A completely errant paste that wasn't meant for the terminal > window exposed this... > > Reproducer: > > $ for ((i = 5 , i = 0, --i)) > -ksh: syntax error: `{' unmatched > -ksh: syntax error: `{' unmatched > -ksh: syntax error: `{' unmatched > -ksh: syntax error: `{' unmatched > -ksh: syntax error: `{' unmatched > > Continues forever until you kill the shell. Simplest reproducer is 'for(())', with identical symptoms. In scripts, the symptom is less severe, but the error message is still incorrect as above. For example: $ ksh -c 'for(())' ksh: syntax error at line 1: `{' unmatched The expected error is: ksh: syntax error at line 1: `))' unexpected ksh93u+ and ksh2020 are also broken, just not in the same way, and harder to reproduce. The current infinite-loop manifestation of the bug was introduced in commit f8f2c4b. Analysis: what happens is this: 1. arithfor() in parse.c is called to parse the arithmetic for loop 2. arithfor() saves current input stream and opens a new input stream to separate each of the three expressions in (( ... )) 3. on line 758, arithfor() calls sh_lexskip() to find the next ';' 4. in lex.c line 1749, sh_lexskip() calls the main lexer function, sh_lex() 5. in lex.c line 375, sh_lex() calls sh_syntax() to throw the syntax error in the reproducer 6. sh_syntax() calls errormsg() from libast, which causes a longjmp straight back to the prompt, without giving arithfor() a chance to restore the input stream that it saved in step 2. This is the bug. We can get all sorts of unpredictable/undefined behaviour that way. I can think of two strategies to fix this: 1. Push context and sigsetjmp in arithfor() before calling sh_lexskip(), so that it longjmps back to arithfor() and it can restore the input stream before doing its own longjmp. This may carry a minor performance hit for nested arithmetic for loops. 2. Since arithfor() throws a syntax error itself when needed anyway (and does so after restoring the input stream), we could also find a way for sh_lex() not to throw that syntax error when called from sh_lexskip. Strategy 2 might be the best way. It just so happens that sh_lexskip sets its own flag, lp->lexd.noarg, that we can use in sh_lex() to tell if it was called from sh_lexskip(). So we could use that to prevent it from throwing the syntax error and just return instead, handing control back to arithfor(), so that it can throw the syntax error after restoring state. src/cmd/ksh93/include/shlex.h, src/cmd/ksh93/sh/lex.c: - Rename lexd.noarg flag to lexd.inlexskip. Since it will now be multi-purpose, this is clearer to the code reader. It simply means that the current sh_lex() invocation was called from sh_lexwskip(). - When reading end-of-file, do not throw a symtax error from sh_lex() if lexd.inlexskip is set, letting the caller handle it instead. This fixes the bug. Resolves: #779
ksh93 · Aug 26, 2024 · 1a30a27 · 1a30a27
1 parent b75df92
commit 1a30a27
Show file tree

Hide file tree

Showing 4 changed files with 22 additions and 7 deletions.
diff --git a/NEWS b/NEWS
@@ -4,6 +4,9 @@ Uppercase BUG_* IDs are shell bug IDs as used by the Modernish shell library.
 
 2024-08-26:
 
+- Fixed an infinite loop that occurred when a syntax error was thrown
+  for an arithmetic 'for' loop on the interactive command line.
+
 - [v1.1] Backported a change to '_' in types from the unreleased ksh 93v-
   beta. If a 'typeset -T' type declaration sets a discipline function for
   one of its subvariables (such as var.get() or var.getn()), then, within

diff --git a/src/cmd/ksh93/include/shlex.h b/src/cmd/ksh93/include/shlex.h
@@ -52,7 +52,7 @@ struct _shlex_pvt_lexdata_
 	char		docword;
 	char		nested_tilde;
 	char 		*docend;
-	char		noarg;
+	char		inlexskip;	/* set when sh_lex() is called from sh_lexskip() */
 	char		warn;
 	char		message;
 	char		arith;

diff --git a/src/cmd/ksh93/sh/lex.c b/src/cmd/ksh93/sh/lex.c
@@ -150,13 +150,13 @@ static void lex_advance(Sfio_t *iop, const char *buff, int size, void *context)
 	{
 		size -= (lp->lexd.first-(char*)buff);
 		buff = lp->lexd.first;
-		if(!lp->lexd.noarg)
+		if(!lp->lexd.inlexskip)
 			lp->arg = stkseek(sh.stk,ARGVAL);
 #if SHOPT_KIA
 		lp->lexd.kiaoff += ARGVAL;
 #endif /* SHOPT_KIA */
 	}
-	if(size>0 && (lp->arg||lp->lexd.noarg))
+	if(size>0 && (lp->arg||lp->lexd.inlexskip))
 	{
 		sfwrite(sh.stk,buff,size);
 		lp->lexd.first = 0;
@@ -344,7 +344,7 @@ int sh_lex(Lex_t* lp)
 				/* end-of-file */
 				if(mode==ST_BEGIN)
 					return lp->token=EOFSYM;
-				if(mode >ST_NORM && lp->lexd.level>0)
+				if(mode >ST_NORM && lp->lexd.level>0 && !lp->lexd.inlexskip)
 				{
 					switch(c=endchar(lp))
 					{
@@ -706,7 +706,7 @@ int sh_lex(Lex_t* lp)
 				mode = ST_NORM;
 				continue;
 			case S_LIT:
-				if(oldmode(lp)==ST_NONE && !lp->lexd.noarg)	/* in ((...)) */
+				if(oldmode(lp)==ST_NONE && !lp->lexd.inlexskip)	/* in ((...)) */
 				{
 					if((c=fcpeek(0))==LPAREN || c==RPAREN || c=='$' || c==LBRACE || c==RBRACE || c=='[' || c==']')
 					{
@@ -1741,13 +1741,13 @@ void sh_lexskip(Lex_t *lp,int close, int copy, int  state)
 	char	*cp;
 	lp->lexd.nest = close;
 	lp->lexd.lex_state = state;
-	lp->lexd.noarg = 1;
+	lp->lexd.inlexskip = 1;
 	if(copy)
 		fcnotify(lex_advance,lp);
 	else
 		lp->lexd.nocopy++;
 	sh_lex(lp);
-	lp->lexd.noarg = 0;
+	lp->lexd.inlexskip = 0;
 	if(copy)
 	{
 		fcnotify(0,lp);

diff --git a/src/cmd/ksh93/tests/loop.sh b/src/cmd/ksh93/tests/loop.sh
@@ -69,5 +69,17 @@ got=$(for i in a b c; do print -n $i; for j in 1 2 3; do print -n $j; for k in x
 exp=a1xb1xc1x
 [[ $got == "$exp" ]] || err_exit "'continue 3' broken (expected '$exp', got '$got')"
 
+# ======
+# arithmetic for
+
+got=$(eval 'for((i=0,i<10,i++)); do :; done' 2>&1)
+exp=': syntax error at line 1: `))'\'' unexpected'
+[[ $got == *"$exp" ]] || err_exit "arithmetic for syntax error" \
+	"(expected match of *$(printf %q "$exp"), got $(printf %q "$got"))"
+got=$(eval 'for(())' 2>&1)
+exp=': syntax error at line 1: `))'\'' unexpected'
+[[ $got == *"$exp" ]] || err_exit "arithmetic for syntax error" \
+	"(expected match of *$(printf %q "$exp"), got $(printf %q "$got"))"
+
 # ======
 exit $((Errors<125?Errors:125))