Division by negative numbers returns "division by zero" #770

leo-ard · 2024-07-23T19:41:49Z

Doing an arithmetic expansion with a division where the dividend is a negative number returns a "division by zero error" :

> ksh
$ echo $(( 10 / -10 ))
ksh:  10 / -10 : divide by zero
$ echo $(( 10 / -1 ))
ksh:  10 / -1 : divide by zero
$ echo $(( 10 / -2 ))
ksh:  10 / -2 : divide by zero

Tested with version sh (AT&T Research) 93u+ 2012-08-01 of ksh. Running on Macos M2 14.1.1 (23B81)

The text was updated successfully, but these errors were encountered:

McDutchie · 2024-07-23T22:02:39Z

Still reproducible in 93u+m, unfortunately. Thanks for the report.

Kind of amazing it took this long for someone to notice!

McDutchie · 2024-07-23T22:10:01Z

So, we forked from the last "stable" (ha!) AT&T release, 93u+ 2012-08-01, inheriting this unfortunate bug.

Testing my stash of compiled historic AT&T ksh versions shows that the bug was introduced in 93q 2005-01-31 and fixed in 93v- 2013-06-28.

So, with the help of the ast-open-archive repo, we should be able to isolate and backport the fix from that beta version.

McDutchie · 2024-07-23T22:56:23Z

This is likely the relevant RELEASE file entry from the 2013-06-28 commit:

13-06-04  A bug with the arithmetic expression $((2**63 / -1)) on some
         systems has been fixed.

And here is the 93v- 2013-06-28 change that fixes the bug, as a patch to 93u+m:

diff --git a/src/cmd/ksh93/sh/streval.c b/src/cmd/ksh93/sh/streval.c
index 79f75ee14..ac09e3d01 100644
--- a/src/cmd/ksh93/sh/streval.c
+++ b/src/cmd/ksh93/sh/streval.c
@@ -351,7 +351,7 @@ Sfdouble_t	arith_exec(Arith_t *ep)
 				num = sp[-1]/num;
 				type = 1;
 			}
-			else if((Sfulong_t)(num)==0)
+			else if((Sfulong_t)(num < 0 ? -num : num)==0)
 				arith_error(e_divzero,ep->expr,ep->emode);
 			else if(type==2 || tp[-1]==2)
 				num = U2F((Sfulong_t)(sp[-1]) / (Sfulong_t)(num));

So, the bug was caused by the fact that a floating-point variable with a negative value (num) was typecast to an unsigned integer type (Sfulong_t which is unsigned long long on most systems).

Converting a negative integer to an unsigned integer has implementation-defined results in C. However, as I am learning right now, casting a negative float to an unsigned integer is undefined behaviour according to the C standard. In practice, it behaves differently on Intel systems than it does on ARM systems (including Apple's M1/M2 processors). On ARM systems, the result of the undefined typecast is simply always 0, resulting in this bug.

The patch fixes it by making sure that the number is not negative before typecasting it.

Relevant reading (this article is about C++ but the undefined behaviour is the same in C):
https://embeddeduse.com/2013/08/25/casting-a-negative-float-to-an-unsigned-int/

leo-ard · 2024-07-24T01:32:09Z

Wow, what a sneaky bug. Thanks for sharing the debugging process ! It's crazy how little details in the C specification can cause bugs years later.

If you are interested, I found this bug working on pnut, a C to Shell compiler (yes you read this right). We had some division tests that were failing, flagging this bug in the underlying shell implementation. Among the shells we run on, Ksh is often the fastest one (and sometimes by far)

McDutchie · 2024-07-24T02:29:17Z

Workaround: make one of the numbers a floating point type by adding a .0 or ,0 (depending on the locale). Assuming LC_NUMERIC=C (or another locale that uses .), the following works without a patch:

$ echo $(( 10.0 / -10 ))                                                                                                           
-1
$ echo $(( 10 / -1.0 ))
-10
$ echo $(( 10 / -2.0 ))
-5

leo-ard · 2024-07-24T02:41:57Z

Oh nice trick. My workaround was to negate the divisor and the result when the divisor was negative :

#Lets say we want to divide x by y

if [ $y -lt 0 ] ; then 
    result=$((-(x / -y)))
else 
    result=$((x / y)) 
fi

McDutchie · 2024-07-24T03:49:03Z

I introduced a similar bug with the same undefined behaviour myself in 4592859 (see there for explanation). On ARM systems:

$ typeset -ui i
$ echo $((i = -5))
0
$ echo $i
4294967291

The assignment worked fine (wrapped around as expected for an unsigned integer), but the value of the assignment was 0 and should have been the same as the variable's new value.

So lines 272, 274 and 276 have undefined behaviour for a negative n:

ksh/src/cmd/ksh93/sh/arith.c

Lines 271 to 276 in 9926e2b

    
           else if((attr & NV_UINT64)==NV_UINT64)		/* long unsigned integer */ 
        
           	r = (uintmax_t)n; 
        
           else if((attr & NV_UINT16)==NV_UINT16)		/* short unsigned integer */ 
        
           	r = (uint16_t)n; 
        
           else if((attr & NV_UINT32)==NV_UINT32)		/* normal unsigned integer */ 
        
           	r = (uint32_t)n;

McDutchie · 2024-07-24T04:05:36Z

Patch to test for that other problem: to avoid undefined behaviour, invert the sign of a negative number before the typecast, then invert it again after. Maybe this is naive.

diff --git a/src/cmd/ksh93/sh/arith.c b/src/cmd/ksh93/sh/arith.c
index 01703e45f..6327b6865 100644
--- a/src/cmd/ksh93/sh/arith.c
+++ b/src/cmd/ksh93/sh/arith.c
@@ -269,11 +269,11 @@ static Sfdouble_t arith(const char **ptr, struct lval *lvalue, int type, Sfdoubl
 		else if((attr & NV_DOUBLE)==NV_DOUBLE)		/* normal float */
 			r = (double)n;
 		else if((attr & NV_UINT64)==NV_UINT64)		/* long unsigned integer */
-			r = (uintmax_t)n;
+			r = n < 0 ? -((uintmax_t)-n) : (uintmax_t)n;
 		else if((attr & NV_UINT16)==NV_UINT16)		/* short unsigned integer */
-			r = (uint16_t)n;
+			r = n < 0 ? -((uint16_t)-n) : (uint16_t)n;
 		else if((attr & NV_UINT32)==NV_UINT32)		/* normal unsigned integer */
-			r = (uint32_t)n;
+			r = n < 0 ? -((uint32_t)-n) : (uint32_t)n;
 		else if((attr & NV_INT64)==NV_INT64)		/* long signed integer */
 			r = (intmax_t)n;
 		else if((attr & NV_INT16)==NV_INT16)		/* short signed integer */

McDutchie · 2024-07-24T15:25:41Z

Thinking through that second patch. I think it's actually fine, if not obvious. If n is negative, we invert the sign using the unary minus, do the typecast, and then do another unary minus operation -- at which point we're performing a unary minus on a positive unsigned integer value, which is well-defined behaviour in C and yields a positive wrapped-around unsigned integer value, which is fine to implicitly typecast back to r's float type.

McDutchie · 2024-07-24T17:33:44Z

There are more terrible arithmetic problems that are a lot harder to solve, but that's for a separate issue, see #771.

@leo-ard

*** BUG 1 *** Léonard Oest O'Leary (@leo-ard) reports: > Doing an arithmetic expansion with a division where the dividend > is a negative number returns a "division by zero error" : > > > ksh > $ echo $(( 10 / -10 )) > ksh: 10 / -10 : divide by zero > $ echo $(( 10 / -1 )) > ksh: 10 / -1 : divide by zero > $ echo $(( 10 / -2 )) > ksh: 10 / -2 : divide by zero > > Tested with version sh (AT&T Research) 93u+ 2012-08-01 of ksh. > Running on Macos M2 14.1.1 (23B81) Turns out this bug is reproducible on all ARM systems. It was inherited from 93u+ 2012-08-01. Testing my stash of compiled historic AT&T ksh versions (on a Debian arm64 VM) shows that the bug was introduced in 93q 2005-01-31 and fixed in 93v- 2013-06-28. Analysis: All shell arithmetic is internally done with the maximum-length float type that the system supports (Sfdouble_t). Since comparison with floats is inexact, ksh converts the numnber to Sfulong_t, the largest unsigned integer type supported by the system, before comparing it to zero and throwing a "divide by zero". In streval.c: 354: else if((Sfulong_t)(num)==0) 355: arith_error(e_divzero,ep->expr,ep->emode); However, converting a negative float value to an unsigned integer type is undefined behaviour in the C standard. On (at least) ARM systems and IBM POWER systems [*1], the result is always zero in that case, which causes this bug. src/cmd/ksh93/sh/streval.c: arith_exec(): - Backport the 93v- 2013-06-28 fix: make sure that the number is not negative before typecasting it and compating it to zero. *** BUG 2 *** I introduced a similar bug with the same undefined behaviour myself in 4592859 (see there for explanation). On ARM systems: $ typeset -ui i $ echo $((i = -5)) 0 $ echo $i 4294967291 The assignment worked fine (wrapped around as expected for an unsigned integer), but the value of the assignment was 0 and should have been the same as the variable's new value. So lines 272, 274 and 276 in arith.c have undefined behaviour for a negative n (which is of the Sfdouble_t float type): 271: else if((attr & NV_UINT64)==NV_UINT64) /* long unsigned integer */ 272: r = (uintmax_t)n; 273: else if((attr & NV_UINT16)==NV_UINT16) /* short unsigned integer */ 274: r = (uint16_t)n; 275: else if((attr & NV_UINT32)==NV_UINT32) /* normal unsigned integer */ 276: r = (uint32_t)n; src/cmd/ksh93/sh/arith.c: arith(): - Before doing those typecasts, check if n is negative. If so, invert the sign using the unary minus (yielding a positive float), do the typecast, and then do another unary minus operation on the resulting positive unsigned integer value -- which is well-defined behaviour in C [*2] and yields a positive wrapped-around unsigned integer value, which is fine to implicitly typecast back to r's float type. Resolves: #770 [*1] https://www.ibm.com/support/pages/assigning-negative-float-value-unsigned-integral-type-yields-undefined-value-stored-type [*2] https://stackoverflow.com/questions/8026694/c-unary-minus-operator-behavior-with-unsigned-operands

McDutchie added bug Something is not working blocker This had better be fixed before releasing labels Jul 23, 2024

McDutchie closed this as completed in 42915ee Jul 24, 2024

This was referenced Aug 31, 2024

Add negative-zero.c test to document ksh bug udem-dlteam/pnut#73

Merged

Assignment of -0 in arithmetic expansion doesn't remove negative sign #781

Open

oliwer mentioned this issue Oct 9, 2024

Infinite loop when evaluating ((-2**63/-1)) #789

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Division by negative numbers returns "division by zero" #770

Division by negative numbers returns "division by zero" #770

leo-ard commented Jul 23, 2024 •

edited

Loading

McDutchie commented Jul 23, 2024

McDutchie commented Jul 23, 2024 •

edited

Loading

McDutchie commented Jul 23, 2024 •

edited

Loading

leo-ard commented Jul 24, 2024 •

edited

Loading

McDutchie commented Jul 24, 2024

leo-ard commented Jul 24, 2024 •

edited

Loading

McDutchie commented Jul 24, 2024 •

edited

Loading

McDutchie commented Jul 24, 2024

McDutchie commented Jul 24, 2024 •

edited

Loading

McDutchie commented Jul 24, 2024

Division by negative numbers returns "division by zero" #770

Division by negative numbers returns "division by zero" #770

Comments

leo-ard commented Jul 23, 2024 • edited Loading

McDutchie commented Jul 23, 2024

McDutchie commented Jul 23, 2024 • edited Loading

McDutchie commented Jul 23, 2024 • edited Loading

leo-ard commented Jul 24, 2024 • edited Loading

McDutchie commented Jul 24, 2024

leo-ard commented Jul 24, 2024 • edited Loading

McDutchie commented Jul 24, 2024 • edited Loading

McDutchie commented Jul 24, 2024

McDutchie commented Jul 24, 2024 • edited Loading

McDutchie commented Jul 24, 2024

leo-ard commented Jul 23, 2024 •

edited

Loading

McDutchie commented Jul 23, 2024 •

edited

Loading

McDutchie commented Jul 23, 2024 •

edited

Loading

leo-ard commented Jul 24, 2024 •

edited

Loading

leo-ard commented Jul 24, 2024 •

edited

Loading

McDutchie commented Jul 24, 2024 •

edited

Loading

McDutchie commented Jul 24, 2024 •

edited

Loading