-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Division by negative numbers returns "division by zero" #770
Comments
Still reproducible in 93u+m, unfortunately. Thanks for the report. Kind of amazing it took this long for someone to notice! |
So, we forked from the last "stable" (ha!) AT&T release, 93u+ 2012-08-01, inheriting this unfortunate bug. Testing my stash of compiled historic AT&T ksh versions shows that the bug was introduced in 93q 2005-01-31 and fixed in 93v- 2013-06-28. So, with the help of the ast-open-archive repo, we should be able to isolate and backport the fix from that beta version. |
This is likely the relevant
And here is the 93v- 2013-06-28 change that fixes the bug, as a patch to 93u+m: diff --git a/src/cmd/ksh93/sh/streval.c b/src/cmd/ksh93/sh/streval.c
index 79f75ee14..ac09e3d01 100644
--- a/src/cmd/ksh93/sh/streval.c
+++ b/src/cmd/ksh93/sh/streval.c
@@ -351,7 +351,7 @@ Sfdouble_t arith_exec(Arith_t *ep)
num = sp[-1]/num;
type = 1;
}
- else if((Sfulong_t)(num)==0)
+ else if((Sfulong_t)(num < 0 ? -num : num)==0)
arith_error(e_divzero,ep->expr,ep->emode);
else if(type==2 || tp[-1]==2)
num = U2F((Sfulong_t)(sp[-1]) / (Sfulong_t)(num)); So, the bug was caused by the fact that a floating-point variable with a negative value ( Converting a negative integer to an unsigned integer has implementation-defined results in C. However, as I am learning right now, casting a negative float to an unsigned integer is undefined behaviour according to the C standard. In practice, it behaves differently on Intel systems than it does on ARM systems (including Apple's M1/M2 processors). On ARM systems, the result of the undefined typecast is simply always 0, resulting in this bug. The patch fixes it by making sure that the number is not negative before typecasting it. Relevant reading (this article is about C++ but the undefined behaviour is the same in C): |
Wow, what a sneaky bug. Thanks for sharing the debugging process ! It's crazy how little details in the C specification can cause bugs years later. If you are interested, I found this bug working on pnut, a C to Shell compiler (yes you read this right). We had some division tests that were failing, flagging this bug in the underlying shell implementation. Among the shells we run on, Ksh is often the fastest one (and sometimes by far) |
Workaround: make one of the numbers a floating point type by adding a
|
Oh nice trick. My workaround was to negate the divisor and the result when the divisor was negative : #Lets say we want to divide x by y
if [ $y -lt 0 ] ; then
result=$((-(x / -y)))
else
result=$((x / y))
fi |
I introduced a similar bug with the same undefined behaviour myself in 4592859 (see there for explanation). On ARM systems:
The assignment worked fine (wrapped around as expected for an unsigned integer), but the value of the assignment was 0 and should have been the same as the variable's new value. So lines 272, 274 and 276 have undefined behaviour for a negative Lines 271 to 276 in 9926e2b
|
Patch to test for that other problem: to avoid undefined behaviour, invert the sign of a negative number before the typecast, then invert it again after. Maybe this is naive. diff --git a/src/cmd/ksh93/sh/arith.c b/src/cmd/ksh93/sh/arith.c
index 01703e45f..6327b6865 100644
--- a/src/cmd/ksh93/sh/arith.c
+++ b/src/cmd/ksh93/sh/arith.c
@@ -269,11 +269,11 @@ static Sfdouble_t arith(const char **ptr, struct lval *lvalue, int type, Sfdoubl
else if((attr & NV_DOUBLE)==NV_DOUBLE) /* normal float */
r = (double)n;
else if((attr & NV_UINT64)==NV_UINT64) /* long unsigned integer */
- r = (uintmax_t)n;
+ r = n < 0 ? -((uintmax_t)-n) : (uintmax_t)n;
else if((attr & NV_UINT16)==NV_UINT16) /* short unsigned integer */
- r = (uint16_t)n;
+ r = n < 0 ? -((uint16_t)-n) : (uint16_t)n;
else if((attr & NV_UINT32)==NV_UINT32) /* normal unsigned integer */
- r = (uint32_t)n;
+ r = n < 0 ? -((uint32_t)-n) : (uint32_t)n;
else if((attr & NV_INT64)==NV_INT64) /* long signed integer */
r = (intmax_t)n;
else if((attr & NV_INT16)==NV_INT16) /* short signed integer */ |
Thinking through that second patch. I think it's actually fine, if not obvious. If n is negative, we invert the sign using the unary minus, do the typecast, and then do another unary minus operation -- at which point we're performing a unary minus on a positive unsigned integer value, which is well-defined behaviour in C and yields a positive wrapped-around unsigned integer value, which is fine to implicitly typecast back to r's float type. |
There are more terrible arithmetic problems that are a lot harder to solve, but that's for a separate issue, see #771. |
*** BUG 1 *** Léonard Oest O'Leary (@leo-ard) reports: > Doing an arithmetic expansion with a division where the dividend > is a negative number returns a "division by zero error" : > > > ksh > $ echo $(( 10 / -10 )) > ksh: 10 / -10 : divide by zero > $ echo $(( 10 / -1 )) > ksh: 10 / -1 : divide by zero > $ echo $(( 10 / -2 )) > ksh: 10 / -2 : divide by zero > > Tested with version sh (AT&T Research) 93u+ 2012-08-01 of ksh. > Running on Macos M2 14.1.1 (23B81) Turns out this bug is reproducible on all ARM systems. It was inherited from 93u+ 2012-08-01. Testing my stash of compiled historic AT&T ksh versions (on a Debian arm64 VM) shows that the bug was introduced in 93q 2005-01-31 and fixed in 93v- 2013-06-28. Analysis: All shell arithmetic is internally done with the maximum-length float type that the system supports (Sfdouble_t). Since comparison with floats is inexact, ksh converts the numnber to Sfulong_t, the largest unsigned integer type supported by the system, before comparing it to zero and throwing a "divide by zero". In streval.c: 354: else if((Sfulong_t)(num)==0) 355: arith_error(e_divzero,ep->expr,ep->emode); However, converting a negative float value to an unsigned integer type is undefined behaviour in the C standard. On (at least) ARM systems and IBM POWER systems [*1], the result is always zero in that case, which causes this bug. src/cmd/ksh93/sh/streval.c: arith_exec(): - Backport the 93v- 2013-06-28 fix: make sure that the number is not negative before typecasting it and compating it to zero. *** BUG 2 *** I introduced a similar bug with the same undefined behaviour myself in 4592859 (see there for explanation). On ARM systems: $ typeset -ui i $ echo $((i = -5)) 0 $ echo $i 4294967291 The assignment worked fine (wrapped around as expected for an unsigned integer), but the value of the assignment was 0 and should have been the same as the variable's new value. So lines 272, 274 and 276 in arith.c have undefined behaviour for a negative n (which is of the Sfdouble_t float type): 271: else if((attr & NV_UINT64)==NV_UINT64) /* long unsigned integer */ 272: r = (uintmax_t)n; 273: else if((attr & NV_UINT16)==NV_UINT16) /* short unsigned integer */ 274: r = (uint16_t)n; 275: else if((attr & NV_UINT32)==NV_UINT32) /* normal unsigned integer */ 276: r = (uint32_t)n; src/cmd/ksh93/sh/arith.c: arith(): - Before doing those typecasts, check if n is negative. If so, invert the sign using the unary minus (yielding a positive float), do the typecast, and then do another unary minus operation on the resulting positive unsigned integer value -- which is well-defined behaviour in C [*2] and yields a positive wrapped-around unsigned integer value, which is fine to implicitly typecast back to r's float type. Resolves: #770 [*1] https://www.ibm.com/support/pages/assigning-negative-float-value-unsigned-integral-type-yields-undefined-value-stored-type [*2] https://stackoverflow.com/questions/8026694/c-unary-minus-operator-behavior-with-unsigned-operands
Doing an arithmetic expansion with a division where the dividend is a negative number returns a "division by zero error" :
Tested with version
sh (AT&T Research) 93u+ 2012-08-01
of ksh. Running on Macos M2 14.1.1 (23B81)The text was updated successfully, but these errors were encountered: