[CORE] Fix incorrect precision of decimal literal #6954

Merged: 2 commits into apache:main, Aug 28, 2024

Conversation

jiangjiangtian (Contributor)

For the following SQL:

select (col0 / (col1 + 0.00000001)) from table;

In this case, col0 and col1 are 0. The result may be NULL. The reason is that Decimal(0.00000001).toString() returns "1E-8", which makes the new precision 4. Therefore, we use toPlainString to prevent scientific notation.
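For illustration, here is a minimal sketch of the underlying behavior using java.math.BigDecimal, to which Spark's Decimal ultimately delegates its string form (the demo object name is made up for this example):

import java.math.BigDecimal

object PlainStringDemo extends App {
  val d = new BigDecimal("0.00000001")
  // toString switches to scientific notation when the adjusted exponent is small
  println(d.toString)      // "1E-8" -- a 4-character string
  // toPlainString never uses scientific notation
  println(d.toPlainString) // "0.00000001"
}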

@github-actions github-actions bot added the CORE works for Gluten Core label Aug 21, 2024
jiangjiangtian (Contributor, Author)

@kecookier

jiangjiangtian (Contributor, Author)

@rui-mo Can you review this PR?
I have a question: why do we have this rescale in Gluten? I can't find the same logic in Spark. I have read the comment, but I still don't fully understand it.
Thanks!

@kecookier kecookier requested a review from rui-mo August 26, 2024 03:06
rui-mo (Contributor) commented Aug 27, 2024

Hi @jiangjiangtian, this adjustment, as I recall, is for the situation where an arithmetic operation is performed between a decimal and a number. In that case, the number is converted to decimal, and the precision and scale Spark assigns to it are (38, 18), which do not match the actual values. E.g., in the case you mentioned, Decimal(0.00000001) should have a precision and scale of (8, 8) instead of (38, 18). To produce accurate results, we need additional logic to extract the accurate precision and scale that native computation requires.

Perhaps you could help confirm whether this is the case for the example you provided. Thanks.
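As a hedged sketch of the kind of extraction described here (the helper name is hypothetical, not Gluten's actual DecimalArithmeticUtil code), the real precision and scale can be derived from the plain string form:

import java.math.BigDecimal

// Illustrative only: derive the real (precision, scale) of a decimal value
// from its plain (non-scientific) string representation.
def realPrecisionAndScale(d: BigDecimal): (Int, Int) = {
  val s = d.toPlainString.stripPrefix("-")
  val dot = s.indexOf('.')
  if (dot < 0) (s.length, 0)
  else {
    val scale = s.length - dot - 1
    // drop a leading "0." so that 0.00000001 counts as 8 digits, not 9
    val integral = if (s.startsWith("0.")) 0 else dot
    (integral + scale, scale)
  }
}

// realPrecisionAndScale(new BigDecimal("0.00000001")) == (8, 8), not (38, 18)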

rui-mo (Contributor) left a comment

Thanks for the fix. Would you like to add the buggy case as a unit test?

jiangjiangtian (Contributor, Author)

> Hi @jiangjiangtian, this adjustment, as I recall, is for the situation where an arithmetic operation is performed between a decimal and a number. […]

@rui-mo Thanks! It seems that Spark doesn't have this logic, and I don't know why Spark doesn't need it.

In my case, the type of the literal 0.00000001 is Decimal(19, 8). After the adjustment, the type is still Decimal(19, 8), because the string representation of the decimal contains `.` and is therefore not a valid long number. Perhaps we should not return the original precision and scale at the end of the function. https://github.com/apache/incubator-gluten/blob/main/gluten-core/src/main/scala/org/apache/gluten/utils/DecimalArithmeticUtil.scala#L110
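A hedged sketch of the check being discussed (names are hypothetical; the real code lives in DecimalArithmeticUtil.scala): the rescale only applies when the plain string is a valid long, so any value containing `.` keeps its original precision and scale.

import java.math.BigDecimal
import scala.util.Try

// Illustrative only: a mirror of the described guard, not the actual function.
def adjustedPrecisionAndScale(d: BigDecimal, precision: Int, scale: Int): (Int, Int) = {
  val str = d.toPlainString
  if (!str.contains('.') && Try(str.toLong).isSuccess) {
    (str.stripPrefix("-").length, 0) // rescale as an integral literal
  } else {
    (precision, scale) // e.g., 0.00000001 stays Decimal(19, 8)
  }
}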

@github-actions github-actions bot added the VELOX label Aug 27, 2024
jiangjiangtian (Contributor, Author)

> Thanks for the fix. Would you like to add the buggy case as a unit test?

I added a unit test. Is there anything else I need to add or edit? Thanks.

rui-mo (Contributor) commented Aug 27, 2024

> Because the string representation of the decimal contains `.` and is therefore not a valid long number.

Thanks for reminding me of this. I just remembered that this adjustment is typically for an arithmetic operation between a decimal and an integer/bigint. In your case the literal is a double, so returning the original precision and scale should be fine.

withTable("test") {
  sql("create table test (col0 decimal(10, 0), col1 decimal(10, 0)) using parquet")
  sql("insert into test values (0, 0)")
  runQueryAndCompare("select col0 / (col1 + 1E-8) from test") { _ => }
}
Contributor

There is a test failure as below:

Fix wrong rescale *** FAILED ***
  org.apache.spark.sql.AnalysisException: unknown requires that the data to be inserted have the same number of columns as the target table: target table has 3 column(s) but the inserted data has 2 column(s), including 0 partition column(s) having constant value(s).

jiangjiangtian (Contributor, Author)

Fixed.

@rui-mo rui-mo changed the title [CORE] Fix incorrect precision of Decimal literal [CORE] Fix incorrect precision of decimal literal Aug 28, 2024
@rui-mo rui-mo merged commit 7e800f6 into apache:main Aug 28, 2024
44 checks passed
@jiangjiangtian jiangjiangtian deleted the fix_precision branch August 28, 2024 05:16