
[GLUTEN-3462][CH] Fix round mismatch #3494

Merged (1 commit into apache:main, Oct 24, 2023)
Conversation

taiyang-li
Contributor

What changes were proposed in this pull request?


(Fixes: #3462)

@github-actions

#3462

@github-actions

Run Gluten Clickhouse CI

@taiyang-li
Contributor Author

Rationale: in Spark, when the input type of the round function is float, Spark first casts it to double, rounds it at the given scale, and then casts the result back to float. For example, in the expression round(0.41875f, 4), casting 0.41875f to double gives 0.41874998807907104d; rounding that at scale 4 gives 0.4187d, and casting back to float gives 0.4187f.

The current roundHalfUp implementation, however, does not apply these casts to the input and output of round, which causes the diff reported in https://github.com/oap-project/gluten/issues/3462.
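
For illustration, here is a minimal Scala sketch of the cast-round-cast sequence described above; sparkRoundFloat is a hypothetical helper written for this comment, not code from Spark or Gluten:

```scala
import scala.math.BigDecimal.RoundingMode

// Spark's FloatType path for round(): widen the float to double, round via
// BigDecimal at the requested scale, then narrow the result back to float.
def sparkRoundFloat(f: Float, scale: Int): Float =
  if (f.isNaN || f.isInfinite) f
  else BigDecimal(f.toDouble).setScale(scale, RoundingMode.HALF_UP).toFloat

// 0.41875f widens to 0.41874998807907104d, so HALF_UP at scale 4 gives 0.4187f.
println(sparkRoundFloat(0.41875f, 4))           // 0.4187

// Rounding in single precision without the double widening yields 0.4188f for
// the same input; this line only illustrates the precision sensitivity and is
// not the actual ClickHouse roundHalfUp code path.
println(math.round(0.41875f * 10000f) / 10000f) // 0.4188
```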

@lhuang09287750
Contributor

I tested it, and that is indeed the case.
The Spark round implementation contains this branch:

    case FloatType =>
      val f = input1.asInstanceOf[Float]
      if (f.isNaN || f.isInfinite) {
        f
      } else {
        BigDecimal(f.toDouble).setScale(_scale, mode).toFloat
      }
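
For completeness, the key expression in that branch can be checked directly in a Scala REPL; the intermediate values below match the explanation above:

```scala
// Widening 0.41875f to double exposes the float representation error,
// which is why HALF_UP at scale 4 rounds down to 0.4187 rather than up to 0.4188.
val widened = 0.41875f.toDouble                   // 0.41874998807907104
val rounded = BigDecimal(widened)
  .setScale(4, BigDecimal.RoundingMode.HALF_UP)   // 0.4187
println(rounded.toFloat)                          // 0.4187
```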

@lhuang09287750
Contributor

lgtm

@baibaichen (Contributor) left a comment

LGTM

@baibaichen merged commit edc2c53 into apache:main on Oct 24, 2023
7 checks passed

Successfully merging this pull request may close these issues.

[CH] Round function get different result from spark