
Compile time evaluation doesn't match runtime evaluation with -ffp-contract=on #98197

Closed
bfraboni opened this issue Jul 9, 2024 · 15 comments
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" constexpr Anything related to constant evaluation floating-point Floating-point math question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!

Comments

@bfraboni
Copy link

bfraboni commented Jul 9, 2024

Hey LLVM team,

I found out a while ago that with -ffp-contract=on, the compile-time evaluation does not match the runtime evaluation for some expressions, but I identified this too late to get any traction. Here are some godbolt examples for repro:

I think the expected behavior is that these two evaluations always match, and it seems that the compile-time evaluation always uses fma, even when it is not forced with -mfma. I still don't get the exact difference between -ffp-contract and -mfma, but it might be something to look into.

Thank you 🙏

@AaronBallman AaronBallman added clang:frontend Language frontend issues, e.g. anything involving "Sema" floating-point Floating-point math constexpr Anything related to constant evaluation and removed new issue labels Jul 11, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 11, 2024

@llvm/issue-subscribers-clang-frontend

Author: Basile Fraboni (bfraboni)


@arsenm
Copy link
Contributor

arsenm commented Jul 11, 2024

it seems that the compile-time evaluation always uses fma, even when it is not forced with -mfma.

Correct.

The runtime lowering is target-dependent, so -mfma happens to make x86 use a native fma instruction.

@AaronBallman
Copy link
Collaborator

Note, in C++, it is a recommended practice but not a normative requirement for floating-point operations to have the same behavior at compile time and runtime: https://eel.is/c++draft/expr.const#15

However, in C, it's a semantic requirement (see C23 6.6p17: "The semantic rules for the evaluation of a constant expression are the same as for nonconstant expressions.") C23 has constexpr objects and globals in C have always had to be initialized with an arithmetic constant expression, so this matters for both languages.

So I think this is somewhere between "bug" and "feature request", but I think Clang should aim to implement the recommended practice whenever possible (though I think C's requirement is overreaching). However, there are a lot of floating-point flags that change the behavior and I'm not certain we're equipped to handle the combinatorial explosion that comes from trying to support them all in constant expressions.

CC @jcranmer-intel @hubert-reinterpretcast @tbaederr @zahiraam

@jcranmer-intel
Copy link
Contributor

However, in C, it's a semantic requirement (see C23 6.6p17: "The semantic rules for the evaluation of a constant expression are the same as for nonconstant expressions.") C23 has constexpr objects and globals in C have always had to be initialized with an arithmetic constant expression, so this matters for both languages.

6.5p8: "Otherwise, whether or how expressions are contracted is implementation-defined." Arguably, we can define, as implementation behavior, that we don't contract in constant expressions. This is a bit of a malicious reading of the standard, but it is perhaps enough cover for our failure to execute consistently at compile time versus runtime.

So I think this is somewhere between "bug" and "feature request", but I think Clang should aim to implement the recommended practice whenever possible (though I think C's requirement is overreaching). However, there are a lot of floating-point flags that change the behavior and I'm not certain we're equipped to handle the combinatorial explosion that comes from trying to support them all in constant expressions.

Beyond the combinatorial issue with all the different floating-point modes, there's also the fun fact that contract can do things other than FMA, which many people tend to miss. I can virtually guarantee that people adding new combines in the backend that take advantage of contraction will neglect to implement the same combines in the frontend, and, to be frank, whether contraction happens in practice relies to some degree on serendipity: whether or not optimizations nudge the instructions into the right position. (Also, C limits contraction to within a single expression, which is not how it's implemented in the backend!)

@bfraboni
Copy link
Author

Note, in C++, it is a recommended practice but not a normative requirement for floating-point operations to have the same behavior at compile time and runtime

@AaronBallman does that mean that in C++ I can't trust any constexpr float operation to produce the same result as its runtime counterpart? That sounds very wrong...

The runtime lowering is target dependent so -mfma happens to make x86 use a native fma instruction

@arsenm so the flag -ffp-contract=on does not actually enable the fma instruction, but only fma expression contraction? If we want the specific instruction, it needs to be enabled with -mfma? The manual is not super clear about this: it states that the flag enables the use of FMA, but does not say whether that means the specific instruction or just expression-level fma: https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffp-contract

@jcranmer-intel
Copy link
Contributor

@arsenm so the flag -ffp-contract=on does not actually enable the fma instruction, but only fma expression contraction? If we want the specific instruction, it needs to be enabled with -mfma? The manual is not super clear about this: it states that the flag enables the use of FMA, but does not say whether that means the specific instruction or just expression-level fma: https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffp-contract

The FMA instructions on x86 aren't available in all hardware. You need the -mfma flag (or an appropriate -mcpu flag, etc.) to say that you're okay limiting compilation to only those CPUs that support the FMA instructions.

@bfraboni
Copy link
Author

bfraboni commented Jul 11, 2024

The documentation page states the following about -ffp-contract=on:

Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA).

I guess what I'm trying to understand is the difference between -ffp-contract=on FMA and -mfma FMA.
The latter, as you said @jcranmer-intel, is simply using the hardware-specific instruction for FMA; I get that.
But what is the former actually doing? What is FMA without hardware support? Is it just grouping expressions into fma form without executing a true fma instruction, or is it using a software version of fma with better rounding?

@jcranmer-intel
Copy link
Contributor

-ffp-contract=on specifies that the compiler is permitted (but not required) to turn a floating-point expression (a * b) + c into fma(a, b, c).

Within the compiler, if it sees a (a * b) + c expression, and -ffp-contract=on is enabled, it turns around and asks the target "is it faster for you to execute fma(a, b, c) or (a * b) + c?" Where there is no hardware FMA instruction, the answer is usually the latter; but where there is one, it is almost always the former. Based on the answer to the question, it chooses whether or not to transform (a * b) + c into fma(a, b, c).

Note that it is possible to make an fma operation even without hardware FMA support (__builtin_fma(a, b, c) is the way you do this in clang), and this will get lowered to a library function named fma.

In short, there are three decisions going on here:

  • Am I allowed to transform (a * b) + c -> fma(a, b, c)? (this is controlled by -ffp-contract=on)
  • Do I lower fma(a, b, c) to a function call or a hardware instruction? (this is what -mfma decides)
  • Is it faster to transform (a * b) + c -> fma(a, b, c)? (almost always implied by the answer to the previous question)

@bfraboni
Copy link
Author

Thank you @jcranmer-intel for the detailed answer, that makes more sense now!

@bfraboni
Copy link
Author

Back to the issue: I'm still concerned about this backend vs. frontend discrepancy. I recently applied a "make everything constexpr that I can" policy for performance reasons, but knowing that the operations are not guaranteed to match makes me doubt its robustness now.

I agree with @AaronBallman that clang should comply with the recommended practice here. It is not advertised or warned anywhere that constexpr can return results inconsistent with runtime floating point, and I think operations should respect the flags they are compiled with; otherwise constexpr fp becomes a lot less useful.

There may be another good explanation, but even when I only use constexpr I can get inconsistent compile-time results when evaluating the exact same line of code inside or outside a function: https://godbolt.org/z/vTd5PWexz
So there is definitely something odd with -ffp-contract=on and constexpr evaluation. Could you please double-check that there isn't simply a bug where fma is used where it shouldn't be?

@bfraboni
Copy link
Author

bfraboni commented Jul 11, 2024

it seems that the compile-time evaluation always uses fma, even when it is not forced with -mfma.

Correct.

Not consistently, @arsenm; see the godbolt repro above, the compile-time eval doesn't match all the time ☝️

@arsenm
Copy link
Contributor

arsenm commented Jul 12, 2024

Not consistently, @arsenm; see the godbolt repro above, the compile-time eval doesn't match all the time ☝️

This is the behavior for cases where llvm.fmuladd is emitted, which is always treated as FMA. I don't know what clang is doing in the constexpr evaluation case.

@jcranmer-intel
Copy link
Contributor

I agree with @AaronBallman saying that clang should be complying with recommendations here. It is not advertised / warned anywhere that constexpr can return inconsistent results with floating points and I think operations should try to respect the flags they are compiled with, otherwise it makes constexpr fp a lot less useful.

Fast-math flags (which include -ffp-contract=on) are an explicit signal to the compiler from the user that the user values speed over consistency in the numerical results: it gives license to the optimizer to rearrange floating-point expressions in a way that doesn't preserve exact fp results. It is effectively impossible to make the front-end always give the same answer that the optimizer gives. If you want consistent results between the constexpr evaluation and the optimizer, then stick with precise floating-point compilation modes and don't use any fast-math flags.

@bfraboni
Copy link
Author

I get it now that it is not simple for the frontend and the backend to give the same answer all the time.

However, I still have no clue why the frontend does not always give the same answer for the same constexpr line of code, see : https://godbolt.org/z/vTd5PWexz

@bfraboni
Copy link
Author

OK, I found it: the function call inside the printf is not required to produce a constexpr result, so it is probably evaluated through a different code path. If I assign the result to a constexpr variable first, I get the same result. That's tricky.

Thanks for all your insights @arsenm @jcranmer-intel @AaronBallman, I understand much better how things work now.

Closing this one!

@EugeneZelenko EugeneZelenko added the question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead! label Jul 12, 2024