[CodeGen][AArch64] Added -mno-va-float to skip FP save in variadic functions #92827

ltertan · 2024-05-20T21:58:10Z

This patch adds a new option for AArch64, -mno-va-float, which can be used to disable the generation of code that saves the floating-point (FP) argument registers in variadic functions.


github-actions · 2024-05-20T21:58:26Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2024-05-20T21:58:56Z

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-clang-driver

Author: Laurentiu Tertan (ltertan)

Changes

This patch adds a new option for AArch64, -mno-va-float, which can be used to disable the generation of code that saves the floating-point (FP) argument registers in variadic functions.

Full diff: https://github.com/llvm/llvm-project/pull/92827.diff

6 Files Affected:

(modified) clang/include/clang/Driver/Options.td (+3)
(modified) clang/lib/Driver/ToolChains/Arch/AArch64.cpp (+3)
(modified) llvm/lib/Target/AArch64/AArch64Features.td (+2)
(modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
(added) llvm/test/CodeGen/AArch64/mno-va-float.ll (+21)
(added) llvm/test/CodeGen/AArch64/mva-float.ll (+21)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 7bb781667e926..c70a21d71795f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4877,6 +4877,9 @@ foreach i = {8-15,18} in
   def fcall_saved_x#i : Flag<["-"], "fcall-saved-x"#i>, Group<m_aarch64_Features_Group>,
     HelpText<"Make the x"#i#" register call-saved (AArch64 only)">;
 
+def mno_va_float : Flag<["-"], "mno-va-float">, Group<m_aarch64_Features_Group>,
+  HelpText<"Do not generate code to save FP in variadic functions">;
+
 def msve_vector_bits_EQ : Joined<["-"], "msve-vector-bits=">, Group<m_aarch64_Features_Group>,
   Visibility<[ClangOption, FlangOption]>,
   HelpText<"Specify the size in bits of an SVE vector register. Defaults to the"
diff --git a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
index 2cd2b35ee51bc..2f25eb898b303 100644
--- a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -432,6 +432,9 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
   if (Args.hasArg(options::OPT_mno_neg_immediates))
     Features.push_back("+no-neg-immediates");
 
+  if (Args.hasArg(options::OPT_mno_va_float))
+    Features.push_back("+no-va-float");
+
   if (Arg *A = Args.getLastArg(options::OPT_mfix_cortex_a53_835769,
                                options::OPT_mno_fix_cortex_a53_835769)) {
     if (A->getOption().matches(options::OPT_mfix_cortex_a53_835769))
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index ba0b760ce3d73..234d37a4e4dcf 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -788,6 +788,8 @@ def FeaturePAuthLR : Extension<"pauth-lr", "PAuthLR",
 def FeatureTLBIW : Extension<"tlbiw", "TLBIW",
   "Enable ARMv9.5-A TLBI VMALL for Dirty State (FEAT_TLBIW)">;
 
+def FeatureVariadicSaveFP : SubtargetFeature<"no-va-float", "HasNoVaFloat",
+    "true", "Do not generate code to save FP in variadic functions">;
 
 //===----------------------------------------------------------------------===//
 // Architectures.
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index e31a27e9428e8..64969b4150e77 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7470,7 +7470,7 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState &CCInfo,
   FuncInfo->setVarArgsGPRIndex(GPRIdx);
   FuncInfo->setVarArgsGPRSize(GPRSaveSize);
 
-  if (Subtarget->hasFPARMv8() && !IsWin64) {
+  if (Subtarget->hasFPARMv8() && !IsWin64 && !Subtarget->hasNoVaFloat()) {
     auto FPRArgRegs = AArch64::getFPRArgRegs();
     const unsigned NumFPRArgRegs = FPRArgRegs.size();
     unsigned FirstVariadicFPR = CCInfo.getFirstUnallocated(FPRArgRegs);
diff --git a/llvm/test/CodeGen/AArch64/mno-va-float.ll b/llvm/test/CodeGen/AArch64/mno-va-float.ll
new file mode 100644
index 0000000000000..4205b2f03be04
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mno-va-float.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -march=aarch64 -mattr=+no-va-float | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "arm64"
+
+%struct.__va_list = type { i8*, i8*, i8*, i32, i32 }
+
+declare i32 @vfunc(i8*, i8*)
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)
+
+define i32 @func(i8*, double, ...) {
+entry:
+  %argp = alloca %struct.__va_list, align 8
+  %argp1 = bitcast %struct.__va_list* %argp to i8*
+  call void @llvm.va_start(i8* %argp1)
+; CHECK-NOT: {{stp.*q[0-9]+}}
+  %ret = call i32 @vfunc(i8* %0, i8* %argp1)
+  call void @llvm.va_end(i8* %argp1)
+  ret i32 %ret
+}
diff --git a/llvm/test/CodeGen/AArch64/mva-float.ll b/llvm/test/CodeGen/AArch64/mva-float.ll
new file mode 100644
index 0000000000000..6013aeb73cb20
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mva-float.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -march=aarch64 | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "arm64"
+
+%struct.__va_list = type { i8*, i8*, i8*, i32, i32 }
+
+declare i32 @vfunc(i8*, i8*)
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)
+
+define i32 @func(i8*, double, ...) {
+entry:
+  %argp = alloca %struct.__va_list, align 8
+  %argp1 = bitcast %struct.__va_list* %argp to i8*
+  call void @llvm.va_start(i8* %argp1)
+; CHECK: {{stp.*q[0-9]+}}
+  %ret = call i32 @vfunc(i8* %0, i8* %argp1)
+  call void @llvm.va_end(i8* %argp1)
+  ret i32 %ret
+}

llvmbot · 2024-05-20T21:58:56Z

@llvm/pr-subscribers-clang


ltertan · 2024-05-20T22:03:41Z

@TNorthover

The PR mentions no Reviewers, so I'm adding you. Please help me redirect this if you're not the right person to review this. Thanks!

JonChesterfield · 2024-05-20T22:35:50Z

AArch64 has a dedicated floating-point region in the va_list structure. Is the intent of this patch to globally disable the use of that region, so that clang should arrange to put floating-point values in the stack fallback area instead?
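For readers unfamiliar with the layout being discussed, here is a rough sketch of the five-field va_list that AAPCS64 describes, which matches the `{ i8*, i8*, i8*, i32, i32 }` type in the tests. The helper `nextDouble` is hypothetical and only simulates the lookup for a `double` on a little-endian target; it is not how clang or LLVM actually lowers `va_arg`.

```cpp
#include <cstdint>
#include <cstring>

// Field layout of va_list as described by AAPCS64 (names simplified).
struct AAPCS64VaList {
  void *stack;      // next argument passed on the stack
  void *gr_top;     // end of the general-purpose register save area
  void *vr_top;     // end of the FP/SIMD register save area
  int32_t gr_offs;  // negative offset from gr_top to the next saved GPR
  int32_t vr_offs;  // negative offset from vr_top to the next saved FP/SIMD register
};

// Hypothetical helper: fetch the next double the way the va_arg lookup
// is specified, simplified to the single-register case.
double nextDouble(AAPCS64VaList &ap) {
  double value;
  if (ap.vr_offs < 0) {
    // The FP save area still has entries; each saved register occupies 16 bytes,
    // and on little-endian the double sits in the low 8 bytes of its slot.
    std::memcpy(&value, static_cast<char *>(ap.vr_top) + ap.vr_offs, sizeof value);
    ap.vr_offs += 16;
    return value;
  }
  // Save area exhausted: fall back to the stack area, using 8-byte slots.
  std::memcpy(&value, ap.stack, sizeof value);
  ap.stack = static_cast<char *>(ap.stack) + 8;
  return value;
}
```

The point of the sketch is that the FP region is only meaningful if the callee actually spilled the FP argument registers into it, which is the code the patch skips.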

efriedma-quic · 2024-05-20T23:35:34Z

This patch, as proposed, doesn't seem like a good idea: it's very likely to miscompile user code without any diagnostic. Are you sure you don't want one of the following?

A soft-float ABI (-mabi=aapcs-soft).
Completely forbidding the use of floating-point values (-mgeneral-regs-only).
An optimization that opportunistically skips saving float registers if we can prove it isn't necessary. (Not currently implemented, but not something you'd add a compiler option for; you'd just make it work automatically. I can write up an outline of how to implement this if you're interested.)
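To make the miscompile concern concrete, here is a minimal sketch of the kind of ordinary user code that is at risk. The function name `sumDoubles` is made up for illustration. Under AAPCS64 the caller passes the leading floating-point variadic arguments in FP/SIMD registers, so if the callee is built without the register save, `va_arg` has no saved copy to read from and would likely return wrong values with no warning.

```cpp
#include <cstdarg>

// Sums `count` doubles passed through the ellipsis.
double sumDoubles(int count, ...) {
  va_list ap;
  va_start(ap, count);
  double total = 0.0;
  for (int i = 0; i < count; ++i)
    total += va_arg(ap, double);  // relies on the FP registers having been saved
  va_end(ap);
  return total;
}
```

A call such as `sumDoubles(3, 1.0, 2.0, 3.5)` is valid standard C++ and gives 6.5 when the ABI is followed; nothing in the source tells the compiler that the function must not be built with the proposed option.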

[CodeGen][AArch64] Added -mno-va-float to skip FP save in variadic functions (commit 3d6a1b5)

llvmbot added the clang, backend:AArch64, and clang:driver labels on May 20, 2024.
