Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGen][AArch64] Added -mno-va-float to skip FP save in variadic functions #92827

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ltertan
Copy link

@ltertan ltertan commented May 20, 2024

This patch adds a new option for AArch64, -mno-va-float which can be used to disable the generation of code that saves FP in variadic functions

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels May 20, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 20, 2024

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-clang-driver

Author: Laurentiu Tertan (ltertan)

Changes

This patch adds a new option for AArch64, -mno-va-float which can be used to disable the generation of code that saves FP in variadic functions


Full diff: https://github.com/llvm/llvm-project/pull/92827.diff

6 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+3)
  • (modified) clang/lib/Driver/ToolChains/Arch/AArch64.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64Features.td (+2)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
  • (added) llvm/test/CodeGen/AArch64/mno-va-float.ll (+21)
  • (added) llvm/test/CodeGen/AArch64/mva-float.ll (+21)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 7bb781667e926..c70a21d71795f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4877,6 +4877,9 @@ foreach i = {8-15,18} in
   def fcall_saved_x#i : Flag<["-"], "fcall-saved-x"#i>, Group<m_aarch64_Features_Group>,
     HelpText<"Make the x"#i#" register call-saved (AArch64 only)">;
 
+def mno_va_float : Flag<["-"], "mno-va-float">, Group<m_aarch64_Features_Group>,
+  HelpText<"Do not generate code to save FP in variadic functions">;
+
 def msve_vector_bits_EQ : Joined<["-"], "msve-vector-bits=">, Group<m_aarch64_Features_Group>,
   Visibility<[ClangOption, FlangOption]>,
   HelpText<"Specify the size in bits of an SVE vector register. Defaults to the"
diff --git a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
index 2cd2b35ee51bc..2f25eb898b303 100644
--- a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -432,6 +432,9 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
   if (Args.hasArg(options::OPT_mno_neg_immediates))
     Features.push_back("+no-neg-immediates");
 
+  if (Args.hasArg(options::OPT_mno_va_float))
+    Features.push_back("+no-va-float");
+
   if (Arg *A = Args.getLastArg(options::OPT_mfix_cortex_a53_835769,
                                options::OPT_mno_fix_cortex_a53_835769)) {
     if (A->getOption().matches(options::OPT_mfix_cortex_a53_835769))
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index ba0b760ce3d73..234d37a4e4dcf 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -788,6 +788,8 @@ def FeaturePAuthLR : Extension<"pauth-lr", "PAuthLR",
 def FeatureTLBIW : Extension<"tlbiw", "TLBIW",
   "Enable ARMv9.5-A TLBI VMALL for Dirty State (FEAT_TLBIW)">;
 
+def FeatureVariadicSaveFP : SubtargetFeature<"no-va-float", "HasNoVaFloat",
+    "true", "Do not generate code to save FP in variadic functions">;
 
 //===----------------------------------------------------------------------===//
 // Architectures.
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index e31a27e9428e8..64969b4150e77 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7470,7 +7470,7 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState &CCInfo,
   FuncInfo->setVarArgsGPRIndex(GPRIdx);
   FuncInfo->setVarArgsGPRSize(GPRSaveSize);
 
-  if (Subtarget->hasFPARMv8() && !IsWin64) {
+  if (Subtarget->hasFPARMv8() && !IsWin64 && !Subtarget->hasNoVaFloat()) {
     auto FPRArgRegs = AArch64::getFPRArgRegs();
     const unsigned NumFPRArgRegs = FPRArgRegs.size();
     unsigned FirstVariadicFPR = CCInfo.getFirstUnallocated(FPRArgRegs);
diff --git a/llvm/test/CodeGen/AArch64/mno-va-float.ll b/llvm/test/CodeGen/AArch64/mno-va-float.ll
new file mode 100644
index 0000000000000..4205b2f03be04
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mno-va-float.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -march=aarch64 -mattr=+no-va-float | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "arm64"
+
+%struct.__va_list = type { i8*, i8*, i8*, i32, i32 }
+
+declare i32 @vfunc(i8*, i8*)
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)
+
+define i32 @func(i8*, double, ...) {
+entry:
+  %argp = alloca %struct.__va_list, align 8
+  %argp1 = bitcast %struct.__va_list* %argp to i8*
+  call void @llvm.va_start(i8* %argp1)
+; CHECK-NOT: {{stp.*q[0-9]+}}
+  %ret = call i32 @vfunc(i8* %0, i8* %argp1)
+  call void @llvm.va_end(i8* %argp1)
+  ret i32 %ret
+}
diff --git a/llvm/test/CodeGen/AArch64/mva-float.ll b/llvm/test/CodeGen/AArch64/mva-float.ll
new file mode 100644
index 0000000000000..6013aeb73cb20
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mva-float.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -march=aarch64 | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "arm64"
+
+%struct.__va_list = type { i8*, i8*, i8*, i32, i32 }
+
+declare i32 @vfunc(i8*, i8*)
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)
+
+define i32 @func(i8*, double, ...) {
+entry:
+  %argp = alloca %struct.__va_list, align 8
+  %argp1 = bitcast %struct.__va_list* %argp to i8*
+  call void @llvm.va_start(i8* %argp1)
+; CHECK: {{stp.*q[0-9]+}}
+  %ret = call i32 @vfunc(i8* %0, i8* %argp1)
+  call void @llvm.va_end(i8* %argp1)
+  ret i32 %ret
+}

@llvmbot
Copy link
Collaborator

llvmbot commented May 20, 2024

@llvm/pr-subscribers-clang

Author: Laurentiu Tertan (ltertan)

Changes

This patch adds a new option for AArch64, -mno-va-float which can be used to disable the generation of code that saves FP in variadic functions


Full diff: https://github.com/llvm/llvm-project/pull/92827.diff

6 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+3)
  • (modified) clang/lib/Driver/ToolChains/Arch/AArch64.cpp (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64Features.td (+2)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
  • (added) llvm/test/CodeGen/AArch64/mno-va-float.ll (+21)
  • (added) llvm/test/CodeGen/AArch64/mva-float.ll (+21)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 7bb781667e926..c70a21d71795f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4877,6 +4877,9 @@ foreach i = {8-15,18} in
   def fcall_saved_x#i : Flag<["-"], "fcall-saved-x"#i>, Group<m_aarch64_Features_Group>,
     HelpText<"Make the x"#i#" register call-saved (AArch64 only)">;
 
+def mno_va_float : Flag<["-"], "mno-va-float">, Group<m_aarch64_Features_Group>,
+  HelpText<"Do not generate code to save FP in variadic functions">;
+
 def msve_vector_bits_EQ : Joined<["-"], "msve-vector-bits=">, Group<m_aarch64_Features_Group>,
   Visibility<[ClangOption, FlangOption]>,
   HelpText<"Specify the size in bits of an SVE vector register. Defaults to the"
diff --git a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
index 2cd2b35ee51bc..2f25eb898b303 100644
--- a/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/AArch64.cpp
@@ -432,6 +432,9 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
   if (Args.hasArg(options::OPT_mno_neg_immediates))
     Features.push_back("+no-neg-immediates");
 
+  if (Args.hasArg(options::OPT_mno_va_float))
+    Features.push_back("+no-va-float");
+
   if (Arg *A = Args.getLastArg(options::OPT_mfix_cortex_a53_835769,
                                options::OPT_mno_fix_cortex_a53_835769)) {
     if (A->getOption().matches(options::OPT_mfix_cortex_a53_835769))
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index ba0b760ce3d73..234d37a4e4dcf 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -788,6 +788,8 @@ def FeaturePAuthLR : Extension<"pauth-lr", "PAuthLR",
 def FeatureTLBIW : Extension<"tlbiw", "TLBIW",
   "Enable ARMv9.5-A TLBI VMALL for Dirty State (FEAT_TLBIW)">;
 
+def FeatureVariadicSaveFP : SubtargetFeature<"no-va-float", "HasNoVaFloat",
+    "true", "Do not generate code to save FP in variadic functions">;
 
 //===----------------------------------------------------------------------===//
 // Architectures.
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index e31a27e9428e8..64969b4150e77 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7470,7 +7470,7 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState &CCInfo,
   FuncInfo->setVarArgsGPRIndex(GPRIdx);
   FuncInfo->setVarArgsGPRSize(GPRSaveSize);
 
-  if (Subtarget->hasFPARMv8() && !IsWin64) {
+  if (Subtarget->hasFPARMv8() && !IsWin64 && !Subtarget->hasNoVaFloat()) {
     auto FPRArgRegs = AArch64::getFPRArgRegs();
     const unsigned NumFPRArgRegs = FPRArgRegs.size();
     unsigned FirstVariadicFPR = CCInfo.getFirstUnallocated(FPRArgRegs);
diff --git a/llvm/test/CodeGen/AArch64/mno-va-float.ll b/llvm/test/CodeGen/AArch64/mno-va-float.ll
new file mode 100644
index 0000000000000..4205b2f03be04
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mno-va-float.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -march=aarch64 -mattr=+no-va-float | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "arm64"
+
+%struct.__va_list = type { i8*, i8*, i8*, i32, i32 }
+
+declare i32 @vfunc(i8*, i8*)
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)
+
+define i32 @func(i8*, double, ...) {
+entry:
+  %argp = alloca %struct.__va_list, align 8
+  %argp1 = bitcast %struct.__va_list* %argp to i8*
+  call void @llvm.va_start(i8* %argp1)
+; CHECK-NOT: {{stp.*q[0-9]+}}
+  %ret = call i32 @vfunc(i8* %0, i8* %argp1)
+  call void @llvm.va_end(i8* %argp1)
+  ret i32 %ret
+}
diff --git a/llvm/test/CodeGen/AArch64/mva-float.ll b/llvm/test/CodeGen/AArch64/mva-float.ll
new file mode 100644
index 0000000000000..6013aeb73cb20
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mva-float.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -march=aarch64 | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "arm64"
+
+%struct.__va_list = type { i8*, i8*, i8*, i32, i32 }
+
+declare i32 @vfunc(i8*, i8*)
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)
+
+define i32 @func(i8*, double, ...) {
+entry:
+  %argp = alloca %struct.__va_list, align 8
+  %argp1 = bitcast %struct.__va_list* %argp to i8*
+  call void @llvm.va_start(i8* %argp1)
+; CHECK: {{stp.*q[0-9]+}}
+  %ret = call i32 @vfunc(i8* %0, i8* %argp1)
+  call void @llvm.va_end(i8* %argp1)
+  ret i32 %ret
+}

@ltertan
Copy link
Author

ltertan commented May 20, 2024

@TNorthover

The PR mentions no Reviewers, so I'm adding you. Please help me redirect this if you're not the right person to review this. Thanks!

@JonChesterfield
Copy link
Collaborator

Aarch64 has a dedicated floating point region in the va_list structure. Is the intent of this patch to globally disable the use of that, such that clang should arrange to put floating point values in the stack fallback area instead?

@efriedma-quic
Copy link
Collaborator

This patch, as proposed, doesn't seem like a good idea: it's very likely to miscompile user code without any diagnostic. Are you sure you don't want one of the following?

  • A soft-float ABI (-mabi=aapcs-soft)
  • Completely forbidding the use of floating-point values (-mgeneral-regs-only).
  • An optimization that opportunistically skips saving float registers if we can prove it isn't necessary. (Not currently implemented, but not something you'd add a compiler option for; you'd just make it work automatically. I can write up an outline of how to implement this if you're interested.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants