[AArch64][GISel] Add fp128 and i128 sitofp/uitofp handling #97691

Merged on Jul 15, 2024 (5 commits)

Him188 (Member) commented Jul 4, 2024

Legalize sitofp/uitofp conversions involving fp128/i128 types into libcalls.
Vectors with i128/fp128 element types are scalarized.
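
For illustration, the runtime routines that these conversions lower to follow the compiler-rt/libgcc naming scheme `__float[un]{si,di,ti}{sf,df,tf}`, which is the pattern visible in the test expectations (`__floattitf`, `__floatunditf`, `__floattidf`, and so on). The sketch below only models that naming scheme. `convLibcallName` is a hypothetical helper written for this example, not an LLVM API, and it is not how LegalizerHelper selects the routine.

```cpp
#include <string>

// Hypothetical helper (not part of LLVM): builds the runtime routine name
// for an integer-to-floating-point conversion from the operand widths.
// Returns an empty string for widths this sketch does not cover.
std::string convLibcallName(bool isSigned, unsigned fromBits, unsigned toBits) {
  const char *from = nullptr; // integer mode suffix: si = 32, di = 64, ti = 128
  switch (fromBits) {
  case 32:  from = "si"; break;
  case 64:  from = "di"; break;
  case 128: from = "ti"; break;
  default:  break;
  }

  const char *to = nullptr; // float mode suffix: sf = 32, df = 64, tf = 128
  switch (toBits) {
  case 32:  to = "sf"; break;
  case 64:  to = "df"; break;
  case 128: to = "tf"; break;
  default:  break;
  }

  if (!from || !to)
    return std::string();
  // Unsigned conversions insert "un" after the "__float" prefix.
  return std::string(isSigned ? "__float" : "__floatun") + from + to;
}
```

Under these assumptions, `convLibcallName(true, 128, 128)` yields `__floattitf` and `convLibcallName(false, 64, 128)` yields `__floatunditf`. Sources narrower than 32 bits are not covered: in the tests, i8 and i16 inputs are first extended to 32 bits (`sxtb`, `sxth`, `and`) and then use the 32-bit routine.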

llvmbot (Collaborator) commented Jul 4, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Him188 (Him188)

Changes

Legalize sitofp/uitofp conversions involving fp128/i128 types into libcalls.
Vectors with i128/fp128 element types are scalarized.


Patch is 114.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97691.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+4-5)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+24-13)
  • (modified) llvm/test/CodeGen/AArch64/itofp.ll (+1347-910)
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 86de1f3be9047..39718a634f0c9 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -1136,15 +1136,14 @@ LegalizerHelper::libcall(MachineInstr &MI, LostDebugLocObserver &LocObserver) {
   }
   case TargetOpcode::G_SITOFP:
   case TargetOpcode::G_UITOFP: {
-    // FIXME: Support other types
     unsigned FromSize = MRI.getType(MI.getOperand(1).getReg()).getSizeInBits();
-    unsigned ToSize = MRI.getType(MI.getOperand(0).getReg()).getSizeInBits();
-    if ((FromSize != 32 && FromSize != 64) || (ToSize != 32 && ToSize != 64))
+    Type *ToTy = getFloatTypeForLLT(Ctx, MRI.getType(MI.getOperand(0).getReg()));
+    if ((FromSize != 32 && FromSize != 64 && FromSize != 128) || !ToTy)
       return UnableToLegalize;
     LegalizeResult Status = conversionLibcall(
         MI, MIRBuilder,
-        ToSize == 64 ? Type::getDoubleTy(Ctx) : Type::getFloatTy(Ctx),
-        FromSize == 32 ? Type::getInt32Ty(Ctx) : Type::getInt64Ty(Ctx),
+        ToTy,
+        Type::getIntNTy(Ctx, FromSize),
         LocObserver);
     if (Status != Legalized)
       return Status;
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index c6eb4d2b3ec78..cdf0c158f13d5 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -710,7 +710,13 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
           {{s32, s128}, {s64, s128}, {s128, s128}, {s128, s32}, {s128, s64}});
 
   getActionDefinitionsBuilder({G_SITOFP, G_UITOFP})
-      .legalForCartesianProduct({s32, s64, v2s64, v4s32, v2s32})
+      .legalFor({{s32, s32},
+             {s64, s32},
+             {s32, s64},
+             {s64, s64},
+             {v2s64, v2s64},
+             {v4s32, v4s32},
+             {v2s32, v2s32}})
       .legalIf([=](const LegalityQuery &Query) {
         return HasFP16 &&
                (Query.Types[0] == s16 || Query.Types[0] == v4s16 ||
@@ -718,26 +724,31 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
                (Query.Types[1] == s32 || Query.Types[1] == s64 ||
                 Query.Types[1] == v4s16 || Query.Types[1] == v8s16);
       })
-      .widenScalarToNextPow2(1)
-      .clampScalar(1, s32, s64)
-      .widenScalarToNextPow2(0)
-      .clampScalarOrElt(0, MinFPScalar, s64)
-      .moreElementsToNextPow2(0)
+      .scalarizeIf(scalarOrEltWiderThan(1, 64), 1)
+      .scalarizeIf(scalarOrEltWiderThan(0, 64), 0)
+      .moreElementsToNextPow2(1)
+      .widenScalarOrEltToNextPow2OrMinSize(1)
+      .minScalar(1, s32)
+      .widenScalarOrEltToNextPow2OrMinSize(0, /*MinSize=*/HasFP16 ? 16 : 32)
       .widenScalarIf(
           [=](const LegalityQuery &Query) {
-            return Query.Types[0].getScalarSizeInBits() <
-                   Query.Types[1].getScalarSizeInBits();
+            return Query.Types[0].getScalarSizeInBits() <= 64 &&
+                   Query.Types[0].getScalarSizeInBits() >
+                       Query.Types[1].getScalarSizeInBits();
           },
-          LegalizeMutations::changeElementSizeTo(0, 1))
+          LegalizeMutations::changeElementSizeTo(1, 0))
       .widenScalarIf(
           [=](const LegalityQuery &Query) {
-            return Query.Types[0].getScalarSizeInBits() >
-                   Query.Types[1].getScalarSizeInBits();
+            return Query.Types[1].getScalarSizeInBits() <= 64 &&
+                   Query.Types[0].getScalarSizeInBits() <
+                       Query.Types[1].getScalarSizeInBits();
           },
-          LegalizeMutations::changeElementSizeTo(1, 0))
+          LegalizeMutations::changeElementSizeTo(0, 1))
       .clampNumElements(0, v4s16, v8s16)
       .clampNumElements(0, v2s32, v4s32)
-      .clampMaxNumElements(0, s64, 2);
+      .clampMaxNumElements(0, s64, 2)
+      .libcallFor(
+          {{s16, s128}, {s32, s128}, {s64, s128}, {s128, s128}, {s128, s32}, {s128, s64}});
 
   // Control-flow
   getActionDefinitionsBuilder(G_BRCOND)
diff --git a/llvm/test/CodeGen/AArch64/itofp.ll b/llvm/test/CodeGen/AArch64/itofp.ll
index ac26ccc44128f..04c56b796ce25 100644
--- a/llvm/test/CodeGen/AArch64/itofp.ll
+++ b/llvm/test/CodeGen/AArch64/itofp.ll
@@ -1,225 +1,228 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
 ; RUN: llc -mtriple=aarch64 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-SD,CHECK-SD-NOFP16
 ; RUN: llc -mtriple=aarch64 -mattr=+fullfp16 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-SD,CHECK-SD-FP16
-; RUN: llc -mtriple=aarch64 -global-isel -global-isel-abort=2 -verify-machineinstrs %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-NOFP16
-; RUN: llc -mtriple=aarch64 -mattr=+fullfp16 -global-isel -global-isel-abort=2 -verify-machineinstrs %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-FP16
-
-; CHECK-GI:       warning: Instruction selection used fallback path for stofp_i128_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i64_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i64_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i32_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i32_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i16_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i16_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i8_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i8_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i128_f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i128_f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i128_f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i64_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i64_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i64_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i64_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i32_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i32_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i32_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i32_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i16_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i16_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i16_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i16_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i8_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i8_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i8_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i8_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f16
+; RUN: llc -mtriple=aarch64 -global-isel -global-isel-abort=1 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-NOFP16
+; RUN: llc -mtriple=aarch64 -mattr=+fullfp16 -global-isel -global-isel-abort=1 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-FP16
 
 define fp128 @stofp_i128_f128(i128 %a) {
-; CHECK-LABEL: stofp_i128_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floattitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i128_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floattitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i128_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floattitf
 entry:
   %c = sitofp i128 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i128_f128(i128 %a) {
-; CHECK-LABEL: utofp_i128_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatuntitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i128_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatuntitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i128_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatuntitf
 entry:
   %c = uitofp i128 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i64_f128(i64 %a) {
-; CHECK-LABEL: stofp_i64_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatditf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i64_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatditf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i64_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatditf
 entry:
   %c = sitofp i64 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i64_f128(i64 %a) {
-; CHECK-LABEL: utofp_i64_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatunditf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i64_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatunditf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i64_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatunditf
 entry:
   %c = uitofp i64 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i32_f128(i32 %a) {
-; CHECK-LABEL: stofp_i32_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i32_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i32_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatsitf
 entry:
   %c = sitofp i32 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i32_f128(i32 %a) {
-; CHECK-LABEL: utofp_i32_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatunsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i32_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatunsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i32_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatunsitf
 entry:
   %c = uitofp i32 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i16_f128(i16 %a) {
-; CHECK-LABEL: stofp_i16_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    sxth w0, w0
-; CHECK-NEXT:    bl __floatsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i16_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    sxth w0, w0
+; CHECK-SD-NEXT:    bl __floatsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i16_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    sxth w0, w0
+; CHECK-GI-NEXT:    b __floatsitf
 entry:
   %c = sitofp i16 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i16_f128(i16 %a) {
-; CHECK-LABEL: utofp_i16_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    and w0, w0, #0xffff
-; CHECK-NEXT:    bl __floatunsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i16_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    and w0, w0, #0xffff
+; CHECK-SD-NEXT:    bl __floatunsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i16_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    and w0, w0, #0xffff
+; CHECK-GI-NEXT:    b __floatunsitf
 entry:
   %c = uitofp i16 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i8_f128(i8 %a) {
-; CHECK-LABEL: stofp_i8_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    sxtb w0, w0
-; CHECK-NEXT:    bl __floatsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i8_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    sxtb w0, w0
+; CHECK-SD-NEXT:    bl __floatsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i8_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    sxtb w0, w0
+; CHECK-GI-NEXT:    b __floatsitf
 entry:
   %c = sitofp i8 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i8_f128(i8 %a) {
-; CHECK-LABEL: utofp_i8_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    and w0, w0, #0xff
-; CHECK-NEXT:    bl __floatunsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i8_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    and w0, w0, #0xff
+; CHECK-SD-NEXT:    bl __floatunsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i8_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    and w0, w0, #0xff
+; CHECK-GI-NEXT:    b __floatunsitf
 entry:
   %c = uitofp i8 %a to fp128
   ret fp128 %c
 }
 
 define double @stofp_i128_f64(i128 %a) {
-; CHECK-LABEL: stofp_i128_f64:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floattidf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i128_f64:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floattidf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i128_f64:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floattidf
 entry:
   %c = sitofp i128 %a to double
   ret double %c
 }
 
 define double @utofp_i128_f64(i128 %a) {
-; CHECK-LABEL: utofp_i128_f64:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatuntidf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i128_f64:
+; CHECK-SD:     ...
[truncated]
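
As a reading aid for the AArch64LegalizerInfo.cpp hunk above, here is a minimal sketch of what the two rewritten `widenScalarIf` rules and the new `libcallFor` list do to element sizes. It is a simplified model, not the legalizer itself: it ignores the `legalFor`/`legalIf` checks, scalarization, the minimum-size rules, and the element-count clamps. `TypePair`, `equalizeElementWidths`, and `isLibcallPair` are hypothetical names introduced for this example.

```cpp
#include <utility>

// Element sizes in bits: first = floating-point result (type index 0),
// second = integer source (type index 1).
using TypePair = std::pair<unsigned, unsigned>;

// Hypothetical model of the two widenScalarIf rules in the patch.
// Sides wider than 64 bits are left alone so they can reach the libcall rule.
TypePair equalizeElementWidths(TypePair T) {
  unsigned Dst = T.first, Src = T.second;
  if (Dst <= 64 && Dst > Src)
    Src = Dst; // changeElementSizeTo(1, 0): widen the source to the result size
  else if (Src <= 64 && Dst < Src)
    Dst = Src; // changeElementSizeTo(0, 1): widen the result to the source size
  return TypePair(Dst, Src);
}

// The scalar {result, source} pairs listed in the patch's libcallFor(...) call.
bool isLibcallPair(TypePair T) {
  static const TypePair Pairs[] = {{16, 128}, {32, 128}, {64, 128},
                                   {128, 128}, {128, 32}, {128, 64}};
  for (const TypePair &P : Pairs)
    if (P == T)
      return true;
  return false;
}
```

In this model a v2i32 to v2f64 conversion (element sizes 64 and 32) has its source elements widened to 64 bits, which gives the `{v2s64, v2s64}` entry in `legalFor`. A pair such as `{128, 32}` is left unchanged by the widening rules and matches the libcall list instead.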

llvmbot (Collaborator) commented Jul 4, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Him188 (Him188)

Changes

Legalize sitofp/uitofp conversions involving fp128/i128 types into libcalls.
Vectors with i128/fp128 element types are scalarized.


Patch is 114.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97691.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+4-5)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+24-13)
  • (modified) llvm/test/CodeGen/AArch64/itofp.ll (+1347-910)
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 86de1f3be9047..39718a634f0c9 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -1136,15 +1136,14 @@ LegalizerHelper::libcall(MachineInstr &MI, LostDebugLocObserver &LocObserver) {
   }
   case TargetOpcode::G_SITOFP:
   case TargetOpcode::G_UITOFP: {
-    // FIXME: Support other types
     unsigned FromSize = MRI.getType(MI.getOperand(1).getReg()).getSizeInBits();
-    unsigned ToSize = MRI.getType(MI.getOperand(0).getReg()).getSizeInBits();
-    if ((FromSize != 32 && FromSize != 64) || (ToSize != 32 && ToSize != 64))
+    Type *ToTy = getFloatTypeForLLT(Ctx, MRI.getType(MI.getOperand(0).getReg()));
+    if ((FromSize != 32 && FromSize != 64 && FromSize != 128) || !ToTy)
       return UnableToLegalize;
     LegalizeResult Status = conversionLibcall(
         MI, MIRBuilder,
-        ToSize == 64 ? Type::getDoubleTy(Ctx) : Type::getFloatTy(Ctx),
-        FromSize == 32 ? Type::getInt32Ty(Ctx) : Type::getInt64Ty(Ctx),
+        ToTy,
+        Type::getIntNTy(Ctx, FromSize),
         LocObserver);
     if (Status != Legalized)
       return Status;
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index c6eb4d2b3ec78..cdf0c158f13d5 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -710,7 +710,13 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
           {{s32, s128}, {s64, s128}, {s128, s128}, {s128, s32}, {s128, s64}});
 
   getActionDefinitionsBuilder({G_SITOFP, G_UITOFP})
-      .legalForCartesianProduct({s32, s64, v2s64, v4s32, v2s32})
+      .legalFor({{s32, s32},
+             {s64, s32},
+             {s32, s64},
+             {s64, s64},
+             {v2s64, v2s64},
+             {v4s32, v4s32},
+             {v2s32, v2s32}})
       .legalIf([=](const LegalityQuery &Query) {
         return HasFP16 &&
                (Query.Types[0] == s16 || Query.Types[0] == v4s16 ||
@@ -718,26 +724,31 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
                (Query.Types[1] == s32 || Query.Types[1] == s64 ||
                 Query.Types[1] == v4s16 || Query.Types[1] == v8s16);
       })
-      .widenScalarToNextPow2(1)
-      .clampScalar(1, s32, s64)
-      .widenScalarToNextPow2(0)
-      .clampScalarOrElt(0, MinFPScalar, s64)
-      .moreElementsToNextPow2(0)
+      .scalarizeIf(scalarOrEltWiderThan(1, 64), 1)
+      .scalarizeIf(scalarOrEltWiderThan(0, 64), 0)
+      .moreElementsToNextPow2(1)
+      .widenScalarOrEltToNextPow2OrMinSize(1)
+      .minScalar(1, s32)
+      .widenScalarOrEltToNextPow2OrMinSize(0, /*MinSize=*/HasFP16 ? 16 : 32)
       .widenScalarIf(
           [=](const LegalityQuery &Query) {
-            return Query.Types[0].getScalarSizeInBits() <
-                   Query.Types[1].getScalarSizeInBits();
+            return Query.Types[0].getScalarSizeInBits() <= 64 &&
+                   Query.Types[0].getScalarSizeInBits() >
+                       Query.Types[1].getScalarSizeInBits();
           },
-          LegalizeMutations::changeElementSizeTo(0, 1))
+          LegalizeMutations::changeElementSizeTo(1, 0))
       .widenScalarIf(
           [=](const LegalityQuery &Query) {
-            return Query.Types[0].getScalarSizeInBits() >
-                   Query.Types[1].getScalarSizeInBits();
+            return Query.Types[1].getScalarSizeInBits() <= 64 &&
+                   Query.Types[0].getScalarSizeInBits() <
+                       Query.Types[1].getScalarSizeInBits();
           },
-          LegalizeMutations::changeElementSizeTo(1, 0))
+          LegalizeMutations::changeElementSizeTo(0, 1))
       .clampNumElements(0, v4s16, v8s16)
       .clampNumElements(0, v2s32, v4s32)
-      .clampMaxNumElements(0, s64, 2);
+      .clampMaxNumElements(0, s64, 2)
+      .libcallFor(
+          {{s16, s128}, {s32, s128}, {s64, s128}, {s128, s128}, {s128, s32}, {s128, s64}});
 
   // Control-flow
   getActionDefinitionsBuilder(G_BRCOND)
diff --git a/llvm/test/CodeGen/AArch64/itofp.ll b/llvm/test/CodeGen/AArch64/itofp.ll
index ac26ccc44128f..04c56b796ce25 100644
--- a/llvm/test/CodeGen/AArch64/itofp.ll
+++ b/llvm/test/CodeGen/AArch64/itofp.ll
@@ -1,225 +1,228 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
 ; RUN: llc -mtriple=aarch64 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-SD,CHECK-SD-NOFP16
 ; RUN: llc -mtriple=aarch64 -mattr=+fullfp16 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-SD,CHECK-SD-FP16
-; RUN: llc -mtriple=aarch64 -global-isel -global-isel-abort=2 -verify-machineinstrs %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-NOFP16
-; RUN: llc -mtriple=aarch64 -mattr=+fullfp16 -global-isel -global-isel-abort=2 -verify-machineinstrs %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-FP16
-
-; CHECK-GI:       warning: Instruction selection used fallback path for stofp_i128_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i64_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i64_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i32_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i32_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i16_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i16_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i8_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i8_f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i128_f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i128_f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_i128_f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_i128_f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i64_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i64_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i64_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i64_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f64
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i32_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i32_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i32_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i32_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i16_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i16_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i16_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i16_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i8_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i8_v2f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i8_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i8_v3f128
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f32
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v2i128_v2f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v2i128_v2f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for stofp_v3i128_v3f16
-; CHECK-GI-NEXT:  warning: Instruction selection used fallback path for utofp_v3i128_v3f16
+; RUN: llc -mtriple=aarch64 -global-isel -global-isel-abort=1 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-NOFP16
+; RUN: llc -mtriple=aarch64 -mattr=+fullfp16 -global-isel -global-isel-abort=1 -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-GI,CHECK-GI-FP16
 
 define fp128 @stofp_i128_f128(i128 %a) {
-; CHECK-LABEL: stofp_i128_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floattitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i128_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floattitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i128_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floattitf
 entry:
   %c = sitofp i128 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i128_f128(i128 %a) {
-; CHECK-LABEL: utofp_i128_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatuntitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i128_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatuntitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i128_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatuntitf
 entry:
   %c = uitofp i128 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i64_f128(i64 %a) {
-; CHECK-LABEL: stofp_i64_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatditf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i64_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatditf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i64_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatditf
 entry:
   %c = sitofp i64 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i64_f128(i64 %a) {
-; CHECK-LABEL: utofp_i64_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatunditf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i64_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatunditf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i64_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatunditf
 entry:
   %c = uitofp i64 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i32_f128(i32 %a) {
-; CHECK-LABEL: stofp_i32_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i32_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i32_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatsitf
 entry:
   %c = sitofp i32 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i32_f128(i32 %a) {
-; CHECK-LABEL: utofp_i32_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatunsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i32_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floatunsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i32_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floatunsitf
 entry:
   %c = uitofp i32 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i16_f128(i16 %a) {
-; CHECK-LABEL: stofp_i16_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    sxth w0, w0
-; CHECK-NEXT:    bl __floatsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i16_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    sxth w0, w0
+; CHECK-SD-NEXT:    bl __floatsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i16_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    sxth w0, w0
+; CHECK-GI-NEXT:    b __floatsitf
 entry:
   %c = sitofp i16 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i16_f128(i16 %a) {
-; CHECK-LABEL: utofp_i16_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    and w0, w0, #0xffff
-; CHECK-NEXT:    bl __floatunsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i16_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    and w0, w0, #0xffff
+; CHECK-SD-NEXT:    bl __floatunsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i16_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    and w0, w0, #0xffff
+; CHECK-GI-NEXT:    b __floatunsitf
 entry:
   %c = uitofp i16 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @stofp_i8_f128(i8 %a) {
-; CHECK-LABEL: stofp_i8_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    sxtb w0, w0
-; CHECK-NEXT:    bl __floatsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i8_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    sxtb w0, w0
+; CHECK-SD-NEXT:    bl __floatsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i8_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    sxtb w0, w0
+; CHECK-GI-NEXT:    b __floatsitf
 entry:
   %c = sitofp i8 %a to fp128
   ret fp128 %c
 }
 
 define fp128 @utofp_i8_f128(i8 %a) {
-; CHECK-LABEL: utofp_i8_f128:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    and w0, w0, #0xff
-; CHECK-NEXT:    bl __floatunsitf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i8_f128:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    and w0, w0, #0xff
+; CHECK-SD-NEXT:    bl __floatunsitf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: utofp_i8_f128:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    and w0, w0, #0xff
+; CHECK-GI-NEXT:    b __floatunsitf
 entry:
   %c = uitofp i8 %a to fp128
   ret fp128 %c
 }
 
 define double @stofp_i128_f64(i128 %a) {
-; CHECK-LABEL: stofp_i128_f64:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floattidf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: stofp_i128_f64:
+; CHECK-SD:       // %bb.0: // %entry
+; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-SD-NEXT:    .cfi_offset w30, -16
+; CHECK-SD-NEXT:    bl __floattidf
+; CHECK-SD-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-SD-NEXT:    ret
+;
+; CHECK-GI-LABEL: stofp_i128_f64:
+; CHECK-GI:       // %bb.0: // %entry
+; CHECK-GI-NEXT:    b __floattidf
 entry:
   %c = sitofp i128 %a to double
   ret double %c
 }
 
 define double @utofp_i128_f64(i128 %a) {
-; CHECK-LABEL: utofp_i128_f64:
-; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    .cfi_offset w30, -16
-; CHECK-NEXT:    bl __floatuntidf
-; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
-; CHECK-NEXT:    ret
+; CHECK-SD-LABEL: utofp_i128_f64:
+; CHECK-SD:     ...
[truncated]

github-actions bot commented Jul 4, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Contributor

@dc03-work dc03-work left a comment

Seems like codegen matches SDAG now (and is better for the tail-call cases). Nice.
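For the narrow sources in the test diff above (i8 and i16), both selectors first extend the argument (`sxth`/`sxtb` for signed, `and` with a mask for unsigned) and then reach the 32-bit routine (`__floatsitf` or `__floatunsitf`). A minimal sketch of why the extension matters, assuming the upper bits of the incoming register are not guaranteed to hold a valid extension. The function names here are hypothetical and only model the 16-bit case:

```cpp
#include <cassert>
#include <cstdint>

// Models `sxth w0, w0`: keep the low 16 bits and sign-extend them to 32 bits.
// Written with explicit arithmetic so it does not rely on narrowing casts.
std::int32_t widenSigned16(std::uint32_t reg) {
  std::uint32_t low = reg & 0xffffu;
  return static_cast<std::int32_t>(low ^ 0x8000u) - 0x8000;
}

// Models `and w0, w0, #0xffff`: keep the low 16 bits and zero-extend them.
std::uint32_t widenUnsigned16(std::uint32_t reg) {
  return reg & 0xffffu;
}

// Stand-ins for the 32-bit conversion routines; `double` is used here only
// because fp128 is not portably available in standard C++.
double convertSigned(std::uint32_t reg) {
  return static_cast<double>(widenSigned16(reg));
}

double convertUnsigned(std::uint32_t reg) {
  return static_cast<double>(widenUnsigned16(reg));
}
```

With a register value of `0xABCDFFFF`, the same low 16 bits convert to -1 in the signed case and to 65535 in the unsigned case, and the stale upper bits never reach the conversion.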

%c = uitofp <3 x i128> %a to <3 x double>
ret <3 x double> %c
}

Contributor

Why were these two tests deleted?

Member Author

Hmm... That was a mistake when resolving rebase conflicts... Will add them back.

},
LegalizeMutations::changeElementSizeTo(1, 0))
LegalizeMutations::changeElementSizeTo(0, 1))
Contributor

I don't think it matters in practice (or at all really), but why were the changeElementSizeTo calls swapped?

-    if ((FromSize != 32 && FromSize != 64) || (ToSize != 32 && ToSize != 64))
+    Type *ToTy =
+        getFloatTypeForLLT(Ctx, MRI.getType(MI.getOperand(0).getReg()));
+    if ((FromSize != 32 && FromSize != 64 && FromSize != 128) || !ToTy)
Member

Some targets might want to extend from 16 bit floats. It seems to be forbidden.

Member Author

This check is similar to the one for G_FPTOUI above. I think it's better to forbid 16-bit types, as they are not tested. If other targets want 16-bit floats, they can easily relax this check.
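To make the size check concrete, here is a minimal sketch of how the integer and float widths map onto the routine names that appear in the test diff (`__floattitf`, `__floatunditf`, `__floattidf`, and so on). It is a simplified model, not LLVM's code: `itofpLibcallName` is a hypothetical helper, and restricting the result width to 32, 64, or 128 bits is an assumption of this sketch rather than a statement about what `getFloatTypeForLLT` accepts.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper (not an LLVM API): builds the runtime routine name for
// an integer-to-float conversion, or returns nullopt for unsupported widths.
std::optional<std::string> itofpLibcallName(bool isSigned, unsigned fromBits,
                                            unsigned toBits) {
  const char *intPart = nullptr;
  switch (fromBits) {
  case 32:  intPart = "si"; break;
  case 64:  intPart = "di"; break;
  case 128: intPart = "ti"; break;
  default:  return std::nullopt; // narrower sources are widened before the call
  }

  const char *fpPart = nullptr;
  switch (toBits) {
  case 32:  fpPart = "sf"; break;
  case 64:  fpPart = "df"; break;
  case 128: fpPart = "tf"; break;
  default:  return std::nullopt; // other result widths are outside this sketch
  }

  // Unsigned conversions insert "un" between the prefix and the type suffixes.
  return std::string("__float") + (isSigned ? "" : "un") + intPart + fpPart;
}
```

For example, `itofpLibcallName(false, 64, 128)` yields `__floatunditf`, which is the routine the `utofp_i64_f128` test branches to, while a 16-bit source yields no name and has to be extended first.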

Member Author

Him188 commented Jul 11, 2024

Ping @davemgreen and @aemerson ?

@Him188 Him188 requested a review from chuongg3 July 11, 2024 10:25
Collaborator

@davemgreen davemgreen left a comment

This looks OK, and looks very similar to the fp->int conversions. I'm not sure if there is a way to improve the i128->f16 convert without needing to introduce other libcalls.

LGTM

@Him188 Him188 merged commit 365f5b4 into llvm:main Jul 15, 2024
7 checks passed