-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reapply " [XRay] Add support for instrumentation of DSOs on x86_64 (#90959)" #112930
Conversation
@llvm/pr-subscribers-xray @llvm/pr-subscribers-clang-driver Author: Sebastian Kreutzer (sebastiankreutzer) ChangesThis fixes remaining issues in my previous PR #90959. Changes:
I have tested these changes on Original description: This PR introduces shared library (DSO) support for XRay based on a revised version of the implementation outlined in this RFC.
These changes are fully backward compatible, i.e. running without instrumented DSOs will produce identical traces (in terms of recorded function IDs) to the previous implementation. Due to my limited ability to test on other architectures, this feature is only implemented and tested with x86_64. Extending support to other architectures is fairly straightforward, requiring only a position-independent implementation of the architecture-specific trampoline implementation (see This patch does not include any functionality to resolve function IDs from DSOs for the provided logging/tracing modes. These modes still work and will record calls from DSOs, but symbol resolution for these functions in not available. Getting this to work properly requires recording information about the loaded DSOs and should IMO be discussed in a separate RFC, as there are mulitple feasible approaches. Patch is 87.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112930.diff 29 Files Affected:
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index eac831278ee20d..e45370bde74a5d 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -136,6 +136,8 @@ CODEGENOPT(XRayIgnoreLoops , 1, 0)
///< Emit the XRay function index section.
CODEGENOPT(XRayFunctionIndex , 1, 1)
+///< Set when -fxray-shared is enabled
+CODEGENOPT(XRayShared , 1, 0)
///< Set the minimum number of instructions in a function to determine selective
///< XRay instrumentation.
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152c43d7908ff8..6748f7045566aa 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2948,6 +2948,11 @@ def fxray_selected_function_group :
HelpText<"When using -fxray-function-groups, select which group of functions to instrument. Valid range is 0 to fxray-function-groups - 1">,
MarshallingInfoInt<CodeGenOpts<"XRaySelectedFunctionGroup">, "0">;
+defm xray_shared : BoolFOption<"xray-shared",
+ CodeGenOpts<"XRayShared">, DefaultFalse,
+ PosFlag<SetTrue, [], [ClangOption, CC1Option],
+ "Enable shared library instrumentation with XRay">,
+ NegFlag<SetFalse>>;
defm fine_grained_bitfield_accesses : BoolOption<"f", "fine-grained-bitfield-accesses",
CodeGenOpts<"FineGrainedBitfieldAccesses">, DefaultFalse,
diff --git a/clang/include/clang/Driver/XRayArgs.h b/clang/include/clang/Driver/XRayArgs.h
index bdd3d979547eed..1b5c4a4c42f12a 100644
--- a/clang/include/clang/Driver/XRayArgs.h
+++ b/clang/include/clang/Driver/XRayArgs.h
@@ -27,6 +27,7 @@ class XRayArgs {
XRayInstrSet InstrumentationBundle;
llvm::opt::Arg *XRayInstrument = nullptr;
bool XRayRT = true;
+ bool XRayShared = false;
public:
/// Parses the XRay arguments from an argument list.
@@ -35,6 +36,7 @@ class XRayArgs {
llvm::opt::ArgStringList &CmdArgs, types::ID InputType) const;
bool needsXRayRt() const { return XRayInstrument && XRayRT; }
+ bool needsXRayDSORt() const { return XRayInstrument && XRayRT && XRayShared; }
llvm::ArrayRef<std::string> modeList() const { return Modes; }
XRayInstrSet instrumentationBundle() const { return InstrumentationBundle; }
};
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 91605a67a37fc0..1c3c8c816594e5 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1623,10 +1623,14 @@ bool tools::addSanitizerRuntimes(const ToolChain &TC, const ArgList &Args,
}
bool tools::addXRayRuntime(const ToolChain&TC, const ArgList &Args, ArgStringList &CmdArgs) {
- if (Args.hasArg(options::OPT_shared))
- return false;
-
- if (TC.getXRayArgs().needsXRayRt()) {
+ if (Args.hasArg(options::OPT_shared)) {
+ if (TC.getXRayArgs().needsXRayDSORt()) {
+ CmdArgs.push_back("--whole-archive");
+ CmdArgs.push_back(TC.getCompilerRTArgString(Args, "xray-dso"));
+ CmdArgs.push_back("--no-whole-archive");
+ return true;
+ }
+ } else if (TC.getXRayArgs().needsXRayRt()) {
CmdArgs.push_back("--whole-archive");
CmdArgs.push_back(TC.getCompilerRTArgString(Args, "xray"));
for (const auto &Mode : TC.getXRayArgs().modeList())
diff --git a/clang/lib/Driver/XRayArgs.cpp b/clang/lib/Driver/XRayArgs.cpp
index 8c5134e2501358..d0bb5d4887c184 100644
--- a/clang/lib/Driver/XRayArgs.cpp
+++ b/clang/lib/Driver/XRayArgs.cpp
@@ -63,6 +63,23 @@ XRayArgs::XRayArgs(const ToolChain &TC, const ArgList &Args) {
<< XRayInstrument->getSpelling() << Triple.str();
}
+ if (Args.hasFlag(options::OPT_fxray_shared, options::OPT_fno_xray_shared,
+ false)) {
+ XRayShared = true;
+
+ // DSO instrumentation is currently limited to x86_64
+ if (Triple.getArch() != llvm::Triple::x86_64) {
+ D.Diag(diag::err_drv_unsupported_opt_for_target)
+ << "-fxray-shared" << Triple.str();
+ }
+
+ unsigned PICLvl = std::get<1>(tools::ParsePICArgs(TC, Args));
+ if (!PICLvl) {
+ D.Diag(diag::err_opt_not_valid_without_opt) << "-fxray-shared"
+ << "-fPIC";
+ }
+ }
+
// Both XRay and -fpatchable-function-entry use
// TargetOpcode::PATCHABLE_FUNCTION_ENTER.
if (Arg *A = Args.getLastArg(options::OPT_fpatchable_function_entry_EQ))
@@ -177,6 +194,10 @@ void XRayArgs::addArgs(const ToolChain &TC, const ArgList &Args,
Args.addOptOutFlag(CmdArgs, options::OPT_fxray_function_index,
options::OPT_fno_xray_function_index);
+ if (XRayShared)
+ Args.addOptInFlag(CmdArgs, options::OPT_fxray_shared,
+ options::OPT_fno_xray_shared);
+
if (const Arg *A =
Args.getLastArg(options::OPT_fxray_instruction_threshold_EQ)) {
int Value;
diff --git a/clang/test/Driver/XRay/xray-shared.cpp b/clang/test/Driver/XRay/xray-shared.cpp
new file mode 100644
index 00000000000000..215854e1fc7cef
--- /dev/null
+++ b/clang/test/Driver/XRay/xray-shared.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fPIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fpic -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+// RUN: not %clang -### --target=x86_64-unknown-linux-gnu -fno-PIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-PIC
+// RUN: not %clang -### --target=x86_64-unknown-linux-gnu -fno-pic -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-PIC
+
+// On 64 bit darwin, PIC is always enabled
+// RUN: %clang -### --target=x86_64-apple-darwin -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s
+
+// Check unsupported targets
+// RUN: not %clang -### --target=aarch64-pc-freebsd -fPIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-TARGET
+// RUN: not %clang -### --target=arm64-apple-macos -fPIC -fxray-instrument -fxray-shared -c %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=ERR-TARGET
+
+// CHECK: "-cc1" {{.*}}"-fxray-instrument" {{.*}}"-fxray-shared"
+// ERR-TARGET: error: unsupported option '-fxray-shared' for target
+// ERR-PIC: error: option '-fxray-shared' cannot be specified without '-fPIC'
+
diff --git a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
index d00d39518104bf..fb4dfa7bd09dfe 100644
--- a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+++ b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
@@ -104,6 +104,7 @@ else()
set(ALL_XRAY_SUPPORTED_ARCH ${X86_64} ${ARM32} ${ARM64} ${MIPS32} ${MIPS64}
powerpc64le ${HEXAGON} ${LOONGARCH64})
endif()
+set(ALL_XRAY_DSO_SUPPORTED_ARCH ${X86_64})
set(ALL_SHADOWCALLSTACK_SUPPORTED_ARCH ${ARM64})
if (UNIX)
diff --git a/compiler-rt/cmake/config-ix.cmake b/compiler-rt/cmake/config-ix.cmake
index a494e0532a50bc..431f544e8ad6a7 100644
--- a/compiler-rt/cmake/config-ix.cmake
+++ b/compiler-rt/cmake/config-ix.cmake
@@ -668,6 +668,9 @@ if(APPLE)
list_intersect(XRAY_SUPPORTED_ARCH
ALL_XRAY_SUPPORTED_ARCH
SANITIZER_COMMON_SUPPORTED_ARCH)
+ list_intersect(XRAY_DSO_SUPPORTED_ARCH
+ ALL_XRAY_DSO_SUPPORTED_ARCH
+ SANITIZER_COMMON_SUPPORTED_ARCH)
list_intersect(SHADOWCALLSTACK_SUPPORTED_ARCH
ALL_SHADOWCALLSTACK_SUPPORTED_ARCH
SANITIZER_COMMON_SUPPORTED_ARCH)
@@ -702,6 +705,7 @@ else()
filter_available_targets(CFI_SUPPORTED_ARCH ${ALL_CFI_SUPPORTED_ARCH})
filter_available_targets(SCUDO_STANDALONE_SUPPORTED_ARCH ${ALL_SCUDO_STANDALONE_SUPPORTED_ARCH})
filter_available_targets(XRAY_SUPPORTED_ARCH ${ALL_XRAY_SUPPORTED_ARCH})
+ filter_available_targets(XRAY_DSO_SUPPORTED_ARCH ${ALL_XRAY_DSO_SUPPORTED_ARCH})
filter_available_targets(SHADOWCALLSTACK_SUPPORTED_ARCH
${ALL_SHADOWCALLSTACK_SUPPORTED_ARCH})
filter_available_targets(GWP_ASAN_SUPPORTED_ARCH ${ALL_GWP_ASAN_SUPPORTED_ARCH})
diff --git a/compiler-rt/include/xray/xray_interface.h b/compiler-rt/include/xray/xray_interface.h
index 727431c04e4f73..675ea0cbc48c83 100644
--- a/compiler-rt/include/xray/xray_interface.h
+++ b/compiler-rt/include/xray/xray_interface.h
@@ -93,31 +93,78 @@ enum XRayPatchingStatus {
FAILED = 3,
};
-/// This tells XRay to patch the instrumentation points. See XRayPatchingStatus
-/// for possible result values.
+/// This tells XRay to patch the instrumentation points in all currently loaded
+/// objects. See XRayPatchingStatus for possible result values.
extern XRayPatchingStatus __xray_patch();
+/// This tells XRay to patch the instrumentation points in the given object.
+/// See XRayPatchingStatus for possible result values.
+extern XRayPatchingStatus __xray_patch_object(int32_t ObjId);
+
/// Reverses the effect of __xray_patch(). See XRayPatchingStatus for possible
/// result values.
extern XRayPatchingStatus __xray_unpatch();
-/// This patches a specific function id. See XRayPatchingStatus for possible
+/// Reverses the effect of __xray_patch_object. See XRayPatchingStatus for
+/// possible result values.
+extern XRayPatchingStatus __xray_unpatch_object(int32_t ObjId);
+
+/// This unpacks the given (packed) function id and patches
+/// the corresponding function. See XRayPatchingStatus for possible
/// result values.
extern XRayPatchingStatus __xray_patch_function(int32_t FuncId);
-/// This unpatches a specific function id. See XRayPatchingStatus for possible
+/// This patches a specific function in the given object. See XRayPatchingStatus
+/// for possible result values.
+extern XRayPatchingStatus __xray_patch_function_in_object(int32_t FuncId,
+ int32_t ObjId);
+
+/// This unpacks the given (packed) function id and unpatches
+/// the corresponding function. See XRayPatchingStatus for possible
/// result values.
extern XRayPatchingStatus __xray_unpatch_function(int32_t FuncId);
-/// This function returns the address of the function provided a valid function
-/// id. We return 0 if we encounter any error, even if 0 may be a valid function
-/// address.
+/// This unpatches a specific function in the given object.
+/// See XRayPatchingStatus for possible result values.
+extern XRayPatchingStatus __xray_unpatch_function_in_object(int32_t FuncId,
+ int32_t ObjId);
+
+/// This function unpacks the given (packed) function id and returns the address
+/// of the corresponding function. We return 0 if we encounter any error, even
+/// if 0 may be a valid function address.
extern uintptr_t __xray_function_address(int32_t FuncId);
-/// This function returns the maximum valid function id. Returns 0 if we
-/// encounter errors (when there are no instrumented functions, etc.).
+/// This function returns the address of the function in the given object
+/// provided valid function and object ids. We return 0 if we encounter any
+/// error, even if 0 may be a valid function address.
+extern uintptr_t __xray_function_address_in_object(int32_t FuncId,
+ int32_t ObjId);
+
+/// This function returns the maximum valid function id for the main executable
+/// (object id = 0). Returns 0 if we encounter errors (when there are no
+/// instrumented functions, etc.).
extern size_t __xray_max_function_id();
+/// This function returns the maximum valid function id for the given object.
+/// Returns 0 if we encounter errors (when there are no instrumented functions,
+/// etc.).
+extern size_t __xray_max_function_id_in_object(int32_t ObjId);
+
+/// This function returns the number of previously registered objects
+/// (executable + loaded DSOs). Returns 0 if XRay has not been initialized.
+extern size_t __xray_num_objects();
+
+/// Unpacks the function id from the given packed id.
+extern int32_t __xray_unpack_function_id(int32_t PackedId);
+
+/// Unpacks the object id from the given packed id.
+extern int32_t __xray_unpack_object_id(int32_t PackedId);
+
+/// Creates and returns a packed id from the given function and object ids.
+/// If the ids do not fit within the reserved number of bits for each part, the
+/// high bits are truncated.
+extern int32_t __xray_pack_id(int32_t FuncId, int32_t ObjId);
+
/// Initialize the required XRay data structures. This is useful in cases where
/// users want to control precisely when the XRay instrumentation data
/// structures are initialized, for example when the XRay library is built with
diff --git a/compiler-rt/lib/xray/CMakeLists.txt b/compiler-rt/lib/xray/CMakeLists.txt
index cf7b5062aae32d..f38c07420c9abf 100644
--- a/compiler-rt/lib/xray/CMakeLists.txt
+++ b/compiler-rt/lib/xray/CMakeLists.txt
@@ -10,6 +10,10 @@ set(XRAY_SOURCES
xray_utils.cpp
)
+set(XRAY_DSO_SOURCES
+ xray_dso_init.cpp
+ )
+
# Implementation files for all XRay modes.
set(XRAY_FDR_MODE_SOURCES
xray_fdr_flags.cpp
@@ -33,6 +37,11 @@ set(x86_64_SOURCES
xray_trampoline_x86_64.S
)
+set(x86_64_DSO_SOURCES
+ xray_trampoline_x86_64.S
+ )
+
+
set(arm_SOURCES
xray_arm.cpp
xray_trampoline_arm.S
@@ -128,10 +137,12 @@ set(XRAY_IMPL_HEADERS
# consumption by tests.
set(XRAY_ALL_SOURCE_FILES
${XRAY_SOURCES}
+ ${XRAY_DSO_SOURCES}
${XRAY_FDR_MODE_SOURCES}
${XRAY_BASIC_MODE_SOURCES}
${XRAY_PROFILING_MODE_SOURCES}
${x86_64_SOURCES}
+ ${x86_64_DSO_SOURCES}
${arm_SOURCES}
${armhf_SOURCES}
${hexagon_SOURCES}
@@ -162,6 +173,9 @@ set(XRAY_CFLAGS
${COMPILER_RT_CXX_CFLAGS})
set(XRAY_COMMON_DEFINITIONS SANITIZER_COMMON_NO_REDEFINE_BUILTINS XRAY_HAS_EXCEPTIONS=1)
+# DSO trampolines need to be compiled with GOT addressing
+set(XRAY_COMMON_DEFINITIONS_DSO ${XRAY_COMMON_DEFINITIONS} XRAY_PIC)
+
# Too many existing bugs, needs cleanup.
append_list_if(COMPILER_RT_HAS_WNO_FORMAT -Wno-format XRAY_CFLAGS)
@@ -201,7 +215,16 @@ if (APPLE)
CFLAGS ${XRAY_CFLAGS}
DEFS ${XRAY_COMMON_DEFINITIONS}
DEPS ${XRAY_DEPS})
+ add_compiler_rt_object_libraries(RTXrayDSO
+ OS ${XRAY_SUPPORTED_OS}
+ ARCHS ${XRAY_DSO_SUPPORTED_ARCH}
+ SOURCES ${XRAY_DSO_SOURCES}
+ ADDITIONAL_HEADERS ${XRAY_IMPL_HEADERS}
+ CFLAGS ${XRAY_CFLAGS}
+ DEFS ${XRAY_COMMON_DEFINITIONS_DSO}
+ DEPS ${XRAY_DEPS})
set(XRAY_RTXRAY_ARCH_LIBS "")
+ set(XRAY_DSO_RTXRAY_ARCH_LIBS "")
foreach(arch ${XRAY_SUPPORTED_ARCH})
if(NOT ${arch} IN_LIST XRAY_SOURCE_ARCHS)
continue()
@@ -215,6 +238,17 @@ if (APPLE)
DEFS ${XRAY_COMMON_DEFINITIONS}
DEPS ${XRAY_DEPS})
list(APPEND XRAY_RTXRAY_ARCH_LIBS RTXray_${arch})
+ if (${arch} IN_LIST XRAY_DSO_SUPPORTED_ARCH)
+ add_compiler_rt_object_libraries(RTXrayDSO_${arch}
+ OS ${XRAY_SUPPORTED_OS}
+ ARCHS ${XRAY_DSO_SUPPORTED_ARCH}
+ SOURCES ${${arch}_DSO_SOURCES}
+ ADDITIONAL_HEADERS ${XRAY_IMPL_HEADERS}
+ CFLAGS ${XRAY_CFLAGS}
+ DEFS ${XRAY_COMMON_DEFINITIONS_DSO}
+ DEPS ${XRAY_DEPS})
+ list(APPEND XRAY_DSO_RTXRAY_ARCH_LIBS RTXrayDSO_${arch})
+ endif()
endforeach()
add_compiler_rt_object_libraries(RTXrayFDR
OS ${XRAY_SUPPORTED_OS}
@@ -252,6 +286,17 @@ if (APPLE)
LINK_FLAGS ${XRAY_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}
LINK_LIBS ${XRAY_LINK_LIBS}
PARENT_TARGET xray)
+ add_compiler_rt_runtime(clang_rt.xray-dso
+ STATIC
+ OS ${XRAY_SUPPORTED_OS}
+ ARCHS ${XRAY_DSO_SUPPORTED_ARCH}
+ OBJECT_LIBS RTXrayDSO ${XRAY_DSO_RTXRAY_ARCH_LIBS}
+ CFLAGS ${XRAY_CFLAGS}
+ DEFS ${XRAY_COMMON_DEFINITIONS}
+ LINK_FLAGS ${XRAY_LINK_FLAGS} ${WEAK_SYMBOL_LINK_FLAGS}
+ LINK_LIBS ${XRAY_LINK_LIBS}
+ PARENT_TARGET xray)
+
add_compiler_rt_runtime(clang_rt.xray-fdr
STATIC
OS ${XRAY_SUPPORTED_OS}
@@ -346,16 +391,37 @@ else() # not Apple
DEFS ${XRAY_COMMON_DEFINITIONS}
OBJECT_LIBS RTXrayBASIC
PARENT_TARGET xray)
- # Profiler Mode runtime
- add_compiler_rt_runtime(clang_rt.xray-profiling
- STATIC
- ARCHS ${arch}
- CFLAGS ${XRAY_CFLAGS}
- LINK_FLAGS ${XRAY_LINK_FLAGS}
- LINK_LIBS ${XRAY_LINK_LIBS}
- DEFS ${XRAY_COMMON_DEFINITIONS}
- OBJECT_LIBS RTXrayPROFILING
- PARENT_TARGET xray)
+ # Profiler Mode runtime
+ add_compiler_rt_runtime(clang_rt.xray-profiling
+ STATIC
+ ARCHS ${arch}
+ CFLAGS ${XRAY_CFLAGS}
+ LINK_FLAGS ${XRAY_LINK_FLAGS}
+ LINK_LIBS ${XRAY_LINK_LIBS}
+ DEFS ${XRAY_COMMON_DEFINITIONS}
+ OBJECT_LIBS RTXrayPROFILING
+ PARENT_TARGET xray)
+
+ if (${arch} IN_LIST XRAY_DSO_SUPPORTED_ARCH)
+ # TODO: Only implemented for X86 at the moment
+ add_compiler_rt_object_libraries(RTXrayDSO
+ ARCHS ${arch}
+ SOURCES ${XRAY_DSO_SOURCES} ${${arch}_DSO_SOURCES}
+ ADDITIONAL_HEADERS ${XRAY_IMPL_HEADERS}
+ CFLAGS ${XRAY_CFLAGS}
+ DEFS ${XRAY_COMMON_DEFINITIONS_DSO}
+ DEPS ${XRAY_DEPS})
+ # DSO runtime archive
+ add_compiler_rt_runtime(clang_rt.xray-dso
+ STATIC
+ ARCHS ${arch}
+ CFLAGS ${XRAY_CFLAGS}
+ LINK_FLAGS ${XRAY_LINK_FLAGS}
+ LINK_LIBS ${XRAY_LINK_LIBS}
+ DEFS ${XRAY_COMMON_DEFINITIONS}
+ OBJECT_LIBS RTXrayDSO
+ PARENT_TARGET xray)
+ endif()
endforeach()
endif() # not Apple
diff --git a/compiler-rt/lib/xray/xray_AArch64.cpp b/compiler-rt/lib/xray/xray_AArch64.cpp
index c1d77758946edf..2b151162ce4b9b 100644
--- a/compiler-rt/lib/xray/xray_AArch64.cpp
+++ b/compiler-rt/lib/xray/xray_AArch64.cpp
@@ -89,18 +89,23 @@ inline static bool patchSled(const bool Enable, const uint32_t FuncId,
bool patchFunctionEntry(const bool Enable, const uint32_t FuncId,
const XRaySledEntry &Sled,
- void (*Trampoline)()) XRAY_NEVER_INSTRUMENT {
+ const XRayTrampolines &Trampolines,
+ bool LogArgs) XRAY_NEVER_INSTRUMENT {
+ auto Trampoline =
+ LogArgs ? Trampolines.LogArgsTrampoline : Trampolines.EntryTrampoline;
return patchSled(Enable, FuncId, Sled, Trampoline);
}
-bool patchFunctionExit(const bool Enable, const uint32_t FuncId,
- const XRaySledEntry &Sled) XRAY_NEVER_INSTRUMENT {
- return patchSled(Enable, FuncId, Sled, __xray_FunctionExit);
+bool patchFunctionExit(
+ const bool Enable, const uint32_t FuncId, const XRaySledEntry &Sled,
+ const XRayTrampolines &Trampolines) XRAY_NEVER_INSTRUMENT {
+ return patchSled(Enable, FuncId, Sled, Trampolines.ExitTrampoline);
}
-bool patchFunctionTailExit(const bool Enable, const uint32_t FuncId,
- const XRaySledEntry &Sled) XRAY_NEVER_INSTRUMENT {
- return patchSled(Enable, FuncId, Sled, __xray_FunctionTailExit);
+bool patchFunctionTailExit(
+ const bool Enable, const uint32_t FuncId, const XRaySledEntry &Sled,
+ const XRayTrampolines &Trampolines) XRAY_NEVER_INSTRUMENT {
+ return patchSled(Enable, FuncId, Sled, Trampolines.TailExitTrampoline);
}
// AArch64AsmPrinter::LowerPATCHABLE_EVENT_CALL generates this code sequence:
diff --git a/compiler-rt/lib/xray/xray_arm.cpp b/compiler-rt/lib/xray/xray_arm.cpp
index e1818555906c35..e318bb7070e802 100644
--- a/compiler-rt/lib/xray/xray_arm.cpp
+++ b/compiler-rt/lib/xray/xray_arm.cpp
@@ -128,18 +128,23 @@ inline static bool patchSled(const bool Enable, const uint32_t FuncId,
bool patchFunctionEntry(const bool Enable, const uint32_t FuncId,
const XRaySledEntry &Sled,
- void (*Trampoline)()) XRAY_NEVER_INSTRUMENT {
+ const XRayTrampolines &Trampolines,
+ bool LogArgs) XRAY_NEVER_INSTRUMENT {
+ auto Trampoline =
+ LogArgs ? Trampolines.LogArgsTrampoline : Trampolines.EntryTrampoline;
return patchSled(Enable, FuncId, Sled, Trampoline);
}
-bool patchFunctionExit(const bool Enable, const uint32_t FuncId,
- const XRaySledEntry &Sled) XRAY_NEVER_INSTRUMENT {
- return patchSled(Enable, FuncId, Sled, __xray_FunctionExit);
+bool patchFunctionExit(
+ const bool Enable, const uint32_t FuncId, const XRaySledEntry &Sled,
+ c...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
GEP offsets have sext_or_trunc semantics. We were already doing this for the outer-most GEP, but not for the inner ones. I believe one of the sanitizer buildbot failures was due to this, but I did not manage to reproduce the issue or come up with a test case. Usually the problematic case will already be folded away due to index type canonicalization.
bazelbuild: fix for llvm@2ce10f0. No functional changes intended.
Fix for llvm@d80b9cf. No functional changes intended.
…en issue Since llvm#109628 landed, this test has been failing on 32-bit Arm. This is due to a codegen problem (whether added or uncovered by the change, not known) where the trap instruction is placed after the frame pointer and link register are restored. llvm#113154 So the code was: ``` std::__1::vector<int>::operator[](unsigned int): sub sp, sp, llvm#8 str r0, [sp, llvm#4] str r1, [sp] add sp, sp, llvm#8 .inst 0xe7ffdefe bx lr ``` When lldb saw the trap, the PC was inside operator[] but the frame information actually pointed to g. This bug only happens for leaf functions so adding a return type works around it: ``` std::__1::vector<int>::operator[](unsigned int): push {r11, lr} mov r11, sp sub sp, sp, llvm#8 str r0, [sp, llvm#4] str r1, [sp] mov sp, r11 pop {r11, lr} .inst 0xe7ffdefe bx lr ``` (and operator[] should return T& anyway) Now the PC location and frame information should match and the test passes.
…xt parameter to getBufferForFile (llvm#111723) This patch adds an IsText parameter to the following getBufferForFile, getBufferForFileImpl. We introduce a new virtual function openFileForReadBinary which defaults to openFileForRead except in RealFileSystem which uses the OF_None flag instead of OF_Text. The default is set to OF_Text instead of OF_None, this change in value does not affect any other platforms other than z/OS. Setting this parameter correctly is required to open files on z/OS in the correct encoding. The IsText parameter is based on the context of where we open files, for example, in the ASTReader, HeaderMap requires that files always be opened in binary even though they might be tagged as text.
…) for reductions" This reverts commit 7f2e937 as it causes regressions in the tests it modifies, and undoes what was added in llvm#100653 (which itself was a fix for a previous regression).
The initial version of this feature would use the output file name if it could, but in switching to temp files I forgot to replicate that behaviour. What happens now is we always use a tempfile name and the output path is a template for that. I think the current behaviour still makes sense so I'm just correcting the documentation.
This code is heavily based on the SelectionDAG lowerINSERT_SUBVECTOR code.
The SPIR-V backend will need to use Reg2Mem, hence this pass needs to be wrapped to be used with the legacy pass manager. --------- Signed-off-by: Nathan Gauër <[email protected]>
Add sign extension on i32 return value.
…ace tablegen patterns This lowers the aarch64_neon_sqxtn intrinsics to the new TRUNCATE_SSAT_S ISD nodes, performing the same for sqxtun and uqxtn. This allows us to clean up the tablegen patterns a little and in a future commit add combines for sqxtn.
…llvm#112686) Currently, the `omp.simd` operation is ignored during MLIR to LLVM IR translation when it takes part in a composite construct. One consequence of this limitation is that any entry block arguments defined by that operation will trigger a compiler crash if they are used anywhere, as they are not bound to an LLVM IR value. A previous PR introducing support for the `reduction` clause resulted in the creation and use of entry block arguments attached to the `omp.simd` operation, causing compiler crashes on 'do simd reduction(...)' constructs. This patch disables Flang lowering of simd reductions in 'do simd' constructs to avoid triggering these errors while translation to LLVM IR is still incomplete.
llvm#113119) StringMap::find takes StringRef. We don't need to create an instance of std::string from StringRef only to convert it right back to StringRef.
This code intentionally discards the high bits, so set implicitTrunc=true. This is currently NFC but will enable an APInt assertion in the future.
…vm#111575) Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline, which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner, and removes them from the runner. The tests are changed to invoke mlir-opt with this flag before running the runner. The eventual goal is to move all host/device code generation steps out of the runner, like with some of the other runners.
6bac414 added this opcode with the wrong number of operands. It didn't fail on check-llvm for me or on pre-commit CI, but once committed we got buildbot failures. This patch fixes the definition of the instruction and fixes the failing test.
With the truncssat nodes these are relatively simple tablegen patterns to add. The existing intrinsics are converted to shift+truncsat to they can lower using the new patterns. Fixes llvm#112925.
For some reason, rebasing on main caused github to automatically request lots of reviews. Not sure why this is happening now. |
Closing this PR, as something got mixed up during the rebase |
This fixes remaining issues in my previous PR #90959.
Changes:
xray_interface.cpp
I have tested these changes on
x86_64
(natively), as well asppc64le
,aarch64
andarm32
(cross-compiled and emulated using qemu).Original description:
This PR introduces shared library (DSO) support for XRay based on a revised version of the implementation outlined in this RFC.
The feature enables the patching and handling of events from DSOs, supporting both libraries linked at startup or explicitly loaded, e.g. via
dlopen
.This patch adds the following:
-fxray-shared
flag to enable the feature (turned off by default)These changes are fully backward compatible, i.e. running without instrumented DSOs will produce identical traces (in terms of recorded function IDs) to the previous implementation.
Due to my limited ability to test on other architectures, this feature is only implemented and tested with x86_64. Extending support to other architectures is fairly straightforward, requiring only a position-independent implementation of the architecture-specific trampoline implementation (see
compiler-rt/lib/xray/xray_trampoline_x86_64.S
for reference).This patch does not include any functionality to resolve function IDs from DSOs for the provided logging/tracing modes. These modes still work and will record calls from DSOs, but symbol resolution for these functions in not available. Getting this to work properly requires recording information about the loaded DSOs and should IMO be discussed in a separate RFC, as there are mulitple feasible approaches.