Eric contracts #2

…te calls. (llvm#111457) Clang previously missed implementing P0522 pack matching for deduced function template calls. Fixes llvm#111363

Extra builders for CallIntrinsicOp. This is inspired by the comment from @antiagainst from [here](llvm#108933 (comment)).

…t attributes and undeclared templates (llvm#107786) Fixes llvm#107047 Fixes llvm#49093

…11679) This DAG combine replaces a floating-point load/store pair which has no other uses with an integer one, but did not copy the memory operand flags to the new instructions, resulting in it dropping the volatile flag. This optimisation is still valid if one or both of the instructions is volatile, so we can copy over the whole MachineMemOperand to generate volatile integer loads and stores where needed.

These might also be called with vectors, but we don't support that.

This does a global rename from `flang-new` to `flang`. I also removed/changed any TODOs that I found related to making this change. --------- Co-authored-by: H. Vetinari <[email protected]> Co-authored-by: Andrzej Warzynski <[email protected]>

llvm#111797) This commit fixes a bug in the import of nameless globals. Before this change, the fake symbol names were only generated during the transformation of the definition. This caused issues when the symbol was used before it was defined.

…e. (llvm#111428) Similar to 112aac4, this converts log libcalls to llvm.log.f64 intrinsics if we know they do not set errno, as the input is not zero and not negative. As log will produce errno if the input is 0 (returning -inf) or if the input is negative (returning nan), we also perform the conversion when we have noinf and nonan.

follow up work of llvm#106229, add create pass overload function to create pass. --------- Co-authored-by: jingzec <[email protected]>

- Add handling for unsigned integers to hlsl_elementwise_sign
- Use `select` instead of adding dx and spirv intrinsics for unsigned integers as [discussed previously](llvm#101988 (comment))

fixes llvm#70078

### Related PRs
- llvm#101987
- llvm#101988
- llvm#101989

cc @farzonl @pow2clk @bob80905 @bogner @llvm-beanz

Saves me searching for this every time someone asks.

…XTRACT_SUBVECTOR(V,C1+C2) (llvm#111685) Extract from the original source vector whenever possible. This removes a number of dependency bottlenecks and helps a number of shuffle combining cases: either by allowing us to avoid a cross-lane variable shuffle on a slow target by keeping the instruction count below the threshold, or on fast targets by making it easier to recognise that the subvectors all came from the same source.

…m#111747) This module is used in various helper scripts since llvm#93712

…lvm#111720) Fixes missing m0 initialize for pre-gfx9 targets with local extending loads.

Implement the addMachineSSAOptimizations passes for AMDGPU. Porting the other generic passes in this category is WIP.

Run ArgumentPromotion before IPSCCP in the LTO pipeline, to expose more constants to be propagated. We also run PostOrderFunctionAttrs to improve the information available to ArgumentPromotion's alias analysis, and SROA to clean up allocas.

)

Followup to llvm#111001

…#110099) Introduced a new check that finds cases when an uninstantiated virtual member function in a template class causes cross-compiler incompatibility.

2nd PR to fix llvm#108695 based on llvm#108702 --------- Signed-off-by: Kushal Pal <[email protected]>

Many LLDB's dotest.py based tests require the `make` tool. If it's not found in Path, they fail with an obscure error and show up as `UNRESOLVED`. On Windows, llvm-lit takes care of MSYS based testing tools like cat, printf, etc., but `make` is not part of that. Let's catch the situation early and check for it at configuration time. This error isn't fatal: It should fail the build, but not immediately stop the configuration process. There might be other issues further down the line that can be caught in the same buildbot run.

…lvm#111810) Otherwise it fails with "error: Embedded script interpreter unavailable. LLDB was built without scripting language support."

…tchOp (llvm#110269) While working with `emitc::SwitchOp`, it was identified that `mlir-translate` emits **invalid C code** for switch. This commit fixes the issue with the closing bracket in `CppEmitter` within `printOperation` for `emitc::SwitchOp`.

…rs, buffer.*.pN (llvm#110714)" v2 (llvm#111708)" This reverts commit 4b4a0d4. New test fails on buildbots https://lab.llvm.org/buildbot/#/builders/63/builds/2039 https://lab.llvm.org/buildbot/#/builders/127/builds/1055

…mm_cvtsi64_ss SSE1 intrinsics Followup to llvm#111001

…vm#111746) The libcxx/benchmarks directory was moved to libcxx/test/benchmarks, which is already checked by that grep command.

This is a dependency of llvm#80007.

…#80007) This patch always defines the cxx_shared, cxx_static & other top-level targets. However, they are marked as EXCLUDE_FROM_ALL when we don't want to build them. Simply declaring the targets should be of no harm, and it allows other projects to mention these targets regardless of whether they end up being built or not. This patch basically moves the definition of e.g. cxx_shared out of the `if (LIBCXX_ENABLE_SHARED)` and instead marks it as EXCLUDE_FROM_ALL conditionally on whether LIBCXX_ENABLE_SHARED is passed. It then does the same for libunwind and libc++abi targets. I purposefully avoided to reformat the files (which now has inconsistent indentation) because I wanted to keep the diff minimal, and I know this is an area of the code where folks may have downstream diffs. I will re-indent the code separately once this patch lands. This is a reapplication of 79ee034, which was reverted in a353909 because it broke the TSAN and the Fuchsia builds. Resolves llvm#77654 Differential Revision: https://reviews.llvm.org/D134221

…m#105664) Default to Global address space for memrefs that do not have an explicit address space set in the IR. --------- Co-authored-by: Victor Perez <[email protected]> Co-authored-by: Jakub Kuderski <[email protected]> Co-authored-by: Victor Perez <[email protected]>

We need to use the MaterializeTemporaryExpr here so the checks in ExprConstant.cpp do the right thing.

…change) (llvm#111816) This makes tests more portable. Make variables for LLVM utils are passed to `make` on Darwin as well. Co-authored-by: Vladimir Vereschaka <[email protected]>

…M_TARGETS_TO_BUILD. (llvm#111382) From llvm#111356

) There are artificial one-use limitations on foldExtractedCmps. Adjust the costs to account for multi-use, and strip the one-use matcher, lifting the limitations.

…m#111127) [template.bitset.general] indicates that `bitset` shouldn't have member typedef-names `iterator` and `const_iterator`. Currently libc++'s typedef-names are causing ambiguity in name lookup, which isn't conforming. As these iterator types are themselves useful, I think we should just use __uglified member typedef-names for them. Fixes llvm#111125

…1803) Fixes llvm#110265 Adding check-all causes us to run some tests twice if a project specific target like check-clang is also added. check-pstl is an alternative but as far as I can tell, check-all does not include this so we have not been running the tests in CI anyway. When I tried to run check-pstl locally I got a lot of compiler errors but have not found any instructions on how to setup a correct build environment. Even if such instructions exist, it's probably more than we want to do in CI. According to Louis Dionne, the project is probably not active. So if it's ever revived it'll be up to the new contributors to enable testing.

) Derived type results of BIND(C) functions should be returned according to the C ABI for returning the related C struct type. This did not happen because the abstract-result pass was forcing the Fortran ABI for all derived type results. Use the bind_c attribute that was added on call/func/dispatch in FIR to prevent such a rewrite in the abstract-result pass, and update the target-rewrite pass to deal with the struct return ABI. So far, the target-specific part of the target-rewrite is only implemented for X86-64 according to the "System V Application Binary Interface AMD64 v1"; the other targets will hit a TODO, just like for BIND(C), VALUE derived type arguments. This intends to deal with llvm#102113.

Fixes: llvm#111815 This patch replaces usage of the python `imp` library, which is deprecated since python3.4 and removed in python3.12, with the `importlib` library. As part of this update the repeated find_module+load_module pattern is moved into a utility function, since the importlib equivalent is much more verbose.
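As a rough illustration of the kind of utility function described above, here is a minimal sketch that loads a package directory through `importlib`. The helper name `load_module`, its signature, and the `demo_pkg` package are assumptions made for this example; they are not taken from the patch.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module(name, search_dir):
    """Load the package `name` found at `search_dir/name/__init__.py`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    init_file = Path(search_dir) / name / "__init__.py"
    spec = importlib.util.spec_from_file_location(
        name, init_file, submodule_search_locations=[str(init_file.parent)]
    )
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {search_dir}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports inside the package can resolve.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Demo: build a throwaway package on disk and load it.
with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("ANSWER = 42\n")
    demo = load_module("demo_pkg", tmp)
```

Wrapping the spec/module/exec steps once keeps call sites as short as the old `find_module` + `load_module` pair.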

) It turns out that {s,u}int_to_fp nodes get their operation action from their operand's type, not the result type, so we don't need to set it for fp16 or bf16. vp_{s,u}int_to_fp uses the result type though so we need to keep it. This also means that we can lower int_to_fp for fixed length bf16 vectors already, so this adds tests for that. The cost model test changes are due to BasicTTIImpl's getCastInstrCost not taking into account that int_to_fp needs its legal type swapped. This can be fixed in a later patch, but it's worth noting that the affected types in the tests currently crash when lowered anyway (due to them needing split at LMUL > 8)

Reverts llvm#111163, as this was merged prematurely.

…lvm#111129) Before this patch, redundant COPY couldn't be removed for the following case:

```
%reg1 = COPY %const-reg
... // There is a def of %const-reg
%reg2 = COPY killed %reg1
```

where this can be optimized to:

```
... // There is a def of %const-reg
%reg2 = COPY %const-reg
```

This patch allows for such optimization by not invalidating defined constant registers. This is safe, as architectures like AArch64 and RISCV replace a dead definition of a GPR with a zero constant register for certain instructions.
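The effect of not invalidating constant registers can be sketched with a toy forward copy propagation. This is only a model under stated assumptions: the tuple-based instruction encoding, the function name `propagate_copies`, and the register names are all hypothetical and unrelated to the real MachineCopyPropagation code.

```python
def propagate_copies(instrs, const_regs):
    """Forward each COPY source through earlier copies that are still valid.

    instrs: list of ("copy", dst, src) or ("def", reg) tuples (toy encoding).
    const_regs: registers whose value never changes, even when "defined".
    """
    avail = {}  # dst -> original source of a still-valid copy
    out = []
    for ins in instrs:
        if ins[0] == "copy":
            _, dst, src = ins
            src = avail.get(src, src)
            # Writing dst invalidates older copies that involve dst.
            avail = {d: s for d, s in avail.items() if d != dst and s != dst}
            avail[dst] = src
            out.append(("copy", dst, src))
        else:
            _, reg = ins
            # A def clobbers copies reading reg, unless reg is a constant register.
            avail = {
                d: s
                for d, s in avail.items()
                if d != reg and (s != reg or reg in const_regs)
            }
            out.append(ins)
    return out


program = [("copy", "reg1", "zr"), ("def", "zr"), ("copy", "reg2", "reg1")]
with_const = propagate_copies(program, const_regs={"zr"})
without_const = propagate_copies(program, const_regs=set())
```

With `zr` treated as constant the last copy reads `zr` directly, which is what makes the first copy removable; without that knowledge the def in between blocks the rewrite.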

This change extends the current method for creating ABI objects to allow users (plugin libraries) to create custom ABI objects for their needs. This is accomplished by inheriting one of the common ABIs and overriding one or more of the methods to create a custom ABI. To use a custom ABI for a given coroutine the coro.begin.custom.abi intrinsic is used in place of the coro.begin intrinsic. This takes an additional i32 arg that specifies the index of an ABI generator for the custom ABI object in a SmallVector passed to the CoroSplitPass ctor.

The detailed changes include:
* Add the llvm.coro.begin.custom.abi intrinsic used to specify the index of the custom ABI to use for the given coroutine.
* Add constructors to CoroSplit that take a list of generators that create the custom ABI object.
* Extend the CreateNewABI function used by CoroSplit to return a unique_ptr to an ABI object.
* Add has/getCustomABI methods to CoroBeginInst class.
* Add a unittest for a custom ABI.

See doc update here: llvm#111781

Summary: This had some leftover references to the old namespace and didn't put restrict on it.

…lvm#111760) The previous error test line used a 16bit instruction to indicate an error. However, this is a poor pick. The 16bit instructions on AMDGPU are under development, so some downstream branches do not show this exact error message. Change it to another error dasm code.

After the refactor in ed22913, the `args_in` and `args_out` attributes are no longer used by `linalg.generic`. This patch removes most of the remaining references. I've left out BufferDeallocationInternals.md, which doesn't seem maintained anymore and is quite out of sync with other bits of MLIR (e.g. `test.generic` instead of `linalg.generic`).

A follow-up for llvm#111816. This is to fix buildbot failure https://lab.llvm.org/staging/#/builders/195/builds/4242. TestSymbolFileJSON.py doesn't pass with llvm-strip on macOS. Apparently, llvm-strip/llvm-objcopy can't clean symbols from Mach-O nlists.

…1783)

In `--icf=safe_thunks` mode, the linker differentiates `keepUnique` functions by creating thunks during a post-processing step after Identical Code Folding (ICF). While this ensures that `keepUnique` functions themselves are not incorrectly merged, it overlooks functions that reference these `keepUnique` symbols. If two functions are identical except for references to different `keepUnique` functions, the current ICF algorithm incorrectly considers them identical because it doesn't account for the future differentiation introduced by thunks. This leads to incorrect deduplication of functions that should remain distinct. To address this issue, we modify the ICF comparison to explicitly check for references to `keepUnique` functions during deduplication. By doing so, functions that reference different `keepUnique` symbols are correctly identified as distinct, preventing erroneous merging and ensuring the correctness of the linked output.
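The comparison change can be illustrated with a toy model of function folding. Everything here is an assumption for the sake of the sketch: the `Func` record, the `can_fold` helper, and the `check_keep_unique` switch are hypothetical and do not mirror the linker's data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    name: str
    body: str          # stand-in for the function's code bytes
    refs: tuple = ()   # names of the functions it references, in order


def can_fold(a, b, funcs, keep_unique, check_keep_unique=True):
    """Return True if a and b would be treated as identical code."""
    if a.body != b.body or len(a.refs) != len(b.refs):
        return False
    for ra, rb in zip(a.refs, b.refs):
        if ra == rb:
            continue
        # Different keepUnique targets will stay distinct (via thunks),
        # so the referencing functions must not be merged either.
        if check_keep_unique and (ra in keep_unique or rb in keep_unique):
            return False
        if not can_fold(funcs[ra], funcs[rb], funcs, keep_unique, check_keep_unique):
            return False
    return True


funcs = {
    "f1": Func("f1", "ret 1"),
    "f2": Func("f2", "ret 1"),
    "g1": Func("g1", "call", ("f1",)),
    "g2": Func("g2", "call", ("f2",)),
}
keep_unique = {"f1", "f2"}

old_behaviour = can_fold(funcs["g1"], funcs["g2"], funcs, keep_unique,
                         check_keep_unique=False)
new_behaviour = can_fold(funcs["g1"], funcs["g2"], funcs, keep_unique)
```

In this model `g1` and `g2` fold under the old comparison because their callees look identical, while the explicit check keeps them apart.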

…lvm#111858) Reverts llvm#111678 Causes ARM failure in test suite. TYPE(C_PTR) result should not regress even if the struct ABI is not implemented for the target. https://lab.llvm.org/buildbot/#/builders/143/builds/2731 I need to revisit this.

…1737) Removes pragmas like `# 1 "<std>" 1 3` to make line numbers in failing tests more accurate. Use `basic_string_view` instead of `string_view` to kick in GSL owner/pointer auto inference.

This is the minimal change to avoid the assert. There's an API flaw in invoke instructions where getLandingPad assumes all invoke unwind blocks have landingpads, when some have catchswitch instead. Fixes llvm#111817

…mbols. (llvm#111792) Add a MCSubtargetInfo& operand so we can control the subtarget for the new calls. The old signature is kept as a wrapper to pass *STI to maintain compatibility. By using EmitToStreamer we are able to compress the instructions when possible.

Add an "always on" log category and channel. Unlike other, existing log channels, it is not exposed to users. The channel is meant to be used sparsely and deliberately for logging high-value information to the system log. We have a similar concept in the downstream Swift fork and this has proven to be extremely valuable. This is especially true on macOS where system log messages are automatically captured as part of a sysdiagnose.

…1850) Fixes issue added by: llvm#111833 Following the previous commit that changed how Dexter imports modules, the ComInterface module import became broken. This is because it had a different directory structure to other modules, where we want to import single file rather than a dir containing a __init__.py. For this case, an optional extra arg has been added to load_module allowing a filename to be specified, letting us import ComInterface.py directly and fixing the issue.

…lvm#104783) The main goal of this patch is to extend the semantic of 'linalg.matmul' named op to include per operand transpose semantic while also laying out a way to move ops definition from OpDSL to tablegen. Hence, it is implemented in tablegen. Transpose semantic is as follows. By default 'linalg.matmul' behavior will remain as is. Transpose semantics can be applied on a per input operand basis by specifying the optional permutation attributes (namely 'permutationA' for 1st input and 'permutationB' for 2nd input) for each operand explicitly as needed. By default, no transpose is mandated for any of the input operands. Example:

```
%val = linalg.matmul ins(%arg0, %arg1 : memref<5x3xf32>, memref<5x7xf32>)
                     outs(%arg2: memref<3x7xf32>)
                     permutationA = [1, 0]
                     permutationB = [0, 1]
```
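To make the shapes in the example concrete, here is a small pure-Python sketch of a matmul with optional per-operand permutations. It assumes that `[1, 0]` means "transpose this operand" and `[0, 1]` means "use as is"; the function names are hypothetical and this is not how the op is lowered.

```python
def permute(matrix, permutation):
    """Apply a 2-D permutation: (0, 1) is the identity, (1, 0) a transpose."""
    if tuple(permutation) == (0, 1):
        return matrix
    if tuple(permutation) == (1, 0):
        return [list(column) for column in zip(*matrix)]
    raise ValueError(f"unsupported permutation: {permutation}")


def matmul(a, b, permutation_a=(0, 1), permutation_b=(0, 1)):
    """Multiply a and b after applying the per-operand permutations."""
    a = permute(a, permutation_a)
    b = permute(b, permutation_b)
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Shapes from the example: A is 5x3, B is 5x7, the result is 3x7.
A = [[1.0] * 3 for _ in range(5)]
B = [[1.0] * 7 for _ in range(5)]
C = matmul(A, B, permutation_a=(1, 0), permutation_b=(0, 1))
```

Without the transpose on `A` the inner dimensions (3 and 5) would not match, which is why the example needs `permutationA = [1, 0]`.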

Rather than invariantly running `F->verify()` when asserts are enabled, run machine IR verification in LIT tests only. Swap `CHECK-PERF` and `CHECK-SIZE` in `code_placement_ext_tsp_large.ll`. Remove `={0,1,true,false}` from flags in tests.

llvm#86960) Depends on [CWG1815](llvm#108039). Fixes llvm#85613. In [[Clang] Implement P2718R0 "Lifetime extension in range-based for loops"](llvm#76361), we did not implement lifetime extension for the temporaries in `CXXDefaultInitExpr`. As confirmed in llvm#85613, we should extend the lifetime for those. To avoid modifying current CodeGen rules, in a lifetime extension context, the cleanup of `CXXDefaultInitExpr` was ignored. --------- Signed-off-by: yronglin <[email protected]>

…ts (llvm#111841) By allowing AnnotateAttr to be applied to statements, users can place arbitrary information in the AST for later use. For example, this can be used for HW-targeted language extensions that involve specialized loop annotations.

…1214) In the tosa validation pass, change the type of the profile option to ListOption. TOSA profiles are now composable rather than hierarchical. Each profile is an independent set, i.e. a target can implement multiple profiles. Set the profile option to none by default, and limit to profiles if requested. The profiles can be specified via command line, e.g. $ mlir-opt ... --tosa-validate="profile=bi,mi" which tells the validation pass that BI and MI are enabled. Change-Id: I1fb8d0c1b27eccd768349b6eb4234093313efb57
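The move from a hierarchical to a composable profile option can be pictured as parsing a comma-separated list into a set. The sketch below is only a model: `parse_profiles` and `is_supported` are hypothetical names, and the rule that an empty set means "no restriction" is an assumption based on the "none by default" wording.

```python
def parse_profiles(option):
    """Parse a string such as 'profile=bi,mi' into a set of profile names."""
    key, _, value = option.partition("=")
    if key != "profile":
        raise ValueError(f"unexpected option: {option!r}")
    names = {part.strip() for part in value.split(",") if part.strip()}
    names.discard("none")  # 'none' enables no profile
    return names


def is_supported(required_profile, enabled):
    """An op is only limited by profiles when some profile was requested."""
    return not enabled or required_profile in enabled


enabled = parse_profiles("profile=bi,mi")
default = parse_profiles("profile=none")
```

Because each profile is an independent member of the set, enabling `bi` and `mi` together says nothing about any other profile.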

LLVM now triggers an assertion when the format string and arguments don't match. Fix a variety of incorrect format strings I discovered when enabling logging with a debug build.

…dPtrOrigin. (llvm#111222) Ignore std::forward when it appears while looking for the pointer origin.

…XXInheritedCtorInitExpr. (llvm#111198)

Make isUncountedPtr take QualType as an argument instead of Type*. This simplifies some code.

A COMMON block is a named area of memory that holds a collection of variables. Fortran subprograms may map the COMMON block memory area to a list of variables. A common block is represented in LLVM debug by DICommonBlock. This PR adds support for this in MLIR. The changes are mostly mechanical apart from small change to access the DICompileUnit when the scope of the variable is DICommonBlock. --------- Co-authored-by: Tobias Gysi <[email protected]>

This is a purely mechanical commit for fixing the indentation of the runtimes' CMakeLists files after llvm#80007. That PR didn't update the indentation in order to make the diff easier to review and for merge conflicts to be easier to resolve (for downstream changes). This doesn't change any code, it only reindents it.

In [[NVPTX] Improve lowering of v4i8](llvm@cbafb6f) @Artem-B added the ability to lower ISD::BUILD_VECTOR with bfi PTX instructions. @Artem-B did this because: ([source](llvm#67866 (comment))) > Under the hood byte extraction/insertion ends up as BFI/BFE instructions, so we may as well do that in PTX, too. https://godbolt.org/z/Tb3zWbj9b However, the example that @Artem-B linked was targeting sm_52. On modern architectures, ptxas uses prmt.b32. [Example](https://godbolt.org/z/Ye4W1n84o). Thus, remove uses of NVPTXISD::BFI in favor of NVPTXISD::PRMT.
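To see why a byte permute can stand in for a bit-field insert when building a v4i8, here is a small software model of the two PTX instructions. It is a sketch under assumptions: the functions model my reading of `bfi.b32` and of `prmt.b32` in its default mode, and the selector `0x3240` is just one illustrative choice, not a value taken from the patch.

```python
MASK32 = 0xFFFFFFFF


def bfi(a, b, pos, length):
    """Model of bfi.b32: insert the low `length` bits of a into b at bit `pos`."""
    mask = ((1 << length) - 1) << pos
    return ((b & ~mask) | ((a << pos) & mask)) & MASK32


def prmt(a, b, selector):
    """Model of prmt.b32 (default mode): pick 4 bytes out of the 8 in {b, a}."""
    source = ((b & MASK32) << 32) | (a & MASK32)
    result = 0
    for i in range(4):
        nibble = (selector >> (4 * i)) & 0xF
        byte = (source >> (8 * (nibble & 0x7))) & 0xFF
        if nibble & 0x8:
            # Top selector bit set: replicate the sign bit of the chosen byte.
            byte = 0xFF if byte & 0x80 else 0x00
        result |= byte << (8 * i)
    return result


vector = 0x44332211   # four packed i8 lanes
new_lane = 0xAB       # value to place in lane 1

via_bfi = bfi(new_lane, vector, 8, 8)
via_prmt = prmt(vector, new_lane, 0x3240)
```

Both routes produce the same packed word here; the selector simply names, lane by lane, which source byte ends up where.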

…11454) When an OPEN statement with a unit number fails in a recoverable manner, the runtime needs to delete the ExternalFileUnit instance that was created in the unit map. And we do this too soon -- that instance still holds some of the I/O statement state that will be used by a later call into the runtime for EndIoStatement. Move the code that deletes the unit after a failed but recoverable OPEN into ExternalIoStatementBase::EndIoStatement, and don't do things afterwards that would need the I/O statement state that has been destroyed. Fixes llvm#111404.

ProgramTree instances are created as the value of a local variable in the Pre(const parser::ProgramUnit &) member function in name resolution. But references to these ProgramTree instances can persist in SubprogramNameDetails symbol table entries that might survive that function call's lifetime, and lead to trouble later when (e.g.) expression semantics needs to deal with a possible forward reference in a function reference in an expression being processed later in expression checking. So put those ProgramTree instances into a longer-lived linked list within the SemanticsContext. Might fix some weird crashes reported on big-endian targets (AIX & Solaris).

The semantics utility GetAllNames has declarations in two header files and a definition that really should be in the common utilities source file. Remove the redundant declaration from resolve-names-utils.h and move code from resolve-names-utils.cpp into Semantics/tools.cpp.

llvm#108870) This commit changes the libc++ frame recognizer to hide implementation details of libc++ more aggressively. The applied heuristic is rather straightforward: We consider every function name starting with `__` as an implementation detail. This works pretty neatly for `std::invoke`, `std::function`, `std::sort`, `std::map::emplace` and many others. Also, this should align quite nicely with libc++'s general coding convention of using the `__` for their implementation details, thereby keeping the future maintenance effort low. However, this heuristic by itself does not work in 100% of the cases: E.g., `std::ranges::sort` is not a function, but an object with an overloaded `operator()`, which means that there is no actual call `std::ranges::sort` in the call stack. Instead, there is a `std::ranges::__sort::operator()` call. To make sure that we don't hide this stack frame, we never hide the frame which represents the entry point from user code into libc++ code
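The heuristic and its entry-point exception can be sketched over a list of frame names. This is a toy model, not the recognizer's implementation: the helper names are hypothetical, "libc++ frame" is approximated as "name starts with `std::`", and a name counts as internal when any `::`-separated component starts with `__`.

```python
def is_libcxx(name):
    """Crude stand-in for 'this frame belongs to libc++'."""
    return name.startswith("std::")


def looks_internal(name):
    """True if any '::'-separated component starts with a double underscore."""
    return any(part.startswith("__") for part in name.split("::"))


def hidden_frames(stack):
    """Decide per frame whether to hide it; `stack` lists the innermost frame first."""
    hidden = []
    for index, name in enumerate(stack):
        caller = stack[index + 1] if index + 1 < len(stack) else None
        # The entry point is the libc++ frame called directly from user code.
        is_entry_point = caller is None or not is_libcxx(caller)
        hidden.append(is_libcxx(name) and looks_internal(name) and not is_entry_point)
    return hidden


stack = [
    "compare(int, int)",                # user callback
    "std::__introsort",                 # implementation detail
    "std::ranges::__sort::operator()",  # entry point from user code
    "main",
]
flags = hidden_frames(stack)
```

Only the inner implementation frame is hidden; the `operator()` frame survives because it is where user code enters the library.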

Header guard was in sync with the filename.

- add additional lowering for directx backend in CGBuiltin.cpp
- add directx intrinsic to IntrinsicsDirectX.td
- add semantic check of arguments in SemaHLSL.cpp
- add mapping to DXIL op in DXIL.td
- add testing of semantics in WaveGetLaneIndex-errors.hlsl
- add testing of dxil lowering in WaveGetLaneIndex.ll

Resolves llvm#70105

…11684) Remove unnecessary `proc_pidinfo` calling.

To consolidate behavior of function mangling and limit the number of places that ABI changes will need to be made, this switches the DirectX target used for HLSL to use the Itanium ABI from the Microsoft ABI. The Itanium ABI has greater flexibility in decisions regarding mangling of new types of which we have more than a few yet to add. One effect of this will be that linking library shaders compiled with DXC will not be possible with shaders compiled with clang. That isn't considered a terribly interesting use case and one that would likely have been onerous to maintain anyway. This involved adding a function to call all global destructors as the Microsoft ABI had done. This requires a few changes to tests. Most notably the mangling style has changed which accounts for most of the changes. In making those changes, I took the opportunity to harmonize some very similar tests for greater consistency. I also shaved off some unneeded run flags that had probably been copied over from one test to another. Other changes effected by using the new ABI include using different types when manipulating smaller bitfields, eliminating an unnecessary alloca in one instance in this-assignment.hlsl, changing the way static local initialization is guarded, and changing the order of inout parameters getting copied in and out. That last is a subtle change in functionality, but one where there was sufficient inconsistency in the past that standardizing is important, but the particular direction of the standardization is less important for the sake of existing shaders. fixes llvm#110736

This patch implements an iterator for iterating over both use-def and mem dependencies of MemDGNodes.

…ue. (llvm#110576) Update fixupIVUsers to compute the value for escaped inductions using the already computed end value of the induction (EndValue), but subtracting the step. This results in slightly simpler codegen, as we avoid computing the full transformed index at VectorTripCount - 1. PR: llvm#110576
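The arithmetic behind this simplification can be checked with a short sketch. It assumes a plain integer induction of the form `start + i * step`; the function names are hypothetical and only mirror the quantities named in the description.

```python
def escaped_value_via_index(start, step, vector_trip_count):
    """Old form: evaluate the induction at index VectorTripCount - 1."""
    return start + (vector_trip_count - 1) * step


def escaped_value_via_end(end_value, step):
    """New form: reuse the already computed end value and subtract one step."""
    return end_value - step


# Compare both forms over a small grid of inductions.
results = []
for start in (0, 3, -7):
    for step in (1, 2, -4):
        for vector_trip_count in (4, 8, 16):
            end_value = start + vector_trip_count * step
            results.append(
                escaped_value_via_index(start, step, vector_trip_count)
                == escaped_value_via_end(end_value, step)
            )
```

Since `EndValue` is needed anyway, the second form replaces a multiply-and-add with a single subtraction.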

* Replace usage of unique_ptr<>(new ...) -> make_unique<>();

…lvm#111886) First, ReadlanePieces should be in the scope of each MachineOperand. It is not correct to declare it in an outer scope without clearing it after each MachineOperand use. Additionally, we do not need the OrigBB argument for emitLoadScalarOpsFromVGPRLoop, since MachineFunction (the only use) can be obtained from LoopBB (or BodyBB).

This flag has been on for a while without any complaints.

…llvm#111897)

Before CUDA 12.3 `ptxas` did not recognize that the trap instruction terminates a basic block. Instead, it would assume that control flow continued to the next instruction. The next instruction could be in the block that's lexically below it. This would lead to phantom CFG edges being created within ptxas. [NVPTX: Lower unreachable to exit to allow ptxas to accurately reconstruct the CFG.](llvm@1ee4d88) added the LowerUnreachable pass to NVPTX to work around this. Several other WAR patches followed. This bug in `ptxas` was fixed in CUDA 12.3 and is thus impossible to encounter when targeting PTX ISA v8.3+ This commit reverts the WARs for the `ptxas` bug when targeting PTX ISA v8.3+ CC @maleadt

Update the llvm/docs/Coroutines.rst docs to include a full description of Custom ABI objects. This documentation describes how ABI objects allow users (plugin libraries) to create custom ABI objects for their needs.

This commit only adds support for the `SBProcess::ReverseContinue()` API. A user-accessible command for this will follow in a later commit. This feature depends on a gdbserver implementation (e.g. `rr`) providing support for the `bc` and `bs` packets. `lldb-server` does not support those packets, and there is no plan to change that. So, for testing purposes, `lldbreverse.py` wraps `lldb-server` with a Python implementation of *very limited* record-and-replay functionality for use by *tests only*. The majority of this PR is test infrastructure (about 700 of the 950 lines added).

) `Type::getPointerTo()` is to be deprecated & removed soon.

NFC because I am not aware of any particular issue from seek, but reopen looks less error prone. Pull Request: llvm#111899

This fixes the following assertion: "Cannot create Expected<T> from Error success value." The problem was that GetFrameBaseValue returned false without updating the Status argument. This patch eliminates the opportunity for mistakes by returning an llvm::Error.

This adds an include for SBLanguages.h in lldb-enumerations.h so that files that need this enum do not have to explicitly include SBLanguages.

Forks are common suspects for unusual sanitizer behavior. It can be handy to see them without rebuild.

…lvm#111750)

Need to track changes with the repeated reduced value, since it might be vectorized in the next attempt for reduction vectorization, to correctly generate the code and avoid compiler crash. Fixes llvm#111887

…#111907)" Temporarily Revert until Chelsea can look at this. With a clean build, SBLanguages.h won't be generated in the build directory at the point when it is included by lldb-enumerations when compiling e.g. Broadcaster.cpp. On a clean build (no pre-existing build directory), the dependency ordering is not explicitly stated so the build will fail. An incremental build will succeed. This reverts commit b355426.

For llvm#111901

Such threads can cause false leak reports, but often it's hard to diagnose the reason of failed PTRACE_ATTACH. Maybe we can find a clue from `/proc/*/task/*/status`

… in the current module (llvm#110064) Doing so could cause a bug where the linker tries to remap a function "reimported" from the current module when materializing it, causing a lookup assert in the type mappings.

…v (NFC)" This reverts commit b77fdf5.

…)" This reverts commit d5e1de6.

…lvm#109477) This patch adds the support to `Process.cpp` to automatically save off TLS sections, either via loading the memory region for the module, or via reading `fs_base` via generic register. Then when Minidumps are loaded, we now specify we want the dynamic loader to be the `POSIXDYLD` so we can leverage the same TLS accessor code as `ProcessELFCore`. Being able to access TLS Data is an important step for LLDB generated minidumps to have feature parity with ELF Core dumps.

This commit only adds support for the `SBProcess::ReverseContinue()` API. A user-accessible command for this will follow in a later commit. This feature depends on a gdbserver implementation (e.g. `rr`) providing support for the `bc` and `bs` packets. `lldb-server` does not support those packets, and there is no plan to change that. So, for testing purposes, `lldbreverse.py` wraps `lldb-server` with a Python implementation of *very limited* record-and-replay functionality for use by *tests only*. The majority of this PR is test infrastructure (about 700 of the 950 lines added).

This uses lldb-server in gdbserver mode, which requires a ProcessNative plugin. Darwin does not have a ProcessNative plugin; it uses debugserver instead of lldb-server. Skip these tests.

… defined in the current module" (llvm#111919) Reverts llvm#110064

…lvm#111915)

…reate. We can get a reference to the ExecutionSession from the ObjectLinkingLayer argument, so there's no need to pass it in separately.

This reverts commit c686eeb.

…v (NFC)" This reverts commit fae7d68.

…)" Reverting this again; I added a commit which added @skipIfDarwin markers to the TestReverseContinueBreakpoints.py and TestReverseContinueNotSupported.py API tests, which use lldb-server in gdbserver mode which does not work on Darwin. But the aarch64 ubuntu bot reported a failure on TestReverseContinueBreakpoints.py, https://lab.llvm.org/buildbot/#/builders/59/builds/6397

    File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py", line 63, in test_reverse_continue_skip_breakpoint
      self.reverse_continue_skip_breakpoint_internal(async_mode=False)
    File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py", line 81, in reverse_continue_skip_breakpoint_internal
      self.expect(
    File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2372, in expect
      self.runCmd(
    File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1002, in runCmd
      self.assertTrue(self.res.Succeeded(), msg + output)
    AssertionError: False is not true : Process should be stopped due to history boundary
    Error output:
    error: Process must be launched.

This reverts commit 4f29756.

- add degrees builtin
- link degrees api in hlsl_intrinsics.h
- add degrees intrinsic to IntrinsicsDirectX.td
- add degrees intrinsic to IntrinsicsSPIRV.td
- add lowering from clang builtin to dx/spv intrinsics in CGBuiltin.cpp
- add semantic checks to SemaHLSL.cpp
- add expansion of directx intrinsic to llvm fmul for DirectX in DXILIntrinsicExpansion.cpp
- add mapping to spir-v intrinsic in SPIRVInstructionSelector.cpp
- add test coverage:
  - degrees.hlsl -> check hlsl lowering to dx/spv degrees intrinsics
  - degrees-errors.hlsl/half-float-only-errors -> check semantic warnings
  - hlsl-intrinsics/degrees.ll -> check lowering of spir-v degrees intrinsic to SPIR-V backend
  - DirectX/degrees.ll -> check expansion and scalarization of directx degrees intrinsic to fmul

Resolves llvm#99104

…pts (llvm#102669) If this pattern is used more than once in version script(s), only one will have an effect, so it's probably a user error and can be diagnosed.

Fix a typo in ReleaseNotes that was introduced by llvm#86960. Signed-off-by: yronglin <[email protected]>

By default reuse can happen only after `UINT32_MAX` threads, so it's almost NFC.

FMINNM/FMAXNM instructions of AArch64 follow IEEE754-2008. We can use them to canonicalize a floating point number. And FMINNUM_IEEE/FMAXNUM_IEEE is used by something like expanding FMINIMUMNUM/FMAXIMUMNUM, so let's define them. --------- Co-authored-by: Your Name <[email protected]>

…111775) In the context of regular expressions, Python (used to) gracefully ignore the escape behavior of `\` in some contexts, e.g. for representing the regular expression `\w+`. However in newer versions of Python this now gives a warning in the form

```
SyntaxWarning: invalid escape sequence '\w'
```

Fix by explicitly using raw strings instead.
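A minimal sketch of the raw-string form follows. The sample text and variable names are made up for the illustration; only the `\w+` pattern comes from the description above.

```python
import re

# The raw string keeps the backslash literally, so the regex engine sees \w+.
WORD = re.compile(r"\w+")

# The raw literal is the same string as the explicitly escaped one.
same_pattern = r"\w+" == "\\w+"

words = WORD.findall("flang-new to flang")
```

Writing the pattern as a raw string means Python never has to interpret `\w` as a string escape in the first place, which is what the warning is about.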

…llvm#111284) Some targets have better codegen for `ctpop(X) u< 2` than `ctpop(X) == 1`. After llvm#100899, we set the range of ctpop's return value to indicate the argument/result is non-zero. This patch converts `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` in CGP to fix llvm#95255.
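
The equivalence behind this rewrite can be checked with a small standalone sketch. It is not the CodeGenPrepare code: `popCount`, `isPow2ViaEq`, and `isPow2ViaUlt` are hypothetical names introduced here, and the helper merely stands in for `ctpop`. The point it illustrates is that the two forms agree only when the argument is known to be non-zero.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count; a stand-in for llvm.ctpop (hypothetical helper).
inline unsigned popCount(uint32_t X) {
  unsigned Count = 0;
  while (X != 0) {
    X &= X - 1; // Clear the lowest set bit.
    ++Count;
  }
  return Count;
}

// Original form: ctpop(X) == 1.
inline bool isPow2ViaEq(uint32_t X) { return popCount(X) == 1; }

// Rewritten form: ctpop(X) u< 2. Matches the original only for X != 0,
// because ctpop(0) == 0 also satisfies "u< 2".
inline bool isPow2ViaUlt(uint32_t X) { return popCount(X) < 2; }
```

For `X == 0` the two predicates differ, which is why the rewrite depends on the non-zero range information mentioned above.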

Summary: This is failing on the NVPTX buildbot, https://lab.llvm.org/buildbot/#/builders/69/builds/6997/. I cannot reproduce it locally so I'm disabling it temporarily so the bot is green.

This commit adds `valueLocationReference` to function pointers and function references. Thereby, users can navigate directly to the pointed-to function from within the "variables" pane. In general, it would be useful to also add similar location references to member function pointers, `std::source_location`, `std::function`, and many more. Doing so would require extending the formatters to provide such a source code location. There were two RFCs about this a while ago: https://discourse.llvm.org/t/rfc-extending-formatters-with-a-source-code-reference/68375 https://discourse.llvm.org/t/rfc-sbvalue-metadata-provider/68377/26 However, both RFCs ended without a conclusion. As such, this commit now implements the lowest-hanging fruit, i.e. function pointers. If people find it useful, I will revive the RFC afterwards.

llvm#109512) X86 maxss/minss etc. instructions won't turn SNaN to QNaN, so we can combine fcmp + select to them for some predicates.
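
A scalar model may help show why an fcmp + select pair lines up with these instructions. The sketch below assumes the documented behaviour that `maxss`/`minss` return the second operand whenever the comparison is false, including when either input is NaN; `maxssModel` and `minssModel` are hypothetical names, and which predicates the patch actually handles is not spelled out here.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Model of maxss: select(fcmp ogt A, B), A, B. If either input is NaN the
// comparison is false, so the second operand B is passed through unchanged.
inline float maxssModel(float A, float B) { return A > B ? A : B; }

// Model of minss: select(fcmp olt A, B), A, B, with the same NaN behaviour.
inline float minssModel(float A, float B) { return A < B ? A : B; }
```

Note that the result depends on operand order when a NaN is involved: `maxssModel(NaN, 1.0f)` yields `1.0f`, while `maxssModel(1.0f, NaN)` yields NaN.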

llvm#111804) TypedefNameDecl referenced by a synthesized CTAD guide for type aliases was not transformed previously, resulting in a substitution failure in BuildDeductionGuideForTypeAlias() when substituting into the right-hand-side deduction guide. This patch fixes it in the way we have been doing since https://reviews.llvm.org/D80743. We transform all the function parameters, parenting referenced TypedefNameDecls with the CXXDeductionGuideDecl. Then we instantiate these declarations in FindInstantiatedDecl() as we build up the eventual deduction guide, using the mechanism introduced in D80743 Fixes llvm#111508

…/u> 2/1`" (llvm#111932) Reverts llvm#111284 to fix clang stage2 builds. Investigating... Failed buildbots: https://lab.llvm.org/buildbot/#/builders/76/builds/3576 https://lab.llvm.org/buildbot/#/builders/168/builds/4308 https://lab.llvm.org/buildbot/#/builders/127/builds/1087

I'm about to post a PR in this area.

…lvm#111793) - Improve the accuracy of fast pass' range reduction. - Provide tighter error estimations. - Reduce the table size when `LIBC_MATH_SMALL_TABLES` flag is set.

'status_path_' must include `tid`. Regression from llvm#111909.

Not doing this is wrong in general and we need to reject expressions where it would matter differently.

…spendThread` (llvm#111943) Allows distinguishing failure from stopped threads.

Co-authored-by: YunQiang Su <[email protected]>

Before the first reuse, after 2^32 threads they are equal.

Now we can pass `invalid tid`.

This change implements support of metadata strings in operand bundle values. It makes possible calls like: call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ] It requires some extension of the bitcode serialization. As SSA values and metadata are stored in different tables, there must be a way to distinguish them during deserialization. It is implemented by putting a special marker before the metadata index. The marker cannot be treated as a reference to any SSA value, so it unambiguously identifies metadata. It allows extending the bitcode serialization without breaking compatibility. Metadata as operand bundle values are intended to be used in floating-point function calls. They would represent the same information as now is passed by the constrained intrinsic arguments.

... and add getCtx (file->ctx). This allows InputSectionBase and OutputSection to access ctx without taking an extra function argument.

Apparently this can fail as well.

- Don't treat inline ASM as indirect calls - Remove call to alias testing, which was broken (only working by pure luck right now) and isn't needed anyway. GlobalOpt should take care of them for us.

Since Ctx &ctx is a member variable, 1f391a7 7a5b9ef e2f0ec3 can be reverted.

This allows us to emit wide generic and scratch memory accesses when we do not have alignment information. In cases where accesses happen to be properly aligned or where generic accesses do not go to scratch memory, this improves performance of the generated code by a factor of up to 16x and reduces code size, especially when lowering memcpy and memmove intrinsics. Also: Make the use of the FeatureUnalignedScratchAccess feature more consistent: FeatureUnalignedScratchAccess and EnableFlatScratch are now orthogonal, whereas, before, code assumed that the latter implies the former at some places. Part of SWDEV-455845.

…11662) Summary: `#pragma` and headers that finish with them shouldn't prevent `import "header_unit.h"` syntax. Test Plan: check-clang

…11956) This fixes the error message generated.

…#110976) This PR relocates the tests added in llvm#109435 to a new file named `no_lowering.mlir` and adds some new tests.

This PR makes the winograd.output_transform op a destination-style op and fixes handling of pre-existing data in its output argument (i.e. possibly pre-initialized with bias, which was discarded before). --------- Signed-off-by: Dmitriy Smirnov <[email protected]>

…lvm#108170) The other side has no way of telling which namespace these codes belong to, so mashing them all together is not very helpful. I'm mainly doing this to simplify some code in a pending patch <https://github.com/llvm/llvm-project/pull/106774/files#r1752628604>, and I've picked the posix error category semi-randomly. If we wanted to be serious about assigning meaning to these error codes, we should create a special error category for "gdb errors".

…111902) Fixed the error `unable to create target: 'No available targets are compatible with triple "x86_64-apple-macosx10.4.0"'` running `clang --target=x86_64-apple-macosx -c -gdwarf -o %t %s`.

llvm#109928) There are a number of places where we call getSmallConstantMaxTripCount without passing a vector of predicates: getSmallBestKnownTC isIndvarOverflowCheckKnownFalse computeMaxVF isMoreProfitable I've changed all of these to now pass in a predicate vector so that we get the benefit of making better vectorisation choices when we know the max trip count for loops that require SCEV predicate checks. I've tried to add tests that cover all the cases affected by these changes.

…' ops. (llvm#104783)" This reverts commit 0348373 and 99c8557, which is a fix-up on top of the former. I'm reverting because this commit broke two tests: mlir/test/python/integration/dialects/linalg/opsrun.py mlir/test/python/integration/dialects/transform.py See https://lab.llvm.org/buildbot/#/builders/138/builds/4872 I'm not familiar with the tests, so I'm leaving it to the original author to either remove or adapt the broken tests, as discussed here: llvm#104783 (comment)

@petrhosek

This PR introduces shared library (DSO) support for XRay based on a revised version of the implementation outlined in [this RFC](https://discourse.llvm.org/t/rfc-upstreaming-dso-instrumentation-support-for-xray/73000). The feature enables the patching and handling of events from DSOs, supporting both libraries linked at startup or explicitly loaded, e.g. via `dlopen`. This patch adds the following: - The `-fxray-shared` flag to enable the feature (turned off by default) - A small runtime library that is linked into every instrumented DSO, providing position-independent trampolines and code to register with the main XRay runtime - Changes to the XRay runtime to support management and patching of multiple objects These changes are fully backward compatible, i.e. running without instrumented DSOs will produce identical traces (in terms of recorded function IDs) to the previous implementation. Due to my limited ability to test on other architectures, this feature is only implemented and tested with x86_64. Extending support to other architectures is fairly straightforward, requiring only a position-independent implementation of the architecture-specific trampoline implementation (see `compiler-rt/lib/xray/xray_trampoline_x86_64.S` for reference). This patch does not include any functionality to resolve function IDs from DSOs for the provided logging/tracing modes. These modes still work and will record calls from DSOs, but symbol resolution for these functions is not available. Getting this to work properly requires recording information about the loaded DSOs and should IMO be discussed in a separate RFC, as there are multiple feasible approaches. @petrhosek @jplehr

Passing the quantities directly seems to work fine.

Both input and output of ballot are lane-masks: result is lane-mask with 'S32/S64 LLT and SGPR bank' input is lane-mask with 'S1 LLT and VCC reg bank'. Ballot copies bits from input lane-mask for all active lanes and puts 0 for inactive lanes. GlobalISel did not set 0 in result for inactive lanes for non-constant input.
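
The ballot behaviour described above can be modelled as a bitwise AND of two lane masks. This is only an illustrative sketch, not the GlobalISel lowering: `ballotModel` is a hypothetical name, and a 64-lane wave with one bit per lane is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Bit i of each mask corresponds to lane i (assumed 64-lane wave).
// Active lanes copy their bit from the input lane mask; inactive lanes read 0.
inline uint64_t ballotModel(uint64_t InputLaneMask, uint64_t ActiveLanes) {
  return InputLaneMask & ActiveLanes;
}
```

In this model, the bug described above corresponds to returning `InputLaneMask` unmasked when the input is not a constant.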

…1965) This lets the type conversion fail instead of generating invalid array types.

…111670) At present, alias analysis does not work for operations inside OMP target regions because the FIR declare operations within OMP target do not offer sufficient information for alias analysis. Consequently, it is necessary to examine the FIR code outside the OMP target region.

…rted to numerical type (llvm#111846) Pointer values cast to integer (non-pointer) type should be able to be subtracted as usual.

Fixes llvm#111934.

Any-of reductions are narrowed to i1. Update the legacy cost model to use the correct type when computing the cost of a phi that gets lowered to selects (BLEND). This fixes a divergence between legacy and VPlan-based cost models after 36fc291. Fixes llvm#111874.

Removes a dependency on LLVM in `xray_interface.cpp` by replacing `llvm_unreachable` with compiler-rt's `UNREACHABLE`. Applies clang-format to some unformatted changes. Original PR: llvm#90959

…ges (llvm#111562) The bulk of this change is new tests to check that we get a "Not yet implemented: *some stuff here*" message when using some not yet supported OpenMP functionality. For some of these cases, this also means adding additional clauses to a filter list in OpenMP.cpp - this changes nothing [to the best of my understanding] other than allowing the clause to get to the point where it can be rejected in a TODO with a clearer message. One of the TODO filters was missing the Mergeable clause, so this was also added and the existing test updated for the new more specific error message. There is no functional change intended here.

The original patch had a reasonably significant bug. You could not use `.insn` to assemble encodings that had any bits set above the low 32 bits. This is due to the fact that `getMachineOpValue` was truncating the immediate value, and I did not commit enough tests of useful cases. This changes the result of `getMachineOpValue` to be able to return the 48-bit and 64-bit immediates needed for the wider `.insn` directives. I took the opportunity to move some of the test cases around in the file to make looking at the output of `llvm-objdump` a little clearer.
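
The effect of the truncation is easy to reproduce in isolation. The sketch below is not the RISC-V emitter code: `encodeTruncated` and `encodeWide` are hypothetical names, and it is an assumption that the old path narrowed the value to 32 bits somewhere along the way. It only shows what happens to a 48-bit encoding in that case.

```cpp
#include <cassert>
#include <cstdint>

// Buggy shape (assumed): the immediate passes through a 32-bit value,
// so every bit above the low 32 is silently dropped.
inline uint64_t encodeTruncated(uint64_t Imm) {
  return static_cast<uint32_t>(Imm);
}

// Fixed shape: the full 64-bit value survives, which is what the 48-bit
// and 64-bit `.insn` forms need.
inline uint64_t encodeWide(uint64_t Imm) { return Imm; }
```

With the 48-bit value `0x123456789ABC`, the truncating version yields `0x56789ABC` while the wide version preserves all 48 bits.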

…#111538) Introduce a description of late forwarding to the Neoverse-V1 Scheduling model.

…m#90959)" This reverts commit a440203 and 4451f9f

…#111982) These are already in target specific test directories.

…lvm#111733) Add example to document that single statement `else` needs a brace if the associated `if` needs a brace.

…lvm#111752) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).

…rc.b (llvm#111828) This patch generalizes the DAG combine for `(sub (shl X, 8), X) => (orc.b X)` into the more general form of `(sub (shl X, 8 - Y), (srl X, Y)) => (orc.b X)`. Alive2 generalized proof: https://alive2.llvm.org/ce/z/dFcf_n Related issue: llvm#96595 Related PR: llvm#96680
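
The identity can also be spot-checked outside the compiler. The sketch below assumes the precondition (not stated in the message above) that `X` can only have bit `Y` set within each byte; `orcB` is a plain reference model of the instruction on 64-bit values, and both function names are introduced here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of orc.b: every non-zero byte becomes 0xFF, zero bytes stay 0.
inline uint64_t orcB(uint64_t X) {
  uint64_t Result = 0;
  for (int I = 0; I < 8; ++I) {
    uint64_t Byte = (X >> (8 * I)) & 0xFF;
    if (Byte != 0)
      Result |= uint64_t{0xFF} << (8 * I);
  }
  return Result;
}

// The matched pattern: (sub (shl X, 8 - Y), (srl X, Y)).
// For Y in [0, 7] the shift amounts stay in [1, 8], so both are well defined.
inline uint64_t subShlSrl(uint64_t X, unsigned Y) {
  return (X << (8 - Y)) - (X >> Y);
}
```

Under that precondition each set bit contributes `2^(8i+8) - 2^(8i)`, i.e. `0xFF` in byte `i`, which is why the subtraction reproduces `orc.b`; the linked Alive2 proof remains the authoritative check.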

This reverts commit 3f9998a. It breaks downstream tests with egregious numerical differences. Unfortunately no upstream tests are broken, but the fact that a prior iteration of the commit (pre-optimization) does work with our downstream tests (coming from the Triton repo) supports the claim that the final version of the commit is incorrect. Reverting now so that the original author can evaluate.

…llvm#111990) Fix build failure from the rename change. Looks like one additional reference sneaked in between pre-commit checks and the commit itself.

…n template calls. (llvm#111457)" See discussion in llvm#111711 This reverts commit 4dadf42.

…llvm#107350)" See discussion in llvm#111711 This reverts commit 224519b.

See discussion in llvm#111711 This reverts commit 6213aa5.

This reduces the total number of TableGen records produced by AMDGPU.td by about 6%.

It would be nice to see what our users think about this change, as this is something that WG21/EWG quite wants to fix a handful of questionable issues with UB. Depending on the outcome of this after being committed, we might instead suggest EWG undeprecate this, and require a bit of 'magic' from the lexer. Additionally, this patch makes it so we emit this diagnostic ALSO in cases where the literal name is reserved. It doesn't make sense to limit that. --------- Co-authored-by: Vlad Serebrennikov <[email protected]>

Fixes 0e91323 / llvm#111531 For reasons I can't explain, a clean build works fine for me, and all the bots are working fine. But if I rebuild in some way the make tool becomes None. Looking at the other variables, they had these extra lines so I've added those for make and it seems to solve the problem.

…lvm#111519) Lowering fixed-size BUILD_VECTORS without Neon may introduce stack spills, leading to more stores/reloads than if the stores were not merged. In some cases, it can also prevent using paired store instructions. In the future, we may want to relax when SVE is available, but currently, the SVE lowerings for BUILD_VECTOR are limited to a few specific cases.

Add a new enumeration `SuppressInlineNamespaceMode` to `PrintingPolicy` that is explicit about how to handle inline namespaces. `SuppressInlineNamespace` uses that enumeration now instead of a Boolean value. Specializing a template from an inline namespace should be transparent. For instance ``` namespace foo { inline namespace v1 { template<typename A> void function(A&); } } namespace foo { template<> void function<int>(int&); } ``` `hasName` should match both declarations of `foo::function`. Makes the behavior of `matchesNodeFullSlow` and `matchesNodeFullFast` consistent, fixing an assert inside `HasNameMatcher::matchesNode`.

…m#111824) This hasn't been used for several years, so it's effectively dead code at this point.

This improves the CI output by providing collapsible sections for sub-parts of our build. This was originally opened as llvm#75233. Co-authored-by: eric <[email protected]>

) Adds tests with scalable vectors for the Vector-To-LLVM conversion pass. Covers the following Ops: * vector.fma * vector.reduce

The purpose of this optimization is to make the VL argument, for instructions that have a VL argument, as small as possible. This is implemented by visiting each instruction in reverse order and checking that if it has a VL argument, whether the VL can be reduced. By putting this pass before VSETVLI insertion, we see three kinds of changes to generated code: 1. Eliminate VSETVLI instructions 2. Reduce the VL toggle on VSETVLI instructions that also change vtype 3. Reduce the VL set by a VSETVLI instruction The list of supported instructions is currently whitelisted for safety. In the future, we could add more instructions to `isSupportedInstr` to support even more VL optimization. We originally wrote this pass because vector GEP instructions do not take a VL, which leads us to emit code that uses VL=VLMAX to implement GEP in the RISC-V backend. As a result, some of the vector instructions will write to lanes, specifically between the intended VL and VLMAX, that will never be read. As an alternative to this pass, we considered adding a vector predicated GEP instruction, but this would not fit well into the intrinsic type system since GEP has a variable number of arguments, each with arbitrary types. The second approach we considered was to put this pass after VSETVLI insertion, but we found that it was more difficult to recognize optimization opportunities, especially across basic block boundaries -- the data flow analysis was also a bit more expensive and complex. While this pass solves the GEP problem, we have expanded it to handle more cases of VL optimization, and there is opportunity for the analysis to be improved to enable even more optimization. We have a few follow up patches to post, but figured this would be a good start. --------- Co-authored-by: Craig Topper <[email protected]> Co-authored-by: Kito Cheng <[email protected]>

The concept of a 'program point' in the original data flow framework is ambiguous. It can refer to either an operation or a block itself. This representation has different interpretations in forward and backward data-flow analysis. In forward data-flow analysis, the program point of an operation represents the state after the operation, while in backward data flow analysis, it represents the state before the operation. When using forward or backward data-flow analysis, it is crucial to carefully handle this distinction to ensure correctness. This patch refactors the definition of program point, unifying the interpretation of program points in both forward and backward data-flow analysis. How to integrate this patch? For dense forward data-flow analysis and other analysis (except dense backward data-flow analysis), the program point corresponding to the original operation can be obtained by `getProgramPointAfter(op)`, and the program point corresponding to the original block can be obtained by `getProgramPointBefore(block)`. For dense backward data-flow analysis, the program point corresponding to the original operation can be obtained by `getProgramPointBefore(op)`, and the program point corresponding to the original block can be obtained by `getProgramPointAfter(block)`. NOTE: If you need to get the lattice of other data-flow analyses in dense backward data-flow analysis, you should still use the dense forward data-flow approach. For example, to get the Executable state of a block in dense backward data-flow analysis and add the dependency of the current operation, you should write: ``getOrCreateFor<Executable>(getProgramPointBefore(op), getProgramPointBefore(block))`` In case above, we use getProgramPointBefore(op) because the analysis we rely on is dense backward data-flow, and we use getProgramPointBefore(block) because the lattice we query is the result of a non-dense backward data flow computation. 
Related discussion: https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8 Corresponding PSA: https://discourse.llvm.org/t/psa-program-point-semantics-change/81479

@nunoplopes

… to IEEE-754 (llvm#102140) Fixes llvm#60942: IEEE semantics is likely what many frontends want (it definitely is what Rust wants), and it is what LLVM passes already assume when they use APFloat to propagate float operations. This does not reflect what happens on x87, but what happens there is just plain unsound (llvm#89885, llvm#44218); there is no coherent specification that will describe this behavior correctly -- the backend in combination with standard LLVM passes is just fundamentally buggy in a hard-to-fix-way. There's also the questions around flushing subnormals to zero, but [this discussion](https://discourse.llvm.org/t/questions-about-llvm-canonicalize/79378) seems to indicate a general stance of: this is specific non-standard hardware behavior, and generally needs LLVM to be told that basic float ops do not return the standard result. Just naively running LLVM-compiled code on hardware configured to flush subnormals will lead to llvm#89885-like issues. AFAIK this is also what Alive2 implements (@nunoplopes please correct me if I am wrong).

…tInliningLastCallToStaticBonus` (llvm#111311) Currently we will not be able to inline a large function even if it only has one live use because the inline cost is still very high after applying `LastCallToStaticBonus`, which is a constant. This could significantly impact the performance because CSR spill is very expensive. This PR adds a new function `getInliningLastCallToStaticBonus` to TTI to allow targets to customize this value. Fixes SWDEV-471398.

While implementing a different clause, I discovered these placeholder clauses had their 'classof' implementation done incorrectly, so this fixes that.

We store these in a few places, so ensuring they are kept in a uint8_t will minimize the amount of storage on the stack.

Add support for specializing linalg.broadcast and linalg.transform from generic. Also, does some refactoring to reuse specialization checks, migrating some common uses to op interface methods.

/llvm-project/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp:125:21: error: unused function 'operator<<' [-Werror,-Wunused-function] static raw_ostream &operator<<(raw_ostream &OS, const OperandInfo &OI) { ^ 1 error generated.

This patch implements growing the DAG towards the top or bottom. This does the necessary dependency checks and adds new mem dependencies.

…nctions (llvm#111004) Avoid constructing recursive MCExpr definitions when multiple functions cause a recursion. Fixes llvm#110863

…ity and TargetSchedmodel (llvm#109818) Remove s_cbranch_execnz branches if the transformation is profitable according to `BranchProbability` and `TargetSchedmodel`.

…to Undefined Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate sections that would be discarded by GC. Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return false (when Defined) and then return true (been demoted to Undefined). ``` addScriptReferencedSymbolsToSymTable shouldAddProvideSym(f1): false // the RHS (bar) is not added to `referencedSymbols` and may be GCed declareSymbols shouldAddProvideSym(f1): false markLive demoteSymbolsAndComputeIsPreemptible // demoted f1 to Undefined processSymbolAssignments addSymbol shouldAddProvideSym(f1): true ``` The inconsistency can cause `cmd->expression()` in `addSymbol` to be evaluated, leading to `symbol not found: bar` errors (since `bar` in the RHS is not in `referencedSymbols` and is GCed) (llvm#111478). Fix this by adding a `sym->isUsedInRegularObj` condition, making `shouldAddProvideSym(f1)` values consistent. In addition, we need a `sym->exportDynamic` condition to keep provide-shared.s working. Fixes: ebb326a Pull Request: llvm#111945

It's very confusing to have support for Version 3 but not default to it. This patch teaches llvm-profdata to use MemProf version 3 by default.

The 'gang' clause is used to specify parallel execution of loops, thus has some complicated rules depending on the 'loop's associated compute construct. This patch implements all of those.

and implement them for dwim-print (a.k.a. `p`) as an example. The next step will be to expose them as structured data in SBCommandReturnObject.

Previously this would assert when attempting to getMutableData.

…tiple functions" (llvm#112013) Reverts llvm#111004

… JITLink.

Add tests showing potential to use PSHUFB for shifts of constant uniform values by using a pre-computed LUT of all legal shift amounts
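
The idea can be sketched in scalar form: when the value being shifted is a known constant, every legal shift amount has a precomputable result, so a byte shuffle can act as a table lookup. The code below is an illustration only, assuming 8-bit lanes and a left shift; `buildShlTable` and `pshufbLane` are hypothetical names, and the lane model follows the usual PSHUFB rule (low four index bits select, a set top bit zeroes the lane).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Precompute C << Amt for each legal shift amount (0..7) of an 8-bit lane.
// The remaining entries stay zero.
inline std::array<uint8_t, 16> buildShlTable(uint8_t C) {
  std::array<uint8_t, 16> Table{};
  for (unsigned Amt = 0; Amt < 8; ++Amt)
    Table[Amt] = static_cast<uint8_t>(C << Amt);
  return Table;
}

// One lane of a PSHUFB-style lookup: the per-lane shift amount is the index.
inline uint8_t pshufbLane(const std::array<uint8_t, 16> &Table, uint8_t Index) {
  return (Index & 0x80) ? 0 : Table[Index & 0x0F];
}
```

A full vector version would apply `pshufbLane` to every byte of the shift-amount vector at once, replacing the variable shift with a single shuffle.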

…t_category (llvm#111777) Use a word boundary, current code was currently failing when parsing the definition of because it would also match `CooperativeMatrixOp` from a later mention of `SPIRV_KHR_CooperativeMatrixOperandsAttr`.
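
The difference a word boundary makes can be shown with `std::regex`, used here purely as a stand-in for whatever the affected script does. The two identifiers come from the message above; the function names and the exact pattern are assumptions.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Without a boundary, the pattern also matches inside longer identifiers.
inline bool mentionsLoosely(const std::string &Text) {
  static const std::regex Pattern("CooperativeMatrixOp");
  return std::regex_search(Text, Pattern);
}

// With \b on both sides, only the standalone identifier matches.
inline bool mentionsWholeWord(const std::string &Text) {
  static const std::regex Pattern(R"(\bCooperativeMatrixOp\b)");
  return std::regex_search(Text, Pattern);
}
```

Because `_` and letters are both word characters, `\b` finds no boundary on either side of `CooperativeMatrixOp` inside `SPIRV_KHR_CooperativeMatrixOperandsAttr`, so the later mention no longer produces a false match.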

This commit adds parsing of type modifiers for the MAP clause: CLOSE, OMPX_HOLD, and PRESENT. The support for ALWAYS has already existed. The new modifiers are not yet handled in lowering: when present, a TODO message is emitted and compilation stops.

Many build bots are getting failures because of this: https://lab.llvm.org/buildbot/#/builders/140/builds/8600 https://lab.llvm.org/buildbot/#/builders/137/builds/6824 https://lab.llvm.org/buildbot/#/builders/140/builds/8600

Skipping another reoptimization test when target is not found.

) It checks that a Region can have non-contiguous instructions, and that when iterating through it you don't get the instructions in-between that aren't part of the Region.

llvm#110762) Consider llvm#109148: ```c++ template <typename ...Ts> void f() { [] { (^Ts); }; } ``` When we encounter `^Ts`, we try to parse a block and subsequently call `DiagnoseUnexpandedParameterPack()` (in `ActOnBlockArguments()`), which sees `Ts` and sets `ContainsUnexpandedParameterPack` to `true` in the `LambdaScopeInfo` of the enclosing lambda. However, the entire block is subsequently discarded entirely because it isn’t even syntactically well-formed. As a result, `ContainsUnexpandedParameterPack` is `true` despite the lambda’s body no longer containing any unexpanded packs, which causes an assertion the next time `DiagnoseUnexpandedParameterPack()` is called. This pr moves handling of unexpanded parameter packs into `CapturingScopeInfo` instead so that the same logic is used for both blocks and lambdas. This fixes this issue since the `ContainsUnexpandedParameterPack` flag is now part of the block (and before that, its `CapturingScopeInfo`) and no longer affects the surrounding lambda directly when the block is parsed. Moreover, this change makes blocks actually usable with pack expansion. This fixes llvm#109148.

…gument lists (llvm#106585, llvm#111173)" (llvm#111852) This patch reapplies llvm#111173, fixing a bug when instantiating dependent expressions that name a member template that is later explicitly specialized for a class specialization that is implicitly instantiated. The bug is addressed by adding the `hasMemberSpecialization` function, which returns `true` if _any_ redeclaration is a member specialization. This is then used when determining the instantiation pattern for a specialization of a template, and when collecting template arguments for a specialization of a template.

…uctions (llvm#112030)

This patch implements `Interval::comesBefore(const Interval &Other)` which returns true if this interval is strictly before Other in program order. The function asserts that the intervals are disjoint.
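
A minimal sketch of the described contract, assuming instructions can be represented by integer positions in program order. The real `Interval` works on instructions rather than integers; the struct name, the fields, and the `disjoint` helper below are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in: an interval covers positions [Top, Bottom], inclusive.
struct IntervalSketch {
  int Top;    // Position of the first instruction.
  int Bottom; // Position of the last instruction.

  bool disjoint(const IntervalSketch &Other) const {
    return Bottom < Other.Top || Other.Bottom < Top;
  }

  // True if this interval lies strictly before Other in program order.
  // Overlapping intervals are a precondition violation, as in the patch.
  bool comesBefore(const IntervalSketch &Other) const {
    assert(disjoint(Other) && "Expected disjoint intervals");
    return Bottom < Other.Top;
  }
};
```

Given the disjointness precondition, comparing one bottom against one top is enough; no element-by-element walk is needed.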

Create the `ReportedErrors` class to track the number of reported errors during verification. The class will block reporting errors if some other thread is currently reporting an error. I've encountered a case where there were many different verifications reporting errors at the same time on different threads. This ensures that we don't start printing the error from one case until we are completely done printing errors from other cases. Most of the time `AbortOnError = true` so we usually abort after reporting the first error. Depends on llvm#111602.
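
One way to picture the described behaviour is a counter guarded by a mutex that is held for the whole duration of a report. This is a sketch under that assumption, not the class from the patch: the name `ReportedErrorsSketch`, its members, and the choice to abort inside `report` are all hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <string>

class ReportedErrorsSketch {
  std::mutex ReportLock;     // Held while one thread prints its error.
  unsigned NumReported = 0;  // Guarded by ReportLock.
  bool ShouldAbort;

public:
  explicit ReportedErrorsSketch(bool AbortOnError) : ShouldAbort(AbortOnError) {}

  // Other threads block here until the current report is fully printed,
  // so messages from different verifications never interleave.
  void report(const std::string &Message) {
    std::lock_guard<std::mutex> Guard(ReportLock);
    ++NumReported;
    std::cerr << Message << '\n';
    if (ShouldAbort)
      std::abort();
  }

  unsigned count() {
    std::lock_guard<std::mutex> Guard(ReportLock);
    return NumReported;
  }
};
```

With aborting disabled, two sequential `report` calls leave `count()` at 2; with it enabled, the process stops after the first complete message.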

This patch implements the `ConvertToLLVMPatternInterface` for the OpenMP dialect, allowing `convert-to-llvm` to act on the OpenMP dialect.

The API was already using top()/bottom() but internally we were still using From/To. This patch fixes this. Top/Bottom seems a better choice because it implies program order, whereas From/To does not.

Builds on llvm#73789, enabling store clustering by default using the same heuristic.

When encountering an instruction like `if (p0) r0 = add(r0,##bar@GOT)`, lld would fail with: ``` ld.lld: error: unrecognized instruction for 16_X type: 0x7400C000 ``` This issue was encountered while building libreadline with clang 19.1.0. Fixes: llvm#111876

This is a non-functional change. Update the test script with the update_mc_test_check script and sort the test lines into alphabetical order. This helps keep the test file in a clean state.

Commits on Oct 10, 2024

Commits on Oct 11, 2024

Commits on Oct 14, 2024
