Test #2

This paper adds 'i' and 'j' as suffixes for forming a _Complex constant. This feature has been supported in Clang since at least Clang 3.0, so only test coverage is needed. It does remove -Wgnu-imaginary-constant in C mode (still used in C++ mode) because the feature is now a C2y feature rather than a GNU one.

… better formatting

When forcing emit zero, we need to skip terminators of a MBB; otherwise the terminator list of the MBB would be broken.

…ions (llvm#112148) Bug fix: `@autoreleasepool`, `@synchronized`, and `@finally` were not being noticed and treated as function effect violations. --------- Co-authored-by: Doug Wyatt <[email protected]>

The worker clause specifies iterations of the loop that are executed in parallel by distributing the iterations among the multiple workers within a single gang. The sema rules for this type are simply that it cannot be combined with a `kernel` construct with a `num_workers` clause, child `loop` clauses cannot contain a `gang` or `worker` clause, and that the argument is only allowed when associated with a `kernel`.

…as `constexpr`. (llvm#112129) Closes llvm#107985. LanguageExtensions.rst states that `__builtin_shufflevector` and `__builtin_convertvector` can be evaluated as constants, but this is not reflected in Builtins.td. This patch aligns these two.

…llvm#112112) Fixes: llvm#109385

…108931) 1. Propagate the nneg flag in WidenVecRes 2. Use SINT_TO_FP in expandUINT_TO_FP when possible.

Currently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt. While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind. We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous). With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one that is wrong), but I don't know of any such scenarios.

Check invoked tool with `starts_with`. Addresses the issue where `perf2bolt` invoked using a distro symlink `perf2bolt-16` fails to run in perf2bolt mode and runs in llvm-bolt mode instead. The issue is mentioned in https://vondra.me/posts/playing-with-bolt-and-postgres/

Test Plan:
```
ln -sf perf2bolt perf2bolt-20
perf2bolt-20 clang -p perf.data -o fdata.clang -w yaml.clang
...
PERF2BOLT: wrote 188593 objects and 0 memory objects to fdata.clang
```
Reviewers: ayermolo, rafaelauler, dcci, maksfb Reviewed By: maksfb Pull Request: llvm#111072
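A minimal sketch of the prefix check, assuming the tool picks its mode from the basename of `argv[0]`. The helper name `isPerf2BoltInvocation` is hypothetical, and the sketch uses `std::string_view` instead of LLVM's `StringRef`; it is not the code from the patch.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: returns true when the invoked tool name should select
// perf2bolt mode. A prefix match accepts versioned distro symlinks such as
// "perf2bolt-16", which an exact string comparison would reject.
inline bool isPerf2BoltInvocation(std::string_view Argv0) {
  // Strip directory components so only the basename is compared.
  const auto Slash = Argv0.find_last_of('/');
  if (Slash != std::string_view::npos)
    Argv0.remove_prefix(Slash + 1);
  constexpr std::string_view Tool = "perf2bolt";
  // Same result as a starts_with check, spelled out so it builds as C++17.
  return Argv0.substr(0, Tool.size()) == Tool;
}
```

With an exact comparison, `perf2bolt-16` would fall through to the default mode, which is the failure described above.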

…lvm#111659) Refer: 7.3.1 from [ISO SPEC](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf) I have added complex variants of F16 and F128 in the libc docs but have omitted support for them, since we will first have to investigate what their support matrix for clang and gcc looks like, and then add header guards for them accordingly. Planning to add them in follow-up PRs once this lands.

…m#111332) Currently, WebAssembly/WASI target does not provide direct support for code coverage. This patch set fixes several issues to unlock the feature. The main changes are: 1. Port `compiler-rt/lib/profile` to WebAssembly/WASI. 2. Adjust profile metadata sections for Wasm object file format. - [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections instead of data segments. - [lld] Align the interval space of custom sections at link time. - [llvm-cov] Copy misaligned custom section data if the start address is not aligned. - [llvm-cov] Read `__llvm_prf_names` from data segments 3. [clang] Link with profile runtime libraries if requested See each commit message for more details and rationale. This is part of the effort to add code coverage support in Wasm target of Swift toolchain.

For a block smaller than the page size, one block is unlikely to introduce more unused pages (at most 2 if it crosses the page boundary and both touched pages are unused). So it's better to apply the threshold to reduce the time spent scanning groups that can't release any new pages.
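The "at most 2" bound can be checked with a little arithmetic. The sketch below is illustrative only; `pagesTouched` is a hypothetical helper, not part of the allocator.

```cpp
#include <cassert>
#include <cstdint>

// Number of pages touched by the byte range [Begin, Begin + Size).
// Assumes Size > 0 and PageSize > 0.
inline std::uint64_t pagesTouched(std::uint64_t Begin, std::uint64_t Size,
                                  std::uint64_t PageSize) {
  assert(Size > 0 && PageSize > 0);
  const std::uint64_t FirstPage = Begin / PageSize;
  const std::uint64_t LastPage = (Begin + Size - 1) / PageSize;
  return LastPage - FirstPage + 1;
}
```

For any `Size` below `PageSize`, the first and last byte are less than one page apart, so the result is 1, or 2 when the block straddles a page boundary.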

Fix buildbot errors due to llvm#111659

Fixes llvm#83482

This adds support for these instructions and also tests getOperandInfo for these instructions as well.

This is a copy of llvm#97402(with minor updates), which is now ready to land. --------- Co-authored-by: Sergey Kozub <[email protected]>

This change is identical to 26800a2 ("[sanitizer] Undef _TIME_BITS along with _FILE_OFFSET_BITS on Linux"), but for sanitizer_procmaps_solaris.cpp. Indeed, even though sanitizer_procmaps_solaris.cpp is Solaris specific, it also gets built on Linux platforms. It also includes sanitizer_platform.h, which also ends up including features-time64.h, causing a build failure on 32-bit Linux platforms on which 64-bit time_t is enabled by setting _TIME_BITS=64. To fix this, we do the same change: undefine _TIME_BITS, which anyway will cause no harm as the rest of this file is inside a SANITIZER_SOLARIS compile-time conditional. Fixes: In file included from /home/thomas/buildroot/buildroot/output/host/i686-buildroot-linux-gnu/sysroot/usr/include/features.h:394, from ../../../../libsanitizer/sanitizer_common/sanitizer_platform.h:25, from ../../../../libsanitizer/sanitizer_common/sanitizer_procmaps_solaris.cpp:14: /home/thomas/buildroot/buildroot/output/host/i686-buildroot-linux-gnu/sysroot/usr/include/features-time64.h:26:5: error: #error "_TIME_BITS=64 is allowed only with _FILE_OFFSET_BITS=64" 26 | # error "_TIME_BITS=64 is allowed only with _FILE_OFFSET_BITS=64" | ^~~~~ Signed-off-by: Thomas Petazzoni <[email protected]> Closes: llvm#99699

This is a follow-up to llvm#108344. The original bailout check was overly strict, causing it to miss cases like the vector(initializer_list, allocator) constructor. This patch relaxes the check to address that issue. Fix llvm#111680

Ensures that struct padding is not skipped, as it may contain actual data if the struct is really a union. The patch originated from a discussion on llvm#53710 Fixes llvm#53710

…rective (llvm#112066) Fixes llvm#87717.

It doesn't make a difference currently, but MTE globals are only supported on Android, so that's the more natural target to use.

This addresses an issue introduced in llvm#112041.

…iption (llvm#112057) When NewInterval is below DAGInterval we used to revisit instructions already visited. This patch fixes this by separating the scan in two: 1. The full scan of the NewInterval, and 2. The cross-interval scan for DAGInterval. This is further explained in the new description.

…12252) Reverts llvm#112014 The change didn't update the iterator

If a pseudo has a passthru, I believe the first source operand will have operand no 2, not 1.

Fixes llvm#110122 - Create remap_file_pages.h/.cpp wrapper for the linux sys call. - Add UnitTests for remap_file_pages - Add function to libc/spec/linux.td - Add Function spec to mman.yaml

This reverts commit 4a0dc3e. Breaks tests, see comments on llvm#108802

…#112235)

CFPropertyListCreateXMLData has been deprecated since macOS 10.10. Use CFPropertyListCreateData instead.

…2257) After llvm#80007 Fuchsia builds are now always building cxx_shared for arm64 and x64 Linux. Ultimately, this is because the LIBCXX_ENABLE_SHARED is not used in compiler-rt to select the correct libc++ target, and because cxx_shared is now always defined, it is selected as a dependency when building runtimes tests. --------- Co-authored-by: Petr Hosek <[email protected]>

…112275) Fix `BUILD_SHARED_LIBS=ON` build for d4efc3e

We usually override operand related fields in the class body instead of at the top level.

…m#112254)

) c.srli(64) and c.srai(64) are encoded differently than c.slli(64). The former have a 3-bit register, while the latter has a 5-bit register. c.srli and c.srai already use RVInst16CB. The "let Inst{11-10} =" prevented this from causing any functional issues by dropping the upper 2 bits of the register. The ins/outs list uses GPRC so the register class is constrained.
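As an illustration of the 3-bit field: the compressed register fields rd'/rs1' can only name x8 through x15, stored as the register number minus 8. For that range this equals keeping only the low 3 bits of the 5-bit register number, which is why dropping the upper 2 bits caused no functional issue. `encodeCompressedReg` is a hypothetical helper for this sketch, not a function from the backend.

```cpp
#include <cassert>

// Encode a register number for a 3-bit compressed register field.
// Only x8..x15 are representable; they map to 0..7.
inline unsigned encodeCompressedReg(unsigned Reg) {
  assert(Reg >= 8 && Reg <= 15 && "register not encodable in 3 bits");
  return Reg - 8;
}

// Truncating the 5-bit register number to its low 3 bits gives the same
// value for every register in x8..x15.
inline unsigned dropUpperTwoBits(unsigned Reg) { return Reg & 0x7u; }
```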

…1409)" This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame #2: {{.*}}`main

As discussed in llvm#109284 Copied msan tests from 64-bit platforms to the following 32-bit platforms: * MIPS * ARM * RISCV * PowerPC * i386 Most of the tests have been copied from mips64. Target triple and test contents have not been changed: to be done in the next PR. --------- Co-authored-by: Kamil Kashapov <[email protected]>

Fixes the following error: ``` clang-tools-extra/docs/ReleaseNotes.rst:247: WARNING: unknown document: 'clang-tidy/checks/readability/readability-identifier-naming' [ref.doc] ```

This patch implements the UnscheduledSuccs counter in DGNode. It counts the number of unscheduled successors and is used by the scheduler to determine when a node is ready.
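A minimal sketch of how such a counter can drive readiness, assuming a bottom-up scheduler in which a node becomes ready once all of its successors are scheduled. `Node`, `addEdge`, and `schedule` are hypothetical stand-ins, not the real `DGNode` interface.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a dependency-graph node.
struct Node {
  std::vector<Node *> Preds;     // nodes that this node depends on
  unsigned UnscheduledSuccs = 0; // successors that are not scheduled yet
  bool Scheduled = false;

  // Ready once every successor has been scheduled.
  bool ready() const { return !Scheduled && UnscheduledSuccs == 0; }
};

// Record that Succ depends on Pred.
inline void addEdge(Node &Pred, Node &Succ) {
  Succ.Preds.push_back(&Pred);
  ++Pred.UnscheduledSuccs;
}

// Schedule N and notify its predecessors by decrementing their counters.
inline void schedule(Node &N) {
  assert(N.ready() && "scheduling a node that is not ready");
  N.Scheduled = true;
  for (Node *P : N.Preds) {
    assert(P->UnscheduledSuccs > 0);
    --P->UnscheduledSuccs;
  }
}
```

The counter avoids rescanning all successors each time the scheduler asks whether a node is ready.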

) This allows IDEs to render LLDB expression diagnostics to their liking without relying on character-precise ASCII art from LLDB. It is exposed as a versioned SBStructuredData object, since it is expected that this may need to be tweaked based on actual usage.

…s in C" (llvm#109898) (llvm#110051) This reverts commit d50eaac. Also fixes a bug calculating offsets for bit fields in the original patch.

…lvm#112282)

…es (llvm#112048)

/llvm-project/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.h:231:10: error: unused variable 'Inserted' [-Werror,-Wunused-variable] auto Inserted = MemPreds.insert(PredN).second; ^ 1 error generated.

Update the TestCTF Makefile to remove the DWARF 5 sections.

…lvm#112260) Remove support for ASL (Apple System Log) which has been deprecated since macOS 10.12. Fixes the following warnings: warning: 'asl_new' is deprecated: first deprecated in macOS 10.12 - os_log(3) has replaced asl(3) warning: 'asl_set' is deprecated: first deprecated in macOS 10.12 - os_log(3) has replaced asl(3) warning: 'asl_vlog' is deprecated: first deprecated in macOS 10.12 - os_log(3) has replaced asl(3)

…/u> 2/1` (llvm#111284)` (llvm#111998) Relands llvm#111284. Test failure with stage2 build has been fixed by llvm#111946. Some targets have better codegen for `ctpop(X) u< 2` than `ctpop(X) == 1`. After llvm#100899, we set the range of ctpop's return value to indicate the argument/result is non-zero. This patch converts `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` in CGP to fix llvm#95255.
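The equivalence behind the rewrite can be checked directly: when X is known to be non-zero, its population count is at least 1, so "equal to 1" and "unsigned less than 2" agree. This is a sketch with hypothetical helper names, using a portable loop in place of the `ctpop` intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count (clears the lowest set bit per iteration).
inline unsigned popCount(std::uint32_t X) {
  unsigned N = 0;
  for (; X != 0; X &= X - 1)
    ++N;
  return N;
}

// Original form: exactly one bit set.
inline bool isOneBitViaEq(std::uint32_t X) { return popCount(X) == 1; }

// Rewritten form: valid only when X is known to be non-zero.
inline bool isOneBitViaUlt(std::uint32_t X) {
  assert(X != 0 && "rewrite requires a non-zero argument");
  return popCount(X) < 2u;
}
```

For X equal to 0 the two forms differ (`0 == 1` is false, `0 < 2` is true), which is why the range information on ctpop's result matters.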

…lvm#111977) This commit essentially reverts https://reviews.llvm.org/D30453. In llvm#109961, objcopy util search code was added to dotest.py. dotest.py should use llvm-X by default if no path to a utility X is provided externally. However, it doesn't work out for llvm-objcopy, since objcopy path is always overridden with the lines being removed here. It causes a problem with cross-platform testing when objcopy used by cmake doesn't support targets/executable file formats other than native. I suppose these lines are unnecessary after llvm#109961, so they can be safely removed.

This revision adds hyperbolic trigonometric sinh, cosh, and tanh intrinsic ops.

…112242)

…aced init (llvm#109505)" This reverts commit 98281da which caused a regression. Fixes llvm#112176.

…111967) - V_SET_INACTIVE is always in WWM/WQM so can be treated like any other operation in WWM/WQM. - After encountering SI_SPILL_S32_TO_VGPR loop should bypass to avoid double processing its defs.

Windows NT/MIPS and Windows CE/MIPS always used COFF format. This is an extract of PR llvm#107744.

Add the MIPS COFF relocation types. They will be needed to add support for MIPS Windows object file. This is an extract of PR llvm#107744.

The spec refers to the field as rd'/rs1' so we might as well name the destination rd.

The operation will be used in the CUF constructor to register the kernel functions. This allows delaying the registration until codegen, when the gpu.binary will be available.

Reverts llvm#112268

…d sf.v.xvw instructions (llvm#111630) The instruction has the constraint, but the pseudo instruction is missing.

…#112291) This is an assembly only alias for c.nop.

…112252) (llvm#112266) This reverts commit 037938d. Fixed the iterator to avoid infinite loop

…ers (llvm#112177) When parsing its function parameters, we don't change the CurContext to the lambda's function declaration. However, CheckIfAnyEnclosingLambdasMustCaptureAnyPotentialCaptures() has not yet adapted to such behavior when nested lambdas come into play. Consider the following case, struct Foo {}; template <int, Foo f> struct Arr {}; constexpr void foo() { constexpr Foo F; [&]<int I>() { [&](Arr<I, F>) {}; }.template operator()<42>(); } As per [basic.def.odr]p5.2, the use of F constitutes an ODR-use. And per [basic.def.odr]p10, F should be ODR-usable in that interleaving scope. We failed to accept the case because the call to tryCaptureVariable() in getStackIndexOfNearestEnclosingCaptureCapableLambda() suggested that F is needlessly captureable. That was due to a missed handling for AfterParameterList in FunctionScopeIndexToStopAt, where it still presumed DC and LSI matched. Fixes llvm#47400 Fixes llvm#90896

Fixes: llvm#110190

(reland with fixed sed command for macos) Handle the `!callees` metadata to further reduce the amount of indirect call cases that end up conservatively assuming that any indirectly callable function is a potential target.

`APFloat::APFloat(const fltSemantics &Semantics, integerPart I)` interprets 'I' as an unsigned integer. Fix the bug found in llvm#112113 (comment).
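The pitfall can be reproduced without LLVM. The sketch below assumes a 64-bit unsigned parameter standing in for `integerPart`; `fromUnsignedPart` is a hypothetical function, not the `APFloat` constructor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a constructor that takes an unsigned integer
// part: the argument is always read as an unsigned value.
inline double fromUnsignedPart(std::uint64_t I) {
  return static_cast<double>(I);
}

// What a caller passing a signed value usually intends.
inline double fromSigned(std::int64_t I) { return static_cast<double>(I); }
```

Passing `-1` to `fromUnsignedPart` converts it to the largest 64-bit unsigned value first, so the result is roughly 1.8e19 rather than -1.0.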

Use explicit target and stop restricting hosts it can run on.

…12238) Comparing their PrimTypes isn't enough in this case. We can have a floating cast here as well.

The [last attempt](llvm#89036) to fix llvm#41441 was reverted immediately. Here I'm trying the simplest idea I've been able to come up with: skip handling the dependent case in `BuildCXXNew`. The original test (borrowed from llvm#89036) passes. Also I've created and added to the tests a minimal repro of the code llvm#89036 fails on. This (obviously) also passes.

…112316) Use const pointers for various `Init` objects. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…terializations (llvm#112128) This commit adds an optional `originalType` parameter to target materialization functions. Without this parameter, target materializations are underspecified. Note: `originalType` is only needed for target materializations. Source/argument materializations do not have it. Consider the following example: Let's assume that a conversion pattern "P1" replaced an SSA value "v1" (type "t1") with "v2" (type "t2"). Then a different conversion pattern "P2" matches an op that has "v1" as an operand. Let's furthermore assume that "P2" determines that the legalized type of "t1" is "t3", which may be different from "t2". In this example, the target materialization callback will be invoked with: outputType = "t3", inputs = "v2", originalType = "t1". Note that the original type "t1" cannot be recovered from just "t3" and "v2"; that's why the `originalType` parameter is added. This change is in preparation of merging the 1:1 and 1:N dialect conversion drivers. As part of that change, argument materializations will be removed (as they are no longer needed; they were just a workaround because of missing 1:N support in the dialect conversion). The new `originalType` parameter is needed when lowering MemRef to LLVM. During that lowering, MemRef function block arguments are replaced with the elements that make up a MemRef descriptor. The type converter is set up in such a way that the legalized type of a MemRef type is an `!llvm.struct` that represents the MemRef descriptor. When the bare pointer calling convention is enabled, the function block arguments consist of just an LLVM pointer. In such a case, a target materialization will be invoked to construct a MemRef descriptor (output type = `!llvm.struct<...>`) from just the bare pointer (inputs = `!llvm.ptr`). The original MemRef type is required to construct the MemRef descriptor, as static sizes/strides/offset cannot be inferred from just the bare pointer.

Add an additional adaptor constructor that copies everything except for the values. The values are provided by a second parameter. This commit is in preparation of merging the 1:1 and 1:N dialect conversions. As part of that, a new `matchAndRewrite` function is added. For details, see this RFC: https://discourse.llvm.org/t/rfc-merging-1-1-and-1-n-dialect-conversions/82513
```c++
template <typename SourceOp>
class OpConversionPattern : public ConversionPattern {
public:
  using OneToNOpAdaptor =
      typename SourceOp::template GenericAdaptor<ArrayRef<ArrayRef<Value>>>;

  virtual LogicalResult
  matchAndRewrite(SourceOp op, OneToNOpAdaptor adaptor,
                  ConversionPatternRewriter &rewriter) const {
    SmallVector<Value> oneToOneOperands =
        getOneToOneAdaptorOperands(adaptor.getOperands());
    // This OpAdaptor constructor is added by this commit.
    return matchAndRewrite(op, OpAdaptor(oneToOneOperands, adaptor), rewriter);
  }
};
```

…vm#112172) This adds computeKnownBits support for vector->vector G_UNMERGE_VALUES, grabbing the known bits with an adjusted DemandedElts mask.

This makes sure that APInt::getAllOnes() keeps working after the APInt constructor assertions are enabled. I'm relaxing the requirement for the signed case to either an all zeros or all ones integer. This is basically saying that we can interpret the zero-width integer as either positive or negative.

This fixes a problem introduced in llvm#80094. That PR copied negative features from the TargetMachine to the end of the feature string. This is not correct, because even if we have a baseline TM of say `-simd128`, but a function with `+simd128`, the coalesced feature string should have `+simd128`, not `-simd128`. To address the original motivation of that PR, we should instead explicitly materialize the negative features in the target feature string, so that explicitly disabled default features are honored. Unfortunately, there doesn't seem to be any way to actually test this using llc, because `-mattr` appends the specified features to the end of the `"target-features"` attribute. I've tested this locally by making it prepend the features instead.
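The ordering problem can be sketched with a toy resolver. It assumes that entries of a comma-separated feature string are applied left to right, so a later entry for the same feature overrides an earlier one; `resolveFeatures` is a hypothetical helper, not LLVM's feature handling.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Resolve a feature string such as "-simd128,+simd128" into a map from
// feature name to enabled/disabled. Assumption: later entries win.
inline std::map<std::string, bool> resolveFeatures(const std::string &S) {
  std::map<std::string, bool> Enabled;
  std::stringstream SS(S);
  std::string Item;
  while (std::getline(SS, Item, ',')) {
    // Skip entries that are not of the form "+name" or "-name".
    if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
      continue;
    Enabled[Item.substr(1)] = (Item[0] == '+');
  }
  return Enabled;
}
```

Under that assumption, appending the baseline `-simd128` after a function's `+simd128` disables the feature, while placing the baseline first lets the function-level setting take effect.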

9ad72df added split of _BitInt constants when required. Before folding back, check that the constant exists.

Implementation of `computeCost()` function for `VPReductionRecipe`. Note that `in-loop` and `any-of` reductions are not supported by VPlan-based cost model currently.

When investigating PR llvm#101634, it turned out that `UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp` isn't Linux-specific at all. In fact, none of the `ubsan/TestCases/Misc/Linux` tests are. Therefore this patch moves them to `Misc/Posix` instead. Tested on `sparc64-unknown-linux-gnu`, `sparcv9-sun-solaris2.11`, `x86_64-pc-linux-gnu`, and `amd64-pc-solaris2.11`.

When testing on Linux/sparc64 with a `runtimes` build, the `UBSan-Standalone-sparc :: TestCases/Misc/Linux/sigaction.cpp` test `FAIL`s: ``` runtimes/runtimes-bins/compiler-rt/test/ubsan/Standalone-sparc/TestCases/Misc/Linux/Output/sigaction.cpp.tmp: error while loading shared libraries: libclang_rt.ubsan_standalone.so: wrong ELF class: ELFCLASS64 ``` It turns out SPARC needs the same `LD_LIBRARY_PATH` handling as x86. This is what this patch does, at the same time noticing that the current duplication between `lit.common.cfg.py` and `asan/Unit/lit.site.cfg.py.in` isn't necessary. Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.

This PR would fix llvm#16855. The correct lookup to use for class names is Tag name lookup, because it does not take namespaces into account. The previous lookup does, and because of this some valid programs are not accepted. An example of a valid program being rejected is a struct (let's call it `y`) inheriting from another struct named `x` while `y` is in a namespace that is also called `x`:
```
struct x {};
namespace {
namespace x {
struct y : x {};
}
}
```
This shall be accepted because:
```
C++ [class.derived]p2 (wrt lookup in a base-specifier): The lookup for
// the component name of the type-name or simple-template-id is type-only.
```

Inspired by https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423

Some typos are also fixed. Address llvm#112067 (review).

Write dedicated tests for foldSelectValueEquivalence, demonstrating that it does not perform many GVN-like replacements when: - the comparison is a vector-type - the comparison is a floating-point type as a prelude to fixing these deficiencies.

LDV could reorder reinserted fragment and non-fragment debug values for the same variable (compared to the input order), potentially resulting in stale values being presented. For example, before: DBG_VALUE 1001, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16) DBG_VALUE 1002, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16) DBG_VALUE %0, $noreg, !13, !DIExpression() After (without this patch): DBG_VALUE %stack.0, 0, !13, !DIExpression() DBG_VALUE 1002, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 16, 16) DBG_VALUE 1001, $noreg, !13, !DIExpression(DW_OP_LLVM_fragment, 0, 16) It would also reorder DBG_VALUEs for different variables. Although that does not matter for the debug information output, it resulted in some noise in before/after pass diffs. This should hopefully align so that instruction referencing and DBG_VALUE emit debug instructions in the same order (see the sdag-salvage-add.ll change).

…to scalar Constant{Int,FP}. (llvm#111005) This fixes a failure path when the use-constant-##-for-###-splat IR options are enabled.

foldSelectEquivalence currently doesn't support GVN-like replacements on vector types. Put in the checks for potentially lane-crossing operations, and lift the limitation.

…inked with libc++ statically (llvm#98694) This is to fix buildbot failure https://lab.llvm.org/staging/#/builders/195/builds/4255. The test expects 'libstdc++' or 'libc++' SO module in the module list. In case when static linking with libc++ is on by default, none of them may be present. Thus, USE_SYSTEM_STDLIB is added to ensure the presence of any of them. --------- Co-authored-by: Vladimir Vereschaka <[email protected]>

This patch simplifies the representation of OpenMP loop wrapper operations by introducing the `NoTerminator` trait and updating accordingly the verifier for the `LoopWrapperInterface`. Since loop wrappers are already limited to having exactly one region containing exactly one block, and this block can only hold a single `omp.loop_nest` or loop wrapper and an `omp.terminator` that does not return any values, it makes sense to simplify the representation of loop wrappers by removing the terminator. There is an extensive list of Lit tests that needed updating to remove the `omp.terminator`s adding some noise to this patch, but actual changes are limited to the definition of the `omp.wsloop`, `omp.simd`, `omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion and OpenMP dialect documentation.

…2335)

) This patch adds extra class declarations to the `omp.declare_reduction` and `omp.private` operations to access the entry block arguments defined by their regions. Some existing accesses to these arguments are updated to use the new named methods to improve code readability.

…m#110855) The iterator passed to `fixupCalleeSaveRestoreStackOffset` may be incorrect when it tries to skip over the instructions that get the current value of 'vg', when there is a 'rdsvl' instruction straight after the prologue. That's because it doesn't check that the instruction is still a 'frame-setup' instruction.

This PR adds a `__sanitizer_copy_contiguous_container_annotations` function, which copies annotations from one memory area to another. New area is annotated in the same way as the old region at the beginning (within limitations of ASan). Overlapping case: The function supports overlapping containers, however no assumptions should be made outside of no false positives in new buffer area. (It doesn't modify old container annotations where it's not necessary, false negatives may happen in edge granules of the new container area.) I don't expect this function to be used with overlapping buffers, but it's designed to work with them and not result in incorrect ASan errors (false positives). If buffers have granularity-aligned distance between them (`old_beg % granularity == new_beg % granularity`), copying algorithm works faster. If the distance is not granularity-aligned, annotations are copied byte after byte. ```cpp void __sanitizer_copy_contiguous_container_annotations( const void *old_storage_beg_p, const void *old_storage_end_p, const void *new_storage_beg_p, const void *new_storage_end_p) { ``` This function aims to help with short string annotations and similar container annotations. Right now we change trait types of `std::basic_string` when compiling with ASan and this function purpose is reverting that change as soon as possible. https://github.com/llvm/llvm-project/blob/87f3407856e61a73798af4e41b28bc33b5bf4ce6/libcxx/include/string#L738-L751 The goal is to not change `__trivially_relocatable` when compiling with ASan. If this function is accepted and upstreamed, the next step is creating a function like `__memcpy_with_asan` moving memory with ASan. And then using this function instead of `__builtin__memcpy` while moving trivially relocatable objects. 
https://github.com/llvm/llvm-project/blob/11a6799740f824282650aa9ec249b55dcf1a8aae/libcxx/include/__memory/uninitialized_algorithms.h#L644-L646 --- I'm wondering whether there is a good way to address the fact that in a container the new buffer is usually bigger than the previous one. We may add two more arguments to the function to address it (the beginning and the end of the whole buffer). Another potential change is removing `new_storage_end_p` as it's redundant, because we require the same size. Potential future work is creating a function `__asan_unsafe_memmove`, which will be basically memmove, but with instrumentation turned off (therefore it will allow copying data from a poisoned area). --------- Co-authored-by: Vitaly Buka <[email protected]>
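The fast-path condition quoted above (`old_beg % granularity == new_beg % granularity`) can be written as a small predicate. This is a sketch only; `sameGranuleOffset` is a hypothetical name, and the granularity of 8 used in the usage note is an assumption about a typical ASan configuration.

```cpp
#include <cassert>
#include <cstdint>

// True when both buffers start at the same offset within a shadow granule,
// so annotations can be copied granule by granule instead of byte by byte.
inline bool sameGranuleOffset(std::uintptr_t OldBeg, std::uintptr_t NewBeg,
                              std::uintptr_t Granularity) {
  assert(Granularity != 0);
  return OldBeg % Granularity == NewBeg % Granularity;
}
```

For example, with a granularity of 8, buffers starting at `0x1000` and `0x2008` satisfy the condition, while `0x1000` and `0x2003` do not.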

) Adds tests with scalable vectors for the Vector-To-LLVM conversion pass. Covers the following Ops: * `vector.transfer_read`, * `vector.transfer_write`. In addition: * Duplicate tests from "vector-mask-to-llvm.mlir" are removed. * Tests for xfer_read/xfer_write are moved to a newly created test file, "vector-xfer-to-llvm.mlir". This follows an existing pattern among VectorToLLVM conversion tests. * Tests that test both xfer_read and xfer_write have their names updated to capture that (e.g. @transfer_read_1d_mask -> @transfer_read_write_1d_mask) * @transfer_write_1d_scalable_mask and @transfer_read_1d_scalable_mask are re-written as @transfer_read_write_1d_mask_scalable. This is to make it clear that this case is meant to complement @transfer_read_write_1d_mask. * @transfer_write_tensor is updated to also test xfer_read.

…m#112342) In recent PR llvm#111531 for Windows support, we enabled tests that require the `make` tool. On Windows, default install directories likely contain spaces, in this case e.g. `C:\Program Files (x86)\GnuWin32\bin\make.exe`. It's typically handled well by CMake, so that today invocations from `dotest.py` don't cause issues. However, we also have nested invocations from a number of Makefiles themselves. These still failed if the path to the `make` tool contains spaces. This patch attempts to fix the functionalities/completion test by adding quotes in the respective Makefile. If it keeps passing on the bots, we can roll out the fix to all affected tests.

Adds MLIR to LLVM lowering support for `target ... nowait`. This leverages the already existing code-gen patterns for `task` by treating `target ... nowait` as `task ... if(1)` and `target` (without `nowait`) as `task ... if(0)`; similar to what clang does.

…eof-expression` (llvm#111178) In some cases and for projects that deal with a lot of low-level buffers, a pattern often emerges that an array and its full size, not in the number of "elements" but in "bytes", are known with no syntax-level connection between the two values. To access the array elements, the pointer arithmetic involved will have to divide 'SizeInBytes' (a numeric value) with `sizeof(*Buffer)`. Since the previous patch introduced this new warning, potential false-positives were triggered from `bugprone-sizeof-expression`, as `sizeof` appeared in pointer arithmetic where integers are scaled. This patch adds a new check option, `WarnOnOffsetDividedBySizeOf`, which allows users to opt out of warning about the division case. In arbitrary projects, it might still be worthwhile to get these warnings until an opt-out from the detection of scaling issues, especially if a project might not be using low-level buffers intensively.
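The pattern in question looks roughly like the sketch below: the extent is known only in bytes, so it is divided by `sizeof(*Buffer)` before being used in pointer arithmetic. `lastElement` is a hypothetical function written for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the last element of a buffer whose extent is known only in bytes.
inline std::uint32_t lastElement(const std::uint32_t *Buffer,
                                 std::size_t SizeInBytes) {
  assert(SizeInBytes >= sizeof(*Buffer));
  // The division converts bytes to elements; pointer arithmetic then scales
  // the element count by the element size again, so the net offset is right.
  const std::uint32_t *End = Buffer + SizeInBytes / sizeof(*Buffer);
  return *(End - 1);
}
```

Because `sizeof` appears inside pointer arithmetic here, code of this shape is the kind of division case that the `WarnOnOffsetDividedBySizeOf` option lets users stop warning about.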

…ximum patterns in addition to the existing ISD nodes." (llvm#112203) This patch adds icmp+select patterns for integer min/max matchers in SDPatternMatch, similar to those in IR PatternMatch. Reapply llvm#111774. Closes llvm#108218.

…111514) In replaceVPBBWithIRVPBB we spend time erasing and appending predecessors and successors from a list, when all we really have to do is replace the old with the new. Not only is this more efficient, but it also preserves the ordering of successors and predecessors. This is something which may become important for vectorising early exit loops (see PR llvm#88385), since a VPIRInstruction is the wrapper for a live-out phi with extra operands that map to the incoming block according to the block's predecessor.
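The ordering difference can be seen with a plain vector standing in for a successor list. This is an illustrative sketch with integer block IDs and hypothetical function names, not the VPlan data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Old approach: erase the old entry, then append the new one at the end.
inline std::vector<int> eraseAndAppend(std::vector<int> Succs, int Old,
                                       int New) {
  Succs.erase(std::remove(Succs.begin(), Succs.end(), Old), Succs.end());
  Succs.push_back(New);
  return Succs;
}

// New approach: overwrite the old entry where it sits.
inline std::vector<int> replaceInPlace(std::vector<int> Succs, int Old,
                                       int New) {
  std::replace(Succs.begin(), Succs.end(), Old, New);
  return Succs;
}
```

Starting from `{1, 2, 3}` and replacing 1 with 9, the first variant yields `{2, 3, 9}` while the second yields `{9, 2, 3}`, so only the second keeps the position that operands indexed by predecessor order rely on.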

The 'vector' clause specifies the iterations to be executed in vector or SIMD mode. There are some limitations on which associated compute contexts may be associated with this and have arguments, but otherwise this is a fairly unrestricted clause. It DOES have region limits like 'gang' and 'worker'.

…or completeness Noticed while reviewing llvm#110760

…for completeness Noticed while reviewing llvm#110760

This PR applies the changes discussed in [[RFC] Rationale for Flang AliasAnalysis pointer component logic](https://discourse.llvm.org/t/rfc-rationale-for-flang-aliasanalysis-pointer-component-logic/79252). In summary, this PR replaces the existing pointer component logic in Flang's AliasAnalysis implementation. That logic focuses on aliasing between pointers and non-pointer, non-target composites that have pointer components. However, it is more conservative than necessary, and some existing tests expect its current results when less conservative results seem reasonable. This PR splits the logic into two cases: 1. Source values are the same: Return MayAlias when one value is the address of a composite, and the other value is statically the address of a pointer component of that composite. 2. Source values are different: Return MayAlias when one value is the address of a composite (actual argument), and the other value is the address of a pointer (dummy arg) that might dynamically be a component of that composite. In both cases, the actual implementation is still more conservative than described above, but it can be improved further later. Details appear in the comments. Additionally, this PR revises the logic that reports MayAlias for a pointer/target vs. another pointer/target. It constrains the existing logic to handle only isData cases, and it adds less conservative handling of !isData cases elsewhere. First, it extends case 2 listed above to cover the case where the actual argument is the address of a pointer rather than a composite. Second, it adds a third case: where target attributes enable aliasing with a dummy argument.

…2253) A reduction instruction always has a passthru operand, so the scalar operand should always be vs1 which is at index 3. Even though the destination operand is also scalar, I think the passthru will need to preserve all elements so I haven't included it.

Sometimes users (especially longtime GDB users) accidentally use GDB syntax, such as `breakpoint foo`, and get an error message from LLDB that says only `Invalid command "breakpoint foo"`, which is not very helpful. This change provides additional suggestions to help correct the mistake.

… comments

- **[Inliner] Add tests for bad propagation of access attr for `byval` param; NFC** - **[Inliner] Don't propagate access attr to `byval` params** We previously only handled the case where the `byval` attr was in the callbase's param attr list. This PR also handles the case where `ByVal` is a param attr on the function's param attr list.

…s not use them

Previously, SFINAE constraints and exception specification propagation were missing in the return type of libc++'s `std::mem_fn`. The requirements on expression-equivalence (or even plain "equivalent" in pre-C++20 specification) in [func.memfn] actually require them. This PR adds the missing pieces. Fixes llvm#86043. Drive-by changes: - removing no longer used `__invoke_return`, - updating synopsis comments in several files, and - merging several test files for `mem_fn` into one.

…vm#112165) `overload_compare_iterator` only supports operations required for forward iterators. On the other hand, it is used as the output iterator of uninitialized memory algorithms, which requires it to be a forward iterator. As a result, `overload_compare_iterator<I>::iterator_category` should always be `std::forward_iterator_tag` if we don't extend its ability. The correct `iterator_category` prevents standard library implementations such as MSVC STL from attempting random access operations on `overload_compare_iterator`. Fixes llvm#74756.
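To illustrate why the advertised category matters, here is a minimal sketch of tag dispatch. The names `forward_only_iterator`, `count_steps`, and `checkDispatch` are hypothetical and are not taken from the libc++ test suite; the sketch only assumes that a library may pick an algorithm based on `iterator_category`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical iterator that supports only forward-iterator operations.
template <class T>
struct forward_only_iterator {
  using iterator_category = std::forward_iterator_tag; // advertise only what is supported
  using value_type = T;
  using difference_type = std::ptrdiff_t;
  using pointer = T*;
  using reference = T&;

  forward_only_iterator() = default;
  explicit forward_only_iterator(T* p) : ptr(p) {}

  reference operator*() const { return *ptr; }
  forward_only_iterator& operator++() { ++ptr; return *this; }
  forward_only_iterator operator++(int) { forward_only_iterator tmp = *this; ++ptr; return tmp; }
  friend bool operator==(forward_only_iterator a, forward_only_iterator b) { return a.ptr == b.ptr; }
  friend bool operator!=(forward_only_iterator a, forward_only_iterator b) { return a.ptr != b.ptr; }

private:
  T* ptr = nullptr;
};

// Random-access path: uses `last - first`, which forward_only_iterator lacks.
template <class It>
std::ptrdiff_t count_steps(It first, It last, std::random_access_iterator_tag) {
  return last - first;
}

// Forward path: only needs ++ and !=.
template <class It>
std::ptrdiff_t count_steps(It first, It last, std::forward_iterator_tag) {
  std::ptrdiff_t n = 0;
  for (; first != last; ++first) ++n;
  return n;
}

// Dispatch on the category the iterator advertises.
template <class It>
std::ptrdiff_t count_steps(It first, It last) {
  return count_steps(first, last,
                     typename std::iterator_traits<It>::iterator_category{});
}

static_assert(std::is_same<std::iterator_traits<forward_only_iterator<int>>::iterator_category,
                           std::forward_iterator_tag>::value,
              "the iterator must advertise the forward category");

inline void checkDispatch() {
  int data[3] = {1, 2, 3};
  assert(count_steps(data, data + 3) == 3); // raw pointers take the random-access path
  assert(count_steps(forward_only_iterator<int>(data),
                     forward_only_iterator<int>(data + 3)) == 3); // forward path
}
```

If `forward_only_iterator` advertised `std::random_access_iterator_tag` instead, the dispatcher would select the first overload and the program would fail to compile because `operator-` is missing.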

… [nfc] (llvm#111592) Previously, the cost model was returning an invalid cost. This simply moves the check from one place to another. This is mostly to make the cost modeling code a bit easier to follow. --------- Co-authored-by: Mel Chen <[email protected]>

…gment intrinsics (llvm#111476) Indexed segment load/store intrinsics don't have SEW information encoded in the name, so we need to get the information from its pointer type argument at runtime.

…SSLANE check prefixes

…king for freely-concatable subvectors Fixes llvm#111611

…s` (llvm#112358) Fix llvm#112356.

…llvm#110324) Check name conflicts between intrinsics caused by mangling suffix. If the base name of an overloaded intrinsic is a proper prefix of another intrinsic, check if the other intrinsic name suffix after the proper prefix can match a mangled type and issue an error if it can.

…#112307) InsertPosition has been deprecated in favor of using BasicBlock::iterator. (See llvm#102608)

…d. NFC. Noticed while triaging llvm#112347 which is using this fold - we described the or->and fold, but not the equivalent and->or which is also handled.

…ands Don't fold AND(X,OR(NOT(Z),C)) -> AND(X,NOT(AND(Z,C'))) as DAGCombiner will invert it back again. Fixes llvm#112347

…ductions Enables initial non-power-of-2 support (but still requires number of elements, forming whole registers) for reductions. Enables extra vectorization for MultiSource/Benchmarks/7zip/7zip-benchmark, CINT2006/464.h264ref and CFP2017rate/526.blender_r (checked for SSE2) Reviewers: RKSimon Reviewed By: RKSimon Pull Request: llvm#112361

…st type in C23 constexpr (llvm#112211)
```cpp
const int V33 = 4;
const int V34 = 0;
const int V35 = 2;
constexpr int V36 = V33 / V34;
// expected-error@-1 {{constexpr variable 'V36' must be initialized by a constant expression}}
constexpr int V37 = V33 / V35;
// expected-error@-1 {{constexpr variable 'V37' must be initialized by a constant expression}}
```
--------- Signed-off-by: yronglin <[email protected]>

Remove the redundant copy of the parameter by moving it instead: change `explicit OwningMemoryBlock(MemoryBlock M) : M(M) {}` to `explicit OwningMemoryBlock(MemoryBlock M) : M(std::move(M)) {}` Fixes: llvm#95640

…a constant value. (llvm#112113) This patch adds support for constant folding for the `log1p` and `log1pf` libc functions.

The inaccurate llvm#111945 condition fixes a PROVIDE regression (llvm#111478) but introduces another regression: in a DSO link, if a symbol referenced only by bitcode files is defined as PROVIDE_HIDDEN, lld would not set the visibility correctly, leading to an assertion failure in DynamicReloc::getSymIndex (https://reviews.llvm.org/D123985). This is because `(sym->isUsedInRegularObj || sym->exportDynamic)` is initially false (bitcode undef does not set `isUsedInRegularObj`) then true (in `addSymbol`, after LTO compilation). Fix this by making the condition accurate: use a map to track defined symbols. Reviewers: smithp35 Reviewed By: smithp35 Pull Request: llvm#112386

…ranch/insertBranch functions (llvm#110653) This PR fixes implementation of `SPIRVInstrInfo::analyzeBranch()` and adds implementations of `SPIRVInstrInfo::removeBranch()` and `SPIRVInstrInfo::insertBranch()` to support Branch Folding and If Conversion optimization. The attached test case failed before this PR due to report_fatal_error() firing on missing implementation of `SPIRVInstrInfo::removeBranch()`. The new test case is not able to pass spirv-val check at the moment due to the issue described in llvm#110652 , this is not related to this PR. This PR also updates instructions definition in tablegen to set isBranch=1 for relevant instructions.

…ns for indirect calls using SPV_INTEL_function_pointers (llvm#111159) This PR improves implementation of SPV_INTEL_function_pointers and type inference for phi-nodes and indirect calls.

…sion (llvm#112359) This PR implements support of the SPV_INTEL_split_barrier SPIRV extension (https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_split_barrier.asciidoc) and adds builtins from https://registry.khronos.org/OpenCL/extensions/intel/cl_intel_split_work_group_barrier.html

…cking (llvm#112376)

…, __sptr, __uptr for AArch64 (llvm#111879) MSVC has a set of qualifiers to allow using 32-bit signed/unsigned pointers when building 64-bit targets. This is useful for WoW code (i.e., the part of Windows that handles running 32-bit application on a 64-bit OS). Currently this is supported on x64 using the 270, 271 and 272 address spaces, but does not work for AArch64 at all. This change adds the same 270, 271 and 272 address spaces to AArch64 and adjusts the data layout string accordingly. Clang will generate the correct address space casts, but these will currently be ignored until the AArch64 backend is updated to handle them. Partially fixes llvm#62536 This is a resurrected version of <https://reviews.llvm.org/D158857> (originally created by @a_vorobev) - I've cleaned it up a little, fixed the rest of the tests and added to auto-upgrade for the data layout.

…ion in `Diagnostic` (llvm#112154) Diagnostic stores various notes/error messages which might help the user in debugging. For the most part, the `Diagnostic`, when receiving an error message, will copy and own the contents of the string. However, there is one optimization where, given a `const char*`, the class will assume this is a StringLiteral which is immutable and whose lifetime matches that of the entire program. As a result, instead of copying the message in these cases the class will simply store the underlying pointer. This is problematic since `const char*` is not specific enough to always imply a StringLiteral, which can lead to bugs, e.g. if the underlying pointer is freed before the diagnostic reports. We solve this problem by choosing a more specific function signature. While not foolproof, this should cover a lot more cases. A potentially better alternative is just deleting this special handling of string literals, but I am unsure of the implications (it does sound safe to do, however, with a negligible impact on performance).

This would consistently fail for me locally, to the point where I could not run ninja libc-unit-tests without ninja libc_setjmp_unittests failing. Turns out that since I enabled -ftrivial-auto-var-init=pattern in commit 1d5c16d ("[libc] default enable -ftrivial-auto-var-init=pattern (llvm#78776)") this has been a problem. Our x86_64 setjmp definition disabled -Wuninitialized, so we wound up clobbering these registers and instead backing up 0xAAAAAAAAAAAAAAAA rather than the actual register value. Use `naked` function attribute to avoid function prolog/epilog.

If `cvta.param` is used in regular functions, it may produce an invalid pointer. It's unclear if it's a bug in ptxas or we're not using `cvta.param` correctly, but, regardless of the underlying reason, the instruction has to be disabled for non-kernels, at least for now.

I find multiline 'RUN:' statements hard to read. ` *\\\n; RUN: *` -> ` ` for ./llvm/test/Instrumentation/

This PR updates the document for `vector.splat`, specifying that the operand type must match the element type of the result.

Check as early as possible for the deleted instructions before trying to vectorize the code. May reduce number of attempts for the vectorization.

…lvm#112088) Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The ordering of the operands in both the fcmp and select instructions is important for the folding to occur. maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult} The second pattern is supposed to make the order of the operands in the select instruction irrelevant. However, the pattern matching code uses the CmpInst::getInversePredicate method to invert the comparison predicate. This method doesn't take into account the fast-math flags, which can lead to missing the folding opportunity. The patch extends the pattern matching code to handle unordered fcmp instructions. This allows the folding to occur even when the select instruction has the operands in the inverse order. New maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt} The same changes are applied to the minnum intrinsic.
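The patterns listed above can be checked on scalars with a small sketch. This is not InstCombine code: the predicate helpers below are hypothetical C++ models of the LLVM IR fcmp predicates, and the check assumes non-NaN inputs (standing in for nnan) and compares with `!=`, so a difference in the sign of zero is ignored (standing in for nsz).

```cpp
#include <cassert>
#include <cmath>

// Scalar models of fcmp predicates. Ordered predicates are false when an
// operand is NaN; unordered predicates are true in that case.
inline bool ogt(double a, double b) { return a > b; }
inline bool oge(double a, double b) { return a >= b; }
inline bool ole(double a, double b) { return a <= b; }
inline bool olt(double a, double b) { return a < b; }
inline bool ugt(double a, double b) { return !(a <= b); }
inline bool uge(double a, double b) { return !(a < b); }
inline bool ule(double a, double b) { return !(a > b); }
inline bool ult(double a, double b) { return !(a >= b); }

using Pred = bool (*)(double, double);

// (a op b) ? a : b
inline double selectSame(Pred op, double a, double b) { return op(a, b) ? a : b; }
// (a op b) ? b : a
inline double selectSwapped(Pred op, double a, double b) { return op(a, b) ? b : a; }

// Returns true if all eight maxnum patterns agree with std::fmax on (a, b).
inline bool allPatternsMatchMax(double a, double b) {
  assert(!std::isnan(a) && !std::isnan(b)); // models the nnan requirement
  const double expected = std::fmax(a, b);
  const Pred same[] = {ogt, oge, ugt, uge};    // (a op b) ? a : b
  const Pred swapped[] = {ule, ult, ole, olt}; // (a op b) ? b : a
  for (Pred p : same)
    if (selectSame(p, a, b) != expected) return false;
  for (Pred p : swapped)
    if (selectSwapped(p, a, b) != expected) return false;
  return true;
}
```

Without NaNs, each ordered predicate and its unordered counterpart give the same answer, which is why the fold can accept either form once nnan is known.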

The operation will be used in the CUF constructor to register the kernel functions. This allows registration to be delayed until codegen, when the gpu.binary will be available. Reland of llvm#112268 with correct shared library build support.

Previously you could cast between bigints with different numbers of bits, but only if they had the same underlying type. This patch adds the ability to cast between bigints with different underlying types, which is needed for llvm#110894

The pack_paddings attribute in the structured.pad TD Op is used to set the `nofold` attribute in the generated tensor.pad Op. The current name is confusing and suggests that there's a relation with the tensor.pack Op. This patch renames it to `nofold_flags` to better match the actual usage.

…ns (llvm#109406)

Obsolete since opaque pointers.

…lvm#112413)

… request. (llvm#112396) Adjusting the name from `lldb-dap startDebugging` to `lldb-dap start-debugging` to improve consistency with other names for commands in lldb/lldb-dap.

Summary: I'm going to attempt to move the `rpc.h` header to a separate folder that we can install and include outside of `libc`. Before doing this I'm going to try to trim up the file so there's not as many things I need to copy to make it work. This dependency on `cpp::functional` is a low hanging fruit. I only did it so that I could overload the argument of the work function so that passing the id was optional in the lambda, that's not a *huge* deal and it makes it more explicit I suppose.

Implement computing costs for VPInterleaveRecipe. PR: llvm#106067

… SA checker emitter (llvm#112321) Use const pointers for various Init objects in SA checker emitter. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…) for reductions" This reverts commit 8287fa8 to investigate and fix compile time regressions reported by https://llvm-compile-time-tracker.com/compare.php?from=ec78f0da0e9b1b8e2b2323e434ea742e272dd913&to=8287fa8e596d8fc8655c8df3bc99e068ad9f7d4b&stat=instructions:u

Fixes: llvm-project/libc/src/setjmp/x86_64/setjmp.cpp:21:25: error: ‘int __llvm_libc_19_0_0_git::setjmp(__jmp_buf*)’ specifies less restrictive attribute than its target ‘int __llvm_libc_19_0_0_git::__setjmp_impl__(__jmp_buf*)’: ‘nothrow’ [-Werror=missing-attributes] 21 | LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) { | ^~~~~~ observed in the GCC build by manually expanding LLVM_LIBC_FUNCTION to add `gnu::nothrow` to the alias. We probably need to revisit adding nothrow throughout our declarations, so there is probably a better way to clean this up in the future. Link: llvm#88054

The patch llvm#98694 was not enough. This test still fails on the buildbot https://lab.llvm.org/staging/#/builders/195/builds/4438 Use `USE_LIBSTDCPP := 1` instead for non-Darwin OSes and skip the test if libstdc++.so is missing.

…iters (llvm#112418) When pipelining an `scf.for` with dynamic loop bounds, the epilogue ramp-down must align with the prologue when num_stages > total_iterations. For example:
```
scf.for (0..ub) {
  load(i)
  add(i)
  store(i)
}
```
When num_stages=3 the pipeline follows:
```
load(0) - add(0) - scf.for (0..ub-2) - store(ub-2)
load(1) - - add(ub-1) - store(ub-1)
```
The trailing `store(ub-2)`, `i=ub-2`, must align with the ramp-up for `i=0` when `ub < num_stages-1`, so the index `i` should be `max(0, ub-2)` and each subsequent index is an increment. The predicate must also handle this scenario, so it becomes `predicate[0] = total_iterations > epilogue_stage`.

This reverts commit bec839d. Caused buildbot failures, e.g. https://lab.llvm.org/buildbot/#/builders/52/builds/2928

This change enables `DynamicLoaderDarwin` to load modules in parallel using the thread pool. This new behavior is controlled by a new setting `plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load`, which is enabled by default. When disabled, DynamicLoaderDarwin will load modules sequentially as before.

In DwarfTransformer::verify(), line number information is retrieved for each address using: auto DwarfInlineInfos = DICtx.getInliningInfoForAddress(SectAddr, DLIS); Later in the loop, another such invocation was made before: Gsym->dump(Log, *FI); There is a continue after that, DwarfInlineInfos does not affect the dump() invocation, and I am not aware of any side effects that are needed from the extra getInliningInfoForAddress() invocation. Tests pass without it, so just remove it.

…o convert HLSL types to DirectX target types (llvm#110327) Translates `RWBuffer` and `StructuredBuffer` resource buffer types to DirectX target types `dx.TypedBuffer` and `dx.RawBuffer`. Includes a change of `HLSLAttributesResourceType` from 'sugar' type to full canonical type. This is required for codegen and other clang infrastructure to work properly on HLSL resource types. Fixes llvm#95952 (part 2/2)

…t arguments (llvm#112047) Fixes llvm#68596.

Use VPWidenIntrinsicRecipe (llvm#110486) to create vp.select intrinsics. This potentially offers an alternative to duplicating EVL recipes for all existing recipes. There are some recipes that will need duplicates (at least at the moment), due to extra code-gen needs (e.g. widening loads and stores). But in cases where the intrinsic can be used directly, creating the widened intrinsic directly would reduce the need to duplicate some recipes. PR: llvm#110489

…vm#112243) The assumption that a symbol is either `Defined` or `Undefined` is not always true for some cases. For example, `mangleMaybe` may create a weak alias to a lazy archive symbol.

Use PPC `MatchRegisterName()` that is auto generated by table gen.

Update the CUF constructor with the cuf.register_kernel operations.

…m#111922) Type interoperability warnings are currently issued for intrinsic types when their type, kind, or length do not meet the requirements for C interoperability. This turns out to be too noisy for the case of one-byte characters with lengths other than one when creating C pointers from C_LOC or C_F_POINTER -- it is not uncommon for programs to use pointers to longer character objects. So split the interoperability warning so that the case of a known bad character length for an otherwise interoperable type is controlled by its own UsageWarning enumerator, and leave that usage warning off by default. This will better fit expectations in the default case while still showing a warning under -pedantic.

Nearly every Fortran compiler supports "PRINT namelistname" as a synonym for "WRITE (*, NML=namelistname)". Implement this extension via parse tree rewriting. Fixes llvm#111738.

…lvm#112054) It is possible for the compiler to emit an impossible error message about dummy argument character length incompatibility in the case of a MODULE SUBROUTINE or FUNCTION defined later in a submodule with MODULE PROCEDURE, when the character length is defined by USE association in its interface. The checking for separate module procedure interface compatibility needs to use a more flexible check than just operator== on a semantics::ParamValue.

Move the ErfcScaled template function from the runtime into a new header file in flang/include/Common, then use it in constant folding to implement folding for the erfc_scaled() intrinsic function.

When running fixed-form source through the compiler under -E, don't aggressively remove space characters, since the parser won't be parsing the result and some tools might need to see the spaces in the -E preprocessed output. Fixes llvm#112279.

An important part of the test is to have the correct `ThreadDescriptorSize` after `InitTlsSize()`. It's not a problem if another test called `InitTlsSize()` before. Fixes llvm#112399.

If we run all tests in a single process, there is a high probability that `99` is already claimed.

…m#112393) The aim is to have the same set of promotions on fixed-length bf16 vectors as on fixed-length f16 vectors, and then deduplicate them similarly to what was done for scalable vectors. It looks like fneg/fabs/fcopysign end up getting expanded because fsub is now legal, and the default operation action must be expand.

…dboxVec pass pipelines. (llvm#112288) My previous attempt (llvm#111904) hacked creation of Regions from metadata into the bottom-up vectorizer. I got some feedback that it should be its own pass. So now we have two SandboxIR function passes (`BottomUpVec` and `RegionsFromMetadata`) that are interchangeable, and we could have other SandboxIR function passes doing other kinds of transforms, so this commit revamps pipeline creation and parsing. First, `sandboxir::PassManager::setPassPipeline` now accepts pass arguments in angle brackets. Pass arguments are arbitrary strings that must be parsed by each pass, the only requirement is that nested angle bracket pairs must be balanced, to allow for nested pipelines with more arguments. For example:
```
bottom-up-vec<region-pass-1,region-pass-2<arg>,region-pass-3>
```
This has complicated the parser a little bit (the loop over pipeline characters now contains a small state machine), and we now have some new test cases to exercise the new features. The main SandboxVectorizerPass now contains a customizable pipeline of SandboxIR function passes, defined by the `sbvec-passes` flag. Region passes for the bottom-up vectorizer pass are now in pass arguments (like in the example above). Because we now have several classes that can build sub-pass pipelines, I've moved the logic that interacts with PassRegistry.def into its own files (PassBuilder.{h,cpp}) so it can be easily reused. Finally, I've added a `RegionsFromMetadata` function pass, which will allow us to run region passes in isolation from lit tests without relying on the bottom-up vectorizer, and a new lit test that does exactly this. Note that the new pipeline parser now allows empty pipelines. This is useful for testing. For example, if we use
```
-sbvec-passes="bottom-up-vec<>"
```
SandboxVectorizer converts LLVM IR to SandboxIR and runs the bottom-up vectorizer, but no region passes afterwards.
```
-sbvec-passes=""
```
SandboxVectorizer converts LLVM IR to SandboxIR and runs no passes on it. This is useful to exercise SandboxIR conversion on its own.
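As a rough illustration of the balanced-angle-bracket rule described above, the following C++17 sketch splits a pipeline string at top-level commas and keeps each pass's argument text verbatim. It is not the SandboxIR parser: `PassSpec`, `splitPipeline`, and `checkPipelineSplit` are hypothetical names, whitespace is not handled, and the error cases it rejects are assumptions of this sketch.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// One pipeline entry: the pass name and its raw, still unparsed arguments.
struct PassSpec {
  std::string Name;
  std::string Args;
};

// Splits at commas that are outside all angle brackets. Returns std::nullopt
// if the brackets are unbalanced or text follows a closed "name<...>".
inline std::optional<std::vector<PassSpec>> splitPipeline(std::string_view Pipeline) {
  std::vector<PassSpec> Result;
  PassSpec Cur;
  int Depth = 0;           // current angle-bracket nesting level
  bool ClosedArgs = false; // the entry's outermost '>' has been seen
  auto Flush = [&] {
    if (!Cur.Name.empty()) Result.push_back(std::move(Cur));
    Cur = PassSpec{};
    ClosedArgs = false;
  };
  for (char C : Pipeline) {
    if (C == '<') {
      if (Depth == 0 && (Cur.Name.empty() || ClosedArgs)) return std::nullopt;
      if (Depth > 0) Cur.Args += C; // nested bracket belongs to the arguments
      ++Depth;
    } else if (C == '>') {
      if (Depth == 0) return std::nullopt; // closing bracket without an opener
      --Depth;
      if (Depth > 0) Cur.Args += C; else ClosedArgs = true;
    } else if (C == ',' && Depth == 0) {
      Flush(); // top-level separator
    } else if (Depth > 0) {
      Cur.Args += C;
    } else {
      if (ClosedArgs) return std::nullopt;
      Cur.Name += C;
    }
  }
  if (Depth != 0) return std::nullopt; // missing closing bracket
  Flush();
  return Result;
}

inline void checkPipelineSplit() {
  auto Top = splitPipeline("bottom-up-vec<region-pass-1,region-pass-2<arg>,region-pass-3>");
  assert(Top && Top->size() == 1);
  assert((*Top)[0].Name == "bottom-up-vec");
  // Each pass parses its own argument string; here it is a nested pipeline.
  auto Inner = splitPipeline((*Top)[0].Args);
  assert(Inner && Inner->size() == 3);
  assert((*Inner)[1].Name == "region-pass-2" && (*Inner)[1].Args == "arg");
}
```

In this sketch an empty string yields an empty list of passes, and `bottom-up-vec<>` yields one pass with an empty argument string, mirroring the two empty-pipeline cases mentioned in the description.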

This PR adds missing `sched.group.barrier` and `rocdl.iglp.opt` ops to the ROCDL dialect (see [here](https://github.com/llvm/llvm-project/blob/ec78f0da0e9b1b8e2b2323e434ea742e272dd913/clang/include/clang/Basic/BuiltinsAMDGPU.def#L66-L68)). The ops are converted to the corresponding intrinsic calls during the translation from MLIR to LLVM IR. These intrinsics are hints to the instruction scheduler of the AMDGPU backend.

These were inadvertently changed in llvm#112393

I just introduced a dependency from the Evaluate library to the Semantics library, which is circular in a shared library build. Rearrange the code a little to ensure that the dependence is only on a header.

Half-precision floating point (16-bit) implementation of the trigonometric function Sin for inputs scaled by pi

…Emitter (llvm#112317) Use const pointers for various Init objects in NeonEmitter. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

…nostic Emitter (llvm#112318) Use const pointers for various Init objects in Diagnostic Emitter. This is a part of effort to have better const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089

As discussed in llvm#111911, we have consensus that as it stands, the system log is only meaningful on Darwin and that by default it should be a NOOP on other platforms.

@lhames

… factory & constructor (llvm#112419) We can get a reference to the `ExecutionSession` from the `ObjectLinkingLayer` argument, so there's no need to pass it in separately. This mirrors recent changes to `ElfNixPlatform` and `MachOPlatform` by @lhames in llvm@3dba4ca and llvm@cc20dd2.

Fixes bug where a device that supports tagged pointers doesn't use the tagged pointer when computing the checksum. Add tests to verify that double frees result in chunk state error not corrupted header errors.

This adds a minor change to command-expr-diagnostics.test to make it pass on Windows. Clang produces PDB on Windows by default, which was ignoring the main symbol due to optimization. The problem is fixed by adding -gdwarf to the command line, making sure DWARF debug info gets generated on both Windows and Linux.

…2109)" This reverts commit eca3206. This broke the LLDB Linux bot for no apparent reason. I'll post a more suitable fix later. Disabled command-expr-diagnostics.test on Windows for now.

This patch adds support for R_X86_64_SIZE32/R_X86_64_SIZE64 relocation types by introducing edge kinds x86_64::Size32/x86_64::Size64. The calculation for these relocations is: Z + A, where: Z - Represents the size of the symbol whose index resides in the relocation entry. A - Represents the addend used to compute the value of the relocation field. Ref: [System V Application Binary Interface x86-64](https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build)
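The `Z + A` calculation can be sketched as follows. This is not the JITLink implementation: `size64Value`, `size32Value`, and `checkSizeRelocs` are hypothetical helpers, and the 32-bit range check is an assumption of this sketch rather than something stated in the patch description.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// 64-bit size relocation value: Z (symbol size) + A (addend).
inline std::uint64_t size64Value(std::uint64_t SymbolSize, std::int64_t Addend) {
  // Unsigned arithmetic wraps modulo 2^64, so a negative addend subtracts.
  return SymbolSize + static_cast<std::uint64_t>(Addend);
}

// 32-bit size relocation value: the same sum, rejected here if it does not
// fit in 32 bits (assumed behavior for this sketch).
inline std::optional<std::uint32_t> size32Value(std::uint64_t SymbolSize,
                                                std::int64_t Addend) {
  const std::uint64_t V = size64Value(SymbolSize, Addend);
  if (V > UINT32_MAX) return std::nullopt;
  return static_cast<std::uint32_t>(V);
}

inline void checkSizeRelocs() {
  assert(size64Value(16, 4) == 20);  // Z = 16, A = 4
  assert(size64Value(16, -8) == 8);  // a negative addend shrinks the value
  assert(size32Value(16, 4).value() == 20u);
  assert(!size32Value(0x100000000ULL, 0).has_value()); // too large for 32 bits
}
```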

- create a clang built-in in Builtins.td
- add semantic checking in SemaHLSL.cpp
- link the WaveReadLaneAt api in hlsl_intrinsics.h
- add lowering to spirv backend op GroupNonUniformShuffle with Scope = 2 (Group) in SPIRVInstructionSelector.cpp
- add WaveReadLaneAt intrinsic to IntrinsicsDirectX.td and mapping to DXIL.td
- add tests for HLSL intrinsic lowering to spirv intrinsic in WaveReadLaneAt.hlsl
- add tests for sema checks in WaveReadLaneAt-errors.hlsl
- add spir-v backend tests in WaveReadLaneAt.ll
- add test to show scalar dxil lowering functionality
- note that this doesn't include scalarizer support for WaveReadLaneAt, which will be added in a future pr

This is the first part of llvm#70104

in favor of LLVM_ENABLE_RUNTIMES

Hi there, When building llvm-libc on the openEuler system, I encountered an issue as shown in the image below: ![image](https://github.com/user-attachments/assets/75667de4-5bea-4a95-be28-ed34db0e05b9) This issue happens because the regular expression used in `libc/cmake/modules/LLVMLibCArchitectures.cmake`: `string(REGEX MATCH "Target: [-_a-z0-9.]+[ \r\n]+")` does not handle capital letters properly in `openEuler`. To fix this, I modified the regular expression to: `string(REGEX MATCH "Target: [-_a-zA-Z0-9.]+[ \r\n]+")`. This change makes it compatible with capital letters.
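The effect of the character-class change can be reproduced with a small sketch. It uses `std::regex` (ECMAScript syntax) rather than CMake's regex engine, which is an approximation, and the target triple in the example is a hypothetical value chosen to contain a capital letter; `matchesTargetLine` and `checkTargetRegex` are illustrative names only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the compiler output contains a "Target: <triple>" line that
// the chosen pattern accepts.
inline bool matchesTargetLine(const std::string& Output, bool AllowUppercase) {
  const std::regex Pattern(AllowUppercase ? "Target: [-_a-zA-Z0-9.]+[ \r\n]+"
                                          : "Target: [-_a-z0-9.]+[ \r\n]+");
  return std::regex_search(Output, Pattern);
}

inline void checkTargetRegex() {
  // Hypothetical triple with a capital letter in the vendor field.
  const std::string Mixed = "Target: x86_64-openEuler-linux-gnu\n";
  const std::string Lower = "Target: x86_64-unknown-linux-gnu\n";
  assert(!matchesTargetLine(Mixed, false)); // lowercase-only class stops at 'E'
  assert(matchesTargetLine(Mixed, true));   // widened class accepts the triple
  assert(matchesTargetLine(Lower, false));  // all-lowercase triples always matched
}
```

The lowercase-only pattern fails because the triple must be followed immediately by whitespace, and the match cannot get past the capital letter to reach it.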

This aligns with GCC. LoongArch kernel developers requested that this option generate some corresponding relations in a section, including the addresses of the jump instruction(jr) and the `MachineJumpTableEntry`. Reviewed By: heiher Pull Request: llvm#102411

This patch adds support for forced loading of archive members, similar to the behavior of the -all_load and -ObjC options in ld64. To enable this, the StaticLibraryDefinitionGenerator class constructors are extended with a VisitMember callback that is called on each member file in the archive at generator construction time. This callback can be used to unconditionally add the member file to a JITDylib at that point. To test this the llvm-jitlink utility is extended with -all_load (all platforms) and -ObjC (darwin only) options. Since we can't refer to symbols in the test objects directly (these would always cause the member to be linked in, even without the new flags) we instead test side-effects of force loading: execution of constructors and registration of Objective-C metadata. rdar://134446111

Fix gcc warning: lld/ELF/SymbolTable.cpp:340:33: warning: enumeral and non-enumeral type in conditional expression [-Wextra]

* This avoids the need to call printAsOperand that requires use of an ostream and thus avoids a str copy. * ModuleSlotTracker is used to get a BB # for BB's without names when dumping SuspendCrossingInfo and materialization info. * getBasicBlockLabel() is changed to dumpBasicBlockLabel() that directly prints the label to dbgs() * The label corresponds with the print-before BB #s. * This change does not require any additional arguments to be added to dump() methods, at least those that currently do not require any args. Co-authored-by: tnowicki <[email protected]>

Fix R_386_32 and other relocations by correcting Addend computations.

…112143) This patch adds operand bundle support for `llvm.intr.assume`. This patch actually contains two parts: - `llvm.intr.assume` now accepts operand bundle related attributes and operands. `llvm.intr.assume` does not place any constraints on the operand bundles, but obviously only a small set of operand bundles is meaningful. I plan to add some of those (e.g. `aligned` and `separate_storage` are what interest me but other people may be interested in other operand bundles as well) in future patches. - The definitions of `llvm.call`, `llvm.invoke`, and `llvm.call_intrinsic` actually define `op_bundle_tags` as an operation property. It turns out this approach would introduce some unnecessary burden if applied equally to the intrinsic operations because properties are not available through `Operation *` but we have to operate on `Operation *` during the import/export of intrinsics, so this PR changes it from a property to an array attribute.

The atexit needs a signext attribute on its return type. See llvm#109658.

This patch contains following changes to fix vp intrinsics tests. 1. v\*float -> v\*f32, v\*double -> v\*f64 and v\*half -> v\*f16 2. Fix the order of the vp-intrinsics.

VectorPointer doesn't read from memory or have any side effects. Mark it accordingly.

…ws (llvm#109024) This is part of the effort to support enabling plugins on Windows by adding better support for building llvm and clang as a DLL. Since Windows doesn't implicitly import and merge exported symbols across shared libraries like other platforms, we need to explicitly add an extern template declaration for each instantiation of llvm::Registry to force the registry symbols to be dllimport'ed. I've added a new visibility macro that doesn't switch between dllimport and dllexport on Windows since the existing macro would be in the wrong mode for llvm::Registry's declared in Clang. This PR also depends on Clang symbol visibility macros that will be added by llvm#108276 --------- Co-authored-by: Saleem Abdulrasool <[email protected]>

…on windows (llvm#109024)" This reverts commit 00cd1a0. This effectively reverts llvm#109024

Test case from llvm#106431.

CMP0114 was originally set to old to get rid of warnings. However, this behavior is now set to new by default with the minimum CMake version that LLVM requires so does not produce any warnings, and setting it explicitly to old does produce a warning in newer CMake versions. Due to these reasons, remove this check for now. Splitting off from removing the CMP0116 check just in case something breaks. Partially fixes llvm#83727.

llvm#112143)" This reverts commit d8fadad. The commit breaks the following CI builds: - ppc64le-mlir-rhel-clang: https://lab.llvm.org/buildbot/#/builders/129/builds/7685 - ppc64le-flang-rhel-clang: https://lab.llvm.org/buildbot/#/builders/157/builds/10338

Use getAllocTypeSize to compute the offset to the start of interleave groups instead of getScalarSizeInBits, which may return 0 for pointers. This is in line with the analysis building the interleave groups and fixes a mis-compile reported for llvm#106431.

Extra tests for llvm#92177, split off the PR.

I am removing the recently added integration test for various Arith Ops. These operations and their lowerings are effectively already verified by the Arith-to-LLVM conversion tests in: * "mlir/test/Conversion/ArithToLLVM/arith-to-llvm.mlir" I've noticed that a few variants of `arith.cmpi` were missing in that file - those are added here as well. This is a follow-up for this discussion: * llvm#92272 See also the recent update to our guidelines on e2e tests in MLIR: * llvm/mlir-www#203

…#112353) Combine is needed to clear redundant ANDs with 1 that will be created by reg-bank-select to clean-up high bits in register. Fix replaceRegWith from CombinerHelper: If copy had to be inserted, first create copy then delete MI. If MI is deleted first insert point is not valid.

…eMetrics pass(NFC). (llvm#108506)

…ger. (llvm#108507)

…llvm#93421)" This reverts commit 23c64be.

A miscompilation issue has been addressed with improved checking. Fixes: llvm#97758.

…vm#112378) As discussed in https://discourse.llvm.org/t/clang-cl-adding-std-c-23preview/82553

…rminated checker (llvm#112019) CStringChecker has a sub-checker alpha.unix.cstring.NotNullTerminated which checks for invalid objects passed to string functions. The checker and its name are not exact and more functions could be checked, this change only adds some tests and improves documentation.

…lvm#112489) In llvm@5dbfca3 we assume that RHS is poison implies LHS is also poison. It doesn't hold after introducing samesign flag. This patch drops the `samesign` flag on RHS if the original expression is a logical and/or. Closes llvm#112467.

…112380) Instead of the pragmas, which are less familiar to people. This is a follow-up of a discussion from llvm#111992.

…ll(). (llvm#112499)

`RecordRecord::classofKind` and `TagRecord::classofKind` didn't correctly capture `RK_CXXClass` and derived variants, e.g. `RK_ClassTemplate`. This manifested as anonymous C++ tag types not being correctly detected when they need to be merged with another record.

…m#109981) Fixes llvm#94620

…110300) Remove redundant copy parameter in method Fixes llvm#94233

Reduce redundant copy parameter in lambda Fixes llvm#95642

…tible] functions (llvm#112213) The single-element vector variants of FCVTZS, FCVTZU, UCVTF, and SCVTF are only supported in streaming[-compatible] functions with `+sme2p2`. Reference: - https://developer.arm.com/documentation/ddi0602/2024-09/SIMD-FP-Instructions/FCVTZS--vector--integer---Floating-point-convert-to-signed-integer--rounding-toward-zero--vector-- - https://developer.arm.com/documentation/ddi0602/2024-09/SIMD-FP-Instructions/UCVTF--vector--integer---Unsigned-integer-convert-to-floating-point--vector-- - https://developer.arm.com/documentation/ddi0602/2024-09/SIMD-FP-Instructions/SCVTF--vector--integer---Signed-integer-convert-to-floating-point--vector-- Codegen will be improved in follow up patches.

Fix gcc warning: clang/lib/Sema/SemaOpenACC.cpp:2208:5: warning: this statement may fall through [-Wimplicit-fallthrough=]
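For readers unfamiliar with this warning, the sketch below shows one common way to silence `-Wimplicit-fallthrough` when the fall-through is intentional. The function `countSteps` is hypothetical, and the commit message does not say whether the actual fix added an annotation or a `break`.

```cpp
#include <cassert>

// Hypothetical example: stage 0 deliberately runs the work of stage 1 too.
int countSteps(int Stage) {
  int Steps = 0;
  switch (Stage) {
  case 0:
    ++Steps;
    [[fallthrough]]; // C++17 attribute: marks the fall-through as intended
  case 1:
    ++Steps;
    break;
  default:
    break;
  }
  return Steps;
}
```

If the fall-through is not intended, adding a `break` at the end of the case is the appropriate fix instead.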

…s" (llvm#112506) Reverts llvm#112316 Bots are failing.

A fix-it patch for dbfca24 llvm#110228. No need for a container. This allows 8 flags for a register. The virtual register flags vector had a memory leak because the vector's memory was never freed: the `BumpPtrAllocator` handles the deallocation, and it does not call the `std::vector<uint8_t> Flags` destructor.
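The leak mechanism can be reproduced with a toy arena, sketched below under stated assumptions. `ToyArena`, `CountsDestruction`, `InfoWithOwner`, `InfoWithByte`, and the helper functions are all hypothetical names; `ToyArena` only mimics the relevant behavior of a bump allocator (bulk release without running destructors) and is not LLVM's `BumpPtrAllocator`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>

// Toy arena: hands out raw storage and releases it in bulk, never running
// destructors of the objects that were placed in it.
class ToyArena {
  alignas(std::max_align_t) unsigned char Buffer[1024];
  std::size_t Offset = 0;

public:
  void *allocate(std::size_t Size, std::size_t Align) {
    std::size_t Start = (Offset + Align - 1) / Align * Align;
    assert(Start + Size <= sizeof(Buffer) && "toy arena exhausted");
    Offset = Start + Size;
    return Buffer + Start;
  }
};

// Stand-in for a member that owns a resource (like a std::vector): it counts
// how many times its destructor actually runs.
struct CountsDestruction {
  static inline int Destroyed = 0;
  ~CountsDestruction() { ++Destroyed; }
};

struct InfoWithOwner { CountsDestruction Flags; };  // needs its destructor
struct InfoWithByte { std::uint8_t Flags = 0; };    // room for 8 flag bits

static_assert(std::is_trivially_destructible_v<InfoWithByte>,
              "safe to place in an arena that skips destructors");

// Returns how many destructors ran for an object placed in the arena.
int destructorsRunByArena() {
  CountsDestruction::Destroyed = 0;
  {
    ToyArena Arena;
    new (Arena.allocate(sizeof(InfoWithOwner), alignof(InfoWithOwner)))
        InfoWithOwner();
  } // the arena is gone, but ~InfoWithOwner was never called
  return CountsDestruction::Destroyed;
}

void setFlag(InfoWithByte &I, unsigned Bit) {
  I.Flags = static_cast<std::uint8_t>(I.Flags | (1u << Bit));
}

bool hasFlag(const InfoWithByte &I, unsigned Bit) {
  return ((I.Flags >> Bit) & 1u) != 0;
}
```

Storing the flags in a plain `uint8_t` sidesteps the problem because a trivially destructible member has nothing to release, at the cost of capping the number of flags at eight.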

This patch enables lowering to MLIR of the reduction clause of `simd` constructs. Lowering from MLIR to LLVM IR remains unimplemented, so that stage will now emit errors rather than silently ignoring the clause, as is currently done. On composite `do simd` constructs, this lowering error will not be triggered, as the `omp.simd` operation in that case is currently ignored. The MLIR representation, however, will now contain `reduction` information.

…-shot-bufferize (llvm#112505)

Libunwind manages the register context, including the program counter, which is effectively used as the return address. To increase the robustness of libunwind, let's protect the stored address with PAC. Since there is no unwind info for this, let's use the A key and the base address of the context/registers as the modifier. __libunwind_Registers_arm64_jumpto can jump anywhere the given buffer's PC points to. After this patch it needs a signed PC, so the context is harder to craft outside of libunwind. The register value is internal to libunwind and the change is not visible in the APIs.

Commits on Oct 14, 2024

Commits on Oct 15, 2024

Commits on Oct 16, 2024

Commits on Oct 22, 2024

Commits on Oct 24, 2024