Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update LLVM/Clang to LLVM 16 #185

Open
larryk85 opened this issue May 24, 2023 · 8 comments
Open

Update LLVM/Clang to LLVM 16 #185

larryk85 opened this issue May 24, 2023 · 8 comments
Labels
enhancement New feature or request triage

Comments

@larryk85
Copy link
Contributor

larryk85 commented May 24, 2023

Currently, CDT is using LLVM 9 for its internal toolchain. This is a modified version of LLVM, so some care needs to be made during the upgrade process to ensure that all parts that are specific to CDT are carried over and up-ported if needed.

[EDIT]: Question the need to use a modified version of LLVM. Using a vanilla version would be simpler approach. Projects like Blanc have show this approach can be successfully. We lack the experts in compilers and the resourcing to make custom compiler workable. - EP

Since the LLVM project has changed how they 'house' their project on Github, i.e. no longer distinct repositories and now use a mono-repo, this will make things a bit more difficult. The easiest way to move forward I believe is to generate a Git patch from our LLVM code against the mono-repos directories with the mono-repo being checked out as llvmorg-9.0.0 branch. From there you will be able to apply the patch to the 9.0.0 branch (hopefully no issues exist here), then merge llvmorg-16.0.0 and address any issues.

Instead of forking llvm the same as we have in the past and creating cdt- repos, we will be simply adding it to the project directly.

@larryk85 larryk85 added this to the CDT 5.0.0-rc1 milestone May 24, 2023
@larryk85 larryk85 added enhancement New feature or request and removed triage labels May 24, 2023
@jolly-fellow
Copy link
Contributor

I have compared llvmorg-9.0.0 tag from LLVM repo with main cdt-llvm repo and found these differences:

--- a/llvm/utils/release/build_llvm_package.bat
+++ b/utils/release/build_llvm_package.bat
@@ -26,8 +26,8 @@ set python64_dir=C:\Users%USERNAME%\AppData\Local\Programs\Python\Python36

set revision=%1
set branch=trunk
-set package_version=9.0.0-r%revision%
-set clang_format_vs_version=9.0.0.%revision%
+set package_version=9.0.1-r%revision%^M
+set clang_format_vs_version=9.0.1.%revision%^M
set build_dir=llvm_package_%revision%

--- a/llvm/utils/lit/lit/init.py
+++ b/utils/lit/lit/init.py
@@ -2,7 +2,7 @@

author = 'Daniel Dunbar'
email = '[email protected]'
-versioninfo = (0, 9, 0)
+versioninfo = (0, 9, 1)
version = '.'.join(str(v) for v in versioninfo) + 'dev'

all = []

--- a/llvm/utils/gn/secondary/llvm/version.gni
+++ b/utils/gn/secondary/llvm/version.gni
@@ -1,4 +1,4 @@
llvm_version_major = 9
llvm_version_minor = 0
-llvm_version_patch = 0
+llvm_version_patch = 1

Was it a try to switch to the next version?

@jolly-fellow
Copy link
Contributor

The easiest way to move forward I believe is to generate a Git patch from our LLVM code against the mono-repos directories with the mono-repo being checked out as llvmorg-9.0.0 branch. From there you will be able to apply the patch to the 9.0.0 branch (hopefully no issues exist here), then merge llvmorg-16.0.0 and address any issues.

cdt-llvm repository main branch is an LLVM repository tag llvmorg-9.0.0 with changes from our team so I don't understand why not just merge llvmorg-16.0.0 into cdt-llvm main and then move cdt-llvm repo to cdt repo as a directory if it should be there.
P.S. But I'd prefer to leave cdt-llvm repository separated because clang and lld placed in separated repositories too. Or lets use and patch whole LLVM repository where versions of LLVM, clang and lld are synchronized. Separated repositories give flexibility to work with different versions of LLVM, clang and lld if you need it as now we work with new clang but old LLVM. I just not sure if you really need it.

@jolly-fellow
Copy link
Contributor

We agreed with @larryk85 a procedure of the update in the Telegram:

@larryk85 -----------------------------------------------------------------------------------------------------
The separate repos is the way that LLVM used to have their projects. They changed to a mono repo around LLVM 10 I believe.

Yes that is the ask, to do away with the separated forked repos and just merge our changes onto the monorepo of LLVM some that we then just have one repo.

Also instead of forking the llvm repo and adding it as a submodule. What I would like to see is just adding the merged LLVM code directly to cdt. I.e. hoist and rebase into a new folder that we will just call llvm and do away with the cdt-llvm stuff.

That way we would be able to work on those parts and only have one PR. We would also be able to apply upstream changes to llvm-project directly to cdt.

@jolly-fellow -----------------------------------------------------------------------------------------------------
Where do you want to have llvm directory? The path.

@larryk85 -----------------------------------------------------------------------------------------------------
The same location as cdt-llvm. We'll just replace that with this one and we'd have to fix up some Cmake leading to it. I believe we rely on an old way of how LLVM used to build.

So if you do it and find out that it is only building llvm and not clang and lld. Then Google cmake for LLVM building and you'll see a section on the CMake variables to toggle on which subprojects to build.

@jolly-fellow -----------------------------------------------------------------------------------------------------
I'd prefer to put whole the code of LLVM monorepo to CTD. It avoids any problems with difference in building or merging because of difference in directory structure. Now we have clang and lld submodules included into cdt-llvm directory. In the monorepo they are on the same level as llvm itself and other utilities. IMHO it is better to synchronize the directory structure one time and then follow llvm-project structure instead to keep structure of cdt-llvm and have problems each time when we will need to update versions of LLVM.

git easily merge corresponding directories from lvm-project into our repositories, as I understand because these directories are roots of the corresponding repos. So only LLVM code will be merged from lvm-project into cdt-llvm repo, cmake code will be merged into cmake and lld to our lld.
So I propose to update the code into our repos step by step: llvm, then clang, then lld to the code from llvmorg-16.0.0 tag of the monorepo. We can merge and test each step separately. Then when whole the code will be updated we copy lvm-project into cdt repo, copy code from llvm, clang and lld repos to the corresponding dirs of lvm-project into cdt repo and update building system to the new dirs structure.
I believe each step will be small and controlled.
@larryk85 -----------------------------------------------------------------------------------------------------
We are saying the same thing 😊 yes continue with that.

@jolly-fellow
Copy link
Contributor

jolly-fellow commented Jun 21, 2023

llvm_9-llvm_cdt.patch.gz
Branch switch_llvm_1600 has been created in cdt repo for work with this update.
Monorepo LLVM 16.0.0 has been added to the root of cdt repo to directory llvm as a subtree as described here:
https://docs.github.com/en/get-started/using-git/about-git-subtree-merges
I believe it will allows to update LLVM in the future in one command git pull -s subtree llvmorg main

cdt-llvm repo was forked from branch release_90 of https://github.com/llvm-mirror/llvm.git repo
I have made a patch with difference between of the current state of cdt-llvm repo and the branch release_90
It contains all what EOS team added to the llvm. I have attached this patch here.
The patch is not applicable to LLVM 16.0.0 directly because of too big differences so I have added all needed differences manually based of this patch.

Now I need to do the same for clang and lld repos, and then update cmake files for integration of building of LLVM monorepo into CDT building process.

@jolly-fellow
Copy link
Contributor

@ericpassmore
Copy link
Contributor

Need to re-evaluate the approach. Can we move to vanilla LLVM 16 with custom extensions.

@ericpassmore ericpassmore modified the milestones: CDT 4.1.0-rc1, Future Milestone Aug 24, 2023
@ericpassmore ericpassmore modified the milestones: Future Milestone, CDT 5.0 Sep 9, 2023
@dimas1185 dimas1185 self-assigned this Oct 25, 2023
@dimas1185
Copy link
Contributor

here is my latest investigation results on this:

  1. Update to llvm-16 old way.
  2. Try to move all modified code out of llvm
  3. Try to move llvm out of cdt.
    This order sounds to me most reasonable and here is why:
  4. Blank is abandoned and not so documented, I didn’t manage to build it in few of days and decided to approach this other way.
  5. All 3 steps above are not interconnected which means we can do only one and release it. We can optionally take 2 and 3 step this release or later.
  6. Even if we will go vanilla llvm path, the way I proposed it looks more feasible because of we have new changes to cdt and I can’t even build blanc.

@dimas1185
Copy link
Contributor

here is latest work on this: https://github.com/AntelopeIO/cdt/tree/new_llvm
clang and llvm are migrated. lld was in the middle of applying patch manually.
your steps forward:

  1. have llvm-project checked out on llvmorg-9.0.0.
  2. have cdt checked out on main.
  3. make a patch between cdt-llvm/tools/lld and 'llvm-project/lld'.
  4. got to a new_llvm branch and apply this patch manually to llvm/llvm/lld. You'll notice that some changes are already there as I was in the middle of making this when priorities have changed. You'll need to update only: lld/wasm/MarkLive.cpp , lld/wasm/Options.td, lld/wasm/Relocations.cpp, lld/wasm/SymbolTable.h, lld/wasm/Symbols.cpp, lld/wasm/Symbols.h, lld/wasm/SyntheticSections.cpp, lld/wasm/Writer.cpp.
  5. compile, resolve all errors and run tests
  6. if that works - voila, cdt is on llvm-16.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request triage
Projects
Status: In Progress
Development

No branches or pull requests

7 participants