Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zig 0.11 llvm 16 #5851

Merged
merged 57 commits into from
Oct 25, 2023
Merged

Zig 0.11 llvm 16 #5851

merged 57 commits into from
Oct 25, 2023

Conversation

bhansconnect
Copy link
Contributor

This branch isn't quite ready, but I want to see what we get out of CI. Currently, I think that only a few gen_dev (I think a lambdaset issue leading to invalid frees) and gen_wasm (not sure, but probably linking related) tests are failing. Hopefully we can patch those up and get this submitted soon.

Also, no idea the state of windows, but hopeful it isn't too bad.

@bhansconnect
Copy link
Contributor Author

It looks like for nix build we need to update the rust version. That's definitely a bit inconvenient.

@bhansconnect
Copy link
Contributor Author

Also, quite confused why after my pr we are still failing to run zig 0.11.0. It is still trying to build zig in ci with 0.9.1

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 25, 2023

Non-nix workflows on self-hosted servers have zig 0.9.1 installed, I'll install zig 0.11.0 on there and try to use our ROC_ZIG env var.

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 25, 2023

For the nix issues I'm going to upgrade to rust 1.72.0 but on main branch first, so we know that's not causing any issues here.

@bhansconnect
Copy link
Contributor Author

For the nix issues I'm going to upgrade to rust 1.72.0 but on main branch first, so we know that's not causing any issues here.

🙏 Thanks so much. Definitely will be helpful to keep that complexity out of this PR.

@bhansconnect
Copy link
Contributor Author

Awesome, so it looks like nix with rust 1.71.0 is new enough. No need to update to 1.72.0 for this branch!!!

@ostcar
Copy link
Contributor

ostcar commented Sep 27, 2023

Zig changed the way to build wasm-modules. Before 0.11 the wasm-module exported all exported symbols. Starting from 0.11, you have to do it explicitly with --export=[name] or use the flag -rdynamic to export all exported symbols: https://ziglang.org/download/0.11.0/release-notes.html#WebAssembly

Maybe this is relevant for this PR? If not, then sorry for the noise.

(I am exited about this PR. Thank you for your work ❤️).

@bhansconnect
Copy link
Contributor Author

bhansconnect commented Sep 27, 2023

@folkertdev and @brian-carroll, not sure if either of you will have time or energy to look at the issues on this branch, but I think that you two are probably the most knowledgable in fixing the issues we are hitting. So any support would be greatly appreciated. Landing this will be awesome.

Also, I am a bit confused why more wasm tests are failing now. Apparently updating rust or some other change broke more of the wasm tests. That or maybe I screw up at some point and reverted a wasm based change, not fully sure. (could be something else that was pulled in when rebasing on main)

EDIT: Also @Anton-4, thanks for pulling out the specific issues.

@brian-carroll
Copy link
Contributor

@bhansconnect I'll pull the branch and have a quick look! I don't have a lot of free time this week unfortunately but might have a bit more from next week or the week after.

@folkertdev
Copy link
Contributor

I fixed the issues with the llvm wasm backend. Also #5859 seems to be fixed now (at least those tests all succeed for me locally)

@bhansconnect
Copy link
Contributor Author

My current plan is to try and fix any dev wasm related issues but then submit these changes even if it leaves windows abi stuff slightly more broken. I want to make larger changes to how we call bitcode functions in a separate PR. That will resolve the windows abi issues. I don't want to block this on that work.

@folkertdev folkertdev force-pushed the zig-11-llvm-16 branch 2 times, most recently from 088a83b to 0cb2aea Compare September 28, 2023 19:26
@bhansconnect
Copy link
Contributor Author

So did some more local testing, most of the wasm failures where caused by the update from rust 1.70.0 to 1.71.1.

With rust 1.70.0, I get 3 failures. With 1.71.1, I get 24.

er...kinda. It seems that no mater the rust version, all of these failures happen on my x86 linux machine, but my m1 mac follows the version related failure change that I mentioned above. It is super strange to me.

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 29, 2023

The CI failure is a segfault but it only happens outside of nix, so I'm wondering if it's some kind of dependency issue. I'm investigating.

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 29, 2023

I have one more thing I'm going to try out tomorrow and if that does not work I'll do a clean OS install on that CI machine.

@bhansconnect
Copy link
Contributor Author

I think the 2 cli run tests that are failing in CI / test zig, rust, wasm... seem to be failing consitently. I am not sure why. I can't repro it locally on either my x86 or m1 mac machine. @Anton-4 any ideas?

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 30, 2023

I failed to reproduce as well, the failures do indeed happen every time. I'm going to compare binaries on CI vs a very similar environment and see if I can learn anything from that.

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 30, 2023

This does look like a real bug, not just mixed dependencies.
This is the valgrind output:

valgrind --leak-check=full --track-origins=yes ./examples/platform-switching/rocLovesZig
==1811229== Memcheck, a memory error detector
==1811229== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1811229== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1811229== Command: ./examples/platform-switching/rocLovesZig
==1811229== 
==1811229== 
==1811229== Process terminating with default action of signal 11 (SIGSEGV)
==1811229==  Bad permissions for mapped region at address 0x239060
==1811229==    at 0x239060: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x1CF0A7: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x19A455: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x198472: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x1EE98A: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x1DEAD0: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x1CB918: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x1995AC: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x48AE51F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==1811229==    by 0x23901F: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x1999C0: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229==    by 0x197EF9: ??? (in /home/small-ci-user/gitrepos/roc2/roc/examples/platform-switching/rocLovesZig)
==1811229== 
==1811229== HEAP SUMMARY:
==1811229==     in use at exit: 0 bytes in 0 blocks
==1811229==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==1811229== 
==1811229== All heap blocks were freed -- no leaks are possible
==1811229== 
==1811229== For lists of detected and suppressed errors, rerun with: -s
==1811229== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

An interesting failure of building with --opt-size:

RUST_BACKTRACE=1 ./target/release/roc build ./examples/platform-switching/rocLovesZig.roc --opt-size
🔨 Rebuilding platform...
thread '<unnamed>' panicked at 'There must be a symtab section in the executable', crates/linker/src/elf.rs:1044:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/panicking.rs:67:14
   2: roc_linker::elf::preprocess_elf
   3: roc_linker::preprocess_host
   4: roc_build::program::build_and_preprocess_host_lowlevel
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'main' panicked at 'Failed to (re)build platform.: Any { .. }', crates/compiler/build/src/program.rs:981:46
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/panicking.rs:67:14
   2: core::result::unwrap_failed
             at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/result.rs:1651:5
   3: roc_build::program::build_loaded_file
   4: roc_build::program::build_file
   5: roc_cli::build
   6: roc::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

I compared strings rocLovesZig for the CI executable and one from a very similar VM. The CI one was targeting haswell target.x86.cpu.haswell, the VM one was targeting x86_64 target.x86.cpu.x86_64. I'm not sure if this target cpu is being set by zig, rust or us.

I've created an archive with the executables, asm, and strings output:
comparison.tar.gz

@Anton-4
Copy link
Collaborator

Anton-4 commented Sep 30, 2023

I'm going to take a stab at upgrading the debugir repo to llvm 16.

@Anton-4
Copy link
Collaborator

Anton-4 commented Oct 24, 2023

Nightlies are good 🎉
The PR should now be ready for final review.

@bhansconnect bhansconnect merged commit bacdfa3 into main Oct 25, 2023
14 checks passed
@bhansconnect bhansconnect deleted the zig-11-llvm-16 branch October 25, 2023 18:21
@rtfeldman
Copy link
Contributor

woooooooooo!!!! 😍 😍 😍 😍 😍

amazing work landing this!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants