Why is .symtab so huge on riscv? #1036
Comments
This is due, at least in part, to the fact that RISC-V doesn’t compute branch targets until link time, to facilitate aggressive linker relaxation. By contrast, x86 and ARMv8 assemblers compute branch targets at assembly time, so they don’t need to carry around some of those symbols. Ordinarily, this only bloats the intermediate build artifacts (and static libraries), not linked executables.
That's very interesting, but even for trivial linked binaries the .symtab differs greatly in size and has very weird contents.
Not sure how we go from 8 to 61 entries from intermediate build artifacts to a linked executable.
Why does .symtab start with Scrt1.o on x86_64, but has like the entire contents of all the sections that come before .symtab, at the start of .symtab prior to its Scrt1.o on riscv64? What purpose do they have? Going from 4->34 and 8->60 entries looks odd still. At the moment I am experimenting with using |
Yeah, my earlier response only addresses the various I did notice that |
This issue was brought up briefly on the RISC-V GNU Toolchain Biweekly Sync-up. It was pointed out that there probably is room to improve toolchain to emit less things under |
RISC-V is the only target that gets relaxation correct. I have a meta bug in binutils that describes relaxation-related bugs with all other targets. This unfortunately results in a lot of extra symbols and relocations for RISC-V. Most of the other targets that use relaxation aren't Linux-capable, though; they are almost all embedded-only targets. Some of the issues are obscure, but they need to work for a Linux toolchain, so RISC-V has to get them right. There is a known problem with GNU as: we still emit extra symbols and relocations when -mno-relax is used. Kernel modules are compiled with -mno-relax, so fixing this would help reduce the symtab size.
Only x86 defines TARGET_KEEP_UNUSED_SECTION_SYMBOLS to false so far, because doing so is known to break valid use cases.
I didn't find an existing GCC Bugzilla report for this, so I have now opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107251
Hi @xnox - thanks a lot for logging the GCC bug upstream.
Those .L0 symbols are possibly used to link each %pcrel_lo to the correct %pcrel_hi. If so, then -mno-relax cannot remove them; otherwise relocation will fail.
What should I do to check that assertion? So far I have not seen issues with stripping those from kernel modules, but I have not tried the same with userspace binaries yet.
I've noticed that another source of
f:
    .cfi_startproc
    nop
    .cfi_endproc
Produces the following relocations:
And this adds 2
The same code on x86 and aarch64 produces a similar relocation but uses section-relative relocation. For example for aarch64:
Is this something that could be done on riscv as well?
Due to relaxation you do not know the final offset from the section anchor.
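To illustrate the point about section-relative offsets, here is a toy model (not how any real linker is implemented). It assumes a far call is emitted as an 8-byte auipc+jalr pair that relaxation may shrink to a 4-byte jal; the function name `label_offsets` and the instruction layout are made up for this sketch.

```python
# Toy model: each item is (label or None, size in bytes, relaxable?).
# Assumption for illustration: a relaxable call is 8 bytes (auipc + jalr)
# before relaxation and 4 bytes (jal) after it.

def label_offsets(items, relax):
    """Return {label: offset from the start of the section}."""
    offsets = {}
    pc = 0
    for label, size, relaxable in items:
        if label is not None:
            offsets[label] = pc
        # A relaxable call sequence shrinks to 4 bytes once relaxed.
        pc += 4 if (relax and relaxable) else size
    return offsets

section = [
    ("f", 8, True),    # f: call foo   (auipc + jalr, relaxable)
    (None, 4, False),  #    nop
    ("g", 4, False),   # g: next function starts here
]

before = label_offsets(section, relax=False)
after = label_offsets(section, relax=True)
print(before["g"], after["g"])  # prints: 12 8
```

The offset of `g` from the section start moves from 12 to 8 once the call is relaxed, so a "section + 12" reference written at assembly time would be stale after linking. A reference through a symbol placed at `g` stays correct because the linker updates the symbol along with the code.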
I am very naive and don't understand things. When I strip locals from my Linux kernel modules, they still load and work. What functionality am I missing, or how come they still work? The same holds in userspace. Should a Linux distribution default to stripping locals from the final linked binaries, libraries, and kernel modules if the toolchain only needs them in the interim and not at runtime? I currently strip locals in the riscv kernels in Ubuntu itself.
Most binaries compiled on riscv have a .symtab that is 2x-3x larger than on any other architecture. Is that normal and intentional, or is something wrong with my toolchain? For example, you can see this even in Scrt1.o, where on riscv64 there are almost 3 times as many symbols defined.
This is further exaggerated for trivial hello-world binaries, and appears to become exponentially worse for linux kernel code and modules.
As an example see btrfs.ko module on x86_64 and riscv64:
I can understand that there are architectural differences and optimisations, and more compact code on one arch versus the other, but it is unexpected to me that the unstripped btrfs.ko, with a similar kernel configuration, is 2x as big on riscv64, that the stripped module is 6x as big, and that locals take up 4x of the binary, and even then overall the binary is still 1.5x as big.
Is all of the above expected and intentional, i.e. that riscv64 toolchains emit a lot of .L* symbols in the symtab? Or is my toolchain somehow misconfigured?
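One way to put a number on the `.L*` share is to count rows in the text printed by `readelf -sW`. The helper below is a minimal sketch; the name `count_symbols` is hypothetical, and it assumes the usual eight-column readelf layout (Num, Value, Size, Type, Bind, Vis, Ndx, Name).

```python
def count_symbols(readelf_output):
    """Return (total symbol rows, rows that are LOCAL and named .L*)."""
    total = 0
    dot_l = 0
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows start with an index such as "12:"; headers do not.
        if len(fields) < 7:
            continue
        if not fields[0].endswith(":") or not fields[0][:-1].isdigit():
            continue
        total += 1
        # The name column is empty for some rows (e.g. the null symbol).
        name = fields[7] if len(fields) > 7 else ""
        if fields[4] == "LOCAL" and name.startswith(".L"):
            dot_l += 1
    return total, dot_l
```

Feeding it the saved output of `readelf -sW` for the same object built on two architectures gives a like-for-like comparison of how many entries are compiler-generated locals.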