Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows/Linux disassembly differences when using #include ../something.h? #9034

Closed
marc-hb opened this issue Apr 11, 2024 · 5 comments · Fixed by #9071
Closed

Windows/Linux disassembly differences when using #include ../something.h? #9034

marc-hb opened this issue Apr 11, 2024 · 5 comments · Fixed by #9071
Labels
bug Something isn't working as expected P3 Low-impact bugs or features

Comments

@marc-hb
Copy link
Collaborator

marc-hb commented Apr 11, 2024

Originally posted by @marc-hb in #9021 (comment)

Same, previously seen disassembly difference in https://github.com/thesofproject/sof/actions/runs/8630426621?pr=9021 already observed in #8863

--- win/build-sof-staging/sof-info/imx8x/zephyr.lst	2024-04-10 19:30:27
+++ lin/build-sof-staging/sof-info/imx8x/zephyr.lst	2024-04-04 09:12:34
@@ -81134,6 +81134,12 @@
 92433b10:	833441                      	l32r	a4, 924147e0 <_stext+0x1598> (92413034 <s_lpuartClock>)
 92433b13:	a03340                      	addx4	a3, a3, a4
 92433b16:	03a8                          	l32i.n	a10, a3, 0
+ * @param name  Which clock to enable, see \ref clock_ip_name_t.
+ * @return true for success, false for failure.
+ */
+static inline bool CLOCK_EnableClock(clock_ip_name_t name)
+{
+    return CLOCK_EnableClockExt(name, 0);
 92433b18:	31e9                          	s32i.n	a14, a1, 12
 92433b1a:	00a0b2                      	movi	a11, 0
 92433b1d:	ffcfa5                      	call8	92433818 <CLOCK_EnableClockExt>

from your experience, do you happen to know what would cause differences between windows and linux builds?

I downloaded the files, did some inspection, jogged my memory and searched archives and I think I may have found something... Do you have any .. in some #include paths by any chance? We could have a situation similar to these:

So the inline C code seems correct in both cases, but for some reason the same zephyr-sdk/xtensa-...-objdump -S is "not as good" when running on Windows and can't find the source of the inline function CLOCK_EnableClock() when disassembling LPUART_Init()... even though both Linux and Windows find that same source earlier in the binary when disassembling mcux_ccm_on()!

There's one notable difference here: mcux_ccm_on() is located in Zephyr whereas LPUART_Init() is located in modules/hal/nxp! Which is why I'm suspecting #include paths?

I think the simple reason it is appearing now is because you're now enabling some lpuart in this commit.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Apr 11, 2024

Results here: https://github.com/thesofproject/sof/actions/runs/8645734717/job/23703888439?pr=9029. @marc-hb would you mind pasting the diffs for that as well? Can't seem to be able to check it out in the CI job :(.

You have to click on the "Summary" / home button in the top-left to access the build artefacts. Don't ask me why.

Then don't forget to run "dos2unix" before comparing text files.

Also, is there any other info in the CI job that may help with debugging this?

I don't think there is a lot CI can do at this point.

Based on this example and the previous ones I referenced above, I have a strong suspicion that some debug sections are "not as good" on Windows, likely because of some funky #include "../something.h". So the next step is to compile on both Linux and Windows and manually compare the debug sections for LPUART_Init() to confirm that suspicion.

We can't automate the comparison of debug sections because they're almost always full of absolute paths.

IF I'm guessing this correctly, then we have 2 approaches to solve this problem:

  1. Either we decide that a slightly inferior debug inferior on Windows is OK. Who uses Windows anyway? :-) Debug sections don't actually matter for reproducible builds; now they seem to just "get in the way". Then the best course is probably a new Zephyr CONFIG_OUTPUT_DISASSEMBLE_WITH_SOURCE that defaults to "yes" but that we can turn off in this test. That new Kconfig would just turn off disassembly_flag_inline_source and problem solved! It should be a small CMake change; there is already CONFIG_OUTPUT_DISASSEMBLE_ALL from @tejlmand that does something somewhat similar.

  2. Or you care about the debug experience on Windows and you think reproducible builds is a surprising and indirect but effective way to keep the debug experience on par with the Linux one. Then you need to find why debug sections are not as good on Windows. Hopefully just some relative #include to convert to absolute?

But again: the first thing is to prove my educated guess about debug sections. EDIT: I would start with comparing xtensa-...-objdump -S LPUART_Init.o between Linux and Windows.

@marc-hb marc-hb changed the title Disassembly differences when using #include ../something.h? Windows/Linux disassembly differences when using #include ../something.h? Apr 11, 2024
@marc-hb marc-hb added the bug Something isn't working as expected label Apr 11, 2024
@LaurentiuM1234
Copy link
Contributor

Strangely enough, dropping all of the static inline stuff from CLOCK_EnableClock()'s definition seems to fix this issue?

i.e: Inside fsl_clock.h we change

static inline bool CLOCK_EnableClock(clock_ip_name_t name)
{
    return CLOCK_EnableClockExt(name, 0);
}

to

bool CLOCK_EnableClock(clock_ip_name_t name);

and move the definition of CLOCK_EnableClock() to fsl_clock.c.

(note: fsl_clock.h is included by fsl_common_dsp.h, which is included by fsl_common.h, which, in turn is included by fsl_lpuart.h. This header is used by the LPUART driver)

Results in https://github.com/thesofproject/sof/actions/runs/8689213734/job/23827370722?pr=9033 (ignore 8ULP which doesn't have this change).
HAL change in https://github.com/zephyrproject-rtos/hal_nxp/pull/374/files.

So far I wasn't able to find any ".." include paths in the sources/headers used for SOF.

IF I'm guessing this correctly, then we have 2 approaches to solve this problem:

  1. Either we decide that a slightly inferior debug inferior on Windows is OK. Who uses Windows anyway? :-) Debug sections don't actually matter for reproducible builds; now they seem to just "get in the way". Then the best course is probably a new Zephyr CONFIG_OUTPUT_DISASSEMBLE_WITH_SOURCE that defaults to "yes" but that we can turn off in this test. That new Kconfig would just turn off disassembly_flag_inline_source and problem solved! It should be a small CMake change; there is already CONFIG_OUTPUT_DISASSEMBLE_ALL from @tejlmand that does something somewhat similar.
  2. Or you care about the debug experience on Windows and you think reproducible builds is a surprising and indirect but effective way to keep the debug experience on par with the Linux one. Then you need to find why debug sections are not as good on Windows. Hopefully just some relative #include to convert to absolute?

The main issue that I have with 2) is that it's time consuming and I'm not sure it's worth the effort (like you said, who uses Windows?). These issues are (to me) somewhat annoying to debug (I don't have a Windows environment so I have to rely on the CI to test stuff out) and very unintuitive (I can't really tell you why the aforementioned fix works).

@marc-hb
Copy link
Collaborator Author

marc-hb commented Apr 15, 2024

It could also be a bug somewhere in the toolchain that gets triggered only on Windows somehow. So a 3rd option would be to try to update the SDK? As @dbaluta did in earlier and small commit 45cbf04. It's a minor release so it's unlikely to hurt anyway.

Otherwise can you help with option 1. CONFIG_OUTPUT_DISASSEMBLE_WITH_SOURCE? It's CMake and Kconfig which very few enjoy but it looks pretty trivial.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Apr 15, 2024

CONFIG_OUTPUT_DISASSEMBLE_WITH_SOURCE

Small tangent: after optimization it's extremely hard to connect machine code with source code, anyone who has ever tried to debug optimized code knows this.
https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html

Disassembly runs into the same issue. So maybe using -S on optimized code is too complex to be reliable? Now we're seeing a Windows-only bug, but how many "portable" -S bugs have we ignored so far? Maybe a lot.

@LaurentiuM1234
Copy link
Contributor

LaurentiuM1234 commented Apr 16, 2024

Otherwise can you help with option 1. CONFIG_OUTPUT_DISASSEMBLE_WITH_SOURCE? It's CMake and Kconfig which very few enjoy but it looks pretty trivial.

Yep, created zephyrproject-rtos/zephyr#71535. Results in: #9051 (ignore zmain failures, should be caused by the fact that this change is not upstream yet). 8ULP failures and the 8QXP failures introduced by #9021 seem fixed.

LaurentiuM1234 added a commit to LaurentiuM1234/zephyr that referenced this issue Apr 17, 2024
By default, the generated disassembly file (i.e: the .lst
file) also contains inline source code. This has the
potential to become problematic when attempting to compare
the generated .lst files accross platforms since they may
differ for various reasons. As such, add the option to
control whether the disassembly file should contain inline
source code or not.

The need for this patch was sparked by the differences
observed in the generated .lst file for Linux and Windows
in one of SOF's CI jobs (details in thesofproject/sof/issues/9034),
which were caused by the addition of the inline source code.
With this, we can keep testing for reproducible builds while
not having to deal with differences in the .lst files caused
by things such as having ".." include paths.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
LaurentiuM1234 added a commit to LaurentiuM1234/zephyr that referenced this issue Apr 19, 2024
By default, the generated disassembly file (i.e: the .lst
file) also contains inline source code. This has the
potential to become problematic when attempting to compare
the generated .lst files accross platforms since they may
differ for various reasons. As such, add the option to
control whether the disassembly file should contain inline
source code or not.

The need for this patch was sparked by the differences
observed in the generated .lst file for Linux and Windows
in one of SOF's CI jobs (details in thesofproject/sof/issues/9034),
which were caused by the addition of the inline source code.
With this, we can keep testing for reproducible builds while
not having to deal with differences in the .lst files caused
by things such as having ".." include paths.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
carlescufi pushed a commit to zephyrproject-rtos/zephyr that referenced this issue Apr 22, 2024
By default, the generated disassembly file (i.e: the .lst
file) also contains inline source code. This has the
potential to become problematic when attempting to compare
the generated .lst files accross platforms since they may
differ for various reasons. As such, add the option to
control whether the disassembly file should contain inline
source code or not.

The need for this patch was sparked by the differences
observed in the generated .lst file for Linux and Windows
in one of SOF's CI jobs (details in thesofproject/sof/issues/9034),
which were caused by the addition of the inline source code.
With this, we can keep testing for reproducible builds while
not having to deal with differences in the .lst files caused
by things such as having ".." include paths.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
LaurentiuM1234 added a commit to LaurentiuM1234/sof that referenced this issue Apr 22, 2024
By default, the disassembly file (.lst) generated
during the build process when enabling `CONFIG_OUTPUT_DISASSEMBLY`
will also contain inline source code. This is not ideal
for reproducible builds that compare the .lst files
obtained on different platforms (i.e: Windows and Linux)
because of the differences in the inline source code that
may appear.

One of the identified causes for such differences were
the ".." include paths, which resulted in the Windows
.lst omitting some bits of the inline source code that
were present in the Linux .lst file.

Because these issues are hard to debug and unintuitive,
the solution is to disable the inline source code. This way,
the CI can keep testing for reproductibility using just the
assembly and machine code from the .lst file, which are
more important than the inline source code.

This fixes thesofproject/issues/9034.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
LaurentiuM1234 added a commit to LaurentiuM1234/sof that referenced this issue Apr 22, 2024
By default, the disassembly file (.lst) generated
during the build process when enabling `CONFIG_OUTPUT_DISASSEMBLY`
will also contain inline source code. This is not ideal
for reproducible builds that compare the .lst files
obtained on different platforms (i.e: Windows and Linux)
because of the differences in the inline source code that
may appear.

One of the identified causes for such differences were
the ".." include paths, which resulted in the Windows
.lst omitting some bits of the inline source code that
were present in the Linux .lst file.

Because these issues are hard to debug and unintuitive,
the solution is to disable the inline source code. This way,
the CI can keep testing for reproductibility using just the
assembly and machine code from the .lst file, which are
more important than the inline source code.

This fixes thesofproject/issues/9034.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
LaurentiuM1234 added a commit to LaurentiuM1234/sof that referenced this issue Apr 22, 2024
By default, the disassembly file (.lst) generated
during the build process when enabling `CONFIG_OUTPUT_DISASSEMBLY`
will also contain inline source code. This is not ideal
for reproducible builds that compare the .lst files
obtained on different platforms (i.e: Windows and Linux)
because of the differences in the inline source code that
may appear.

One of the identified causes for such differences were
the ".." include paths, which resulted in the Windows
.lst omitting some bits of the inline source code that
were present in the Linux .lst file.

Because these issues are hard to debug and unintuitive,
the solution is to disable the inline source code. This way,
the CI can keep testing for reproductibility using just the
assembly and machine code from the .lst file, which are
more important than the inline source code.

This fixes thesofproject/issues/9034.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
kv2019i pushed a commit that referenced this issue Apr 23, 2024
By default, the disassembly file (.lst) generated
during the build process when enabling `CONFIG_OUTPUT_DISASSEMBLY`
will also contain inline source code. This is not ideal
for reproducible builds that compare the .lst files
obtained on different platforms (i.e: Windows and Linux)
because of the differences in the inline source code that
may appear.

One of the identified causes for such differences were
the ".." include paths, which resulted in the Windows
.lst omitting some bits of the inline source code that
were present in the Linux .lst file.

Because these issues are hard to debug and unintuitive,
the solution is to disable the inline source code. This way,
the CI can keep testing for reproductibility using just the
assembly and machine code from the .lst file, which are
more important than the inline source code.

This fixes /issues/9034.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
coreboot-org-bot pushed a commit to coreboot/zephyr-cros that referenced this issue Apr 23, 2024
By default, the generated disassembly file (i.e: the .lst
file) also contains inline source code. This has the
potential to become problematic when attempting to compare
the generated .lst files accross platforms since they may
differ for various reasons. As such, add the option to
control whether the disassembly file should contain inline
source code or not.

The need for this patch was sparked by the differences
observed in the generated .lst file for Linux and Windows
in one of SOF's CI jobs (details in thesofproject/sof/issues/9034),
which were caused by the addition of the inline source code.
With this, we can keep testing for reproducible builds while
not having to deal with differences in the .lst files caused
by things such as having ".." include paths.

(cherry picked from commit fb7e937)

Original-Signed-off-by: Laurentiu Mihalcea <[email protected]>
GitOrigin-RevId: fb7e937
Change-Id: I8fd219cb90ae3553bf81465b18835b85adf5eb0b
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/zephyr/+/5474826
Tested-by: ChromeOS Prod (Robot) <[email protected]>
Tested-by: Dawid Niedźwiecki <[email protected]>
Reviewed-by: Dawid Niedźwiecki <[email protected]>
Commit-Queue: Dawid Niedźwiecki <[email protected]>
jman168 pushed a commit to Gator-Motorsports/zephyr that referenced this issue Apr 25, 2024
By default, the generated disassembly file (i.e: the .lst
file) also contains inline source code. This has the
potential to become problematic when attempting to compare
the generated .lst files accross platforms since they may
differ for various reasons. As such, add the option to
control whether the disassembly file should contain inline
source code or not.

The need for this patch was sparked by the differences
observed in the generated .lst file for Linux and Windows
in one of SOF's CI jobs (details in thesofproject/sof/issues/9034),
which were caused by the addition of the inline source code.
With this, we can keep testing for reproducible builds while
not having to deal with differences in the .lst files caused
by things such as having ".." include paths.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
eddy1021 pushed a commit to eddy1021/sof that referenced this issue Jul 15, 2024
By default, the disassembly file (.lst) generated
during the build process when enabling `CONFIG_OUTPUT_DISASSEMBLY`
will also contain inline source code. This is not ideal
for reproducible builds that compare the .lst files
obtained on different platforms (i.e: Windows and Linux)
because of the differences in the inline source code that
may appear.

One of the identified causes for such differences were
the ".." include paths, which resulted in the Windows
.lst omitting some bits of the inline source code that
were present in the Linux .lst file.

Because these issues are hard to debug and unintuitive,
the solution is to disable the inline source code. This way,
the CI can keep testing for reproductibility using just the
assembly and machine code from the .lst file, which are
more important than the inline source code.

This fixes thesofproject/issues/9034.

Signed-off-by: Laurentiu Mihalcea <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as expected P3 Low-impact bugs or features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants