[5.0] remove EOS VM OC's erroneous GS register is not as expected message #2406

Merged
merged 1 commit on Oct 1, 2024

Conversation

spoonincode
Member

For background and history see #1216 (comment); I'm not going to repeat that research here. The TLDR is that since some CPUs ignore setting the GS register to 0x00, expecting the register to be 0x00 (and printing this warning otherwise) seems dubious: it sounds as if the kernel does not and cannot guarantee that GS is 0x00 at the start of a new process.

That said, I can't reproduce any case where GS is not cleared to 0x00 at the start of a new process. I even dug out a Zen 1 CPU, confirmed /proc/cpuinfo shows the null_seg bug, and still GS is reset to 0x00 in a new process after the parent process sets it to some value. I'm not sure what nuance I'm misunderstanding about this 'bug'; maybe some x86 guru can chime in.
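For reference, the sort of check I mean looks roughly like this (a minimal sketch, not the exact program I used): set GS in the parent, exec a fresh process image, and read GS back there. The freshly exec'd process reports 0x00 everywhere I've tried.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <unistd.h>
#include <assert.h>
#include <asm/prctl.h>
#include <sys/syscall.h>

int main(int argc, char** argv) {
	uint64_t gs;
	assert(syscall(SYS_arch_prctl, ARCH_GET_GS, &gs) == 0);
	printf("GS in %s: 0x%016" PRIx64 "\n", argc > 1 ? "fresh process" : "parent", gs);
	if(argc > 1)
		return 0;

	// set GS base to some arbitrary non-zero address, then exec a brand new process image
	assert(syscall(SYS_arch_prctl, ARCH_SET_GS, &printf) == 0);
	char* args[] = {argv[0], (char*)"child", nullptr};
	execv("/proc/self/exe", args);
	return 1;
}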

But it doesn't matter: we're still seeing this warning regardless. What's going on?

Creating a new thread on Linux has the defined behavior of copying the current registers, including the segment registers. You can see GS being copied here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/process.c?h=v6.10#n182
This is not new behavior; going back to, for example, 4.19 (just picking the oldest currently maintained LTS version) you can find similar code:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/process_64.c?h=v4.19#n306

We can also observe this behavior empirically:

#include <thread>
#include <stdio.h>
#include <unistd.h>
#include <asm/prctl.h>
#include <sys/syscall.h>
#include <stdint.h>
#include <inttypes.h>
#include <assert.h>

int main() {
	uint64_t gs;

	// GS base of a freshly started process: expected to be 0x00
	assert(syscall(SYS_arch_prctl, ARCH_GET_GS, &gs) == 0);
	printf("GS at start: 0x%016" PRIx64 "\n", gs);

	// set GS base to some arbitrary non-zero address (printf's address is handy), then read it back
	assert(syscall(SYS_arch_prctl, ARCH_SET_GS, &printf) == 0);
	assert(syscall(SYS_arch_prctl, ARCH_GET_GS, &gs) == 0);
	printf("GS after set: 0x%016" PRIx64 "\n", gs);

	// a newly created thread inherits the parent's (non-zero) GS base
	std::thread t([]() {
		uint64_t tgs;
		assert(syscall(SYS_arch_prctl, ARCH_GET_GS, &tgs) == 0);
		printf("GS in thread start: 0x%016" PRIx64 "\n", tgs);
	});
	t.join();

	return 0;
}

The output can look like this:

GS at start: 0x0000000000000000
GS after set: 0x000079f7eaf4abc0
GS in thread start: 0x000079f7eaf4abc0
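(To try it yourself, something along the lines of g++ -pthread gs_test.cpp -o gs_test should be enough to build it; the filename is just an example.)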

Thus, once we added read-only threads, we introduced behavior that causes this "GS register is not as expected" message to erroneously print after a replay: the main thread's Executor will have set GS to its WASM memory for the replay, and when the read-only threads are spawned after the replay, they inherit the non-zero GS register, causing the read-only threads' Executors to print this warning upon construction. That looks something like this:

warn  2024-09-19T18:50:56.777 read-0    executor.cpp:144              executor             ] x86_64 GS register is not set as expected. EOS VM OC may not run correctly on this platform
warn  2024-09-19T18:50:56.777 read-1    executor.cpp:144              executor             ] x86_64 GS register is not set as expected. EOS VM OC may not run correctly on this platform
warn  2024-09-19T18:50:56.777 read-2    executor.cpp:144              executor             ] x86_64 GS register is not set as expected. EOS VM OC may not run correctly on this platform
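For context, the warning comes from a check at Executor construction that is conceptually along these lines (a rough sketch only; the exact code and logging call in executor.cpp are my shorthand here, not a copy of the implementation):

// rough sketch only -- not the actual executor.cpp code; it just illustrates
// the shape of the constructor-time check that prints the warning above
uint64_t current_gs = 0;
if(syscall(SYS_arch_prctl, ARCH_GET_GS, &current_gs) != 0 || current_gs != 0)
	wlog("x86_64 GS register is not set as expected. EOS VM OC may not run correctly on this platform");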

Since, now that we have read-only threads, printing this scary warning is clearly wrong, I'm suggesting removing it, back to 5.0.

@ericpassmore
Contributor

Note:start
category: Chores
component: Internal
summary: Clean up unhelpful GS register error.
Note:end
