bus error #1
Thanks for the bug report! I see you're running a Linux VM, not a unikernel, and guessing from the name, an HVM one. Since the tool is primarily designed as a unikernel profiler, I'm not surprised it isn't working all that well. Especially when you sample userspace application stacks, I don't expect uniprof to produce much useful output because of the missing symbol resolution. Obviously, crashing isn't the best behavior though. ;-) I'll try to see how to gracefully handle that situation. You're probably right that the crash is due to going into stacks that don't have frame pointers, and consequently trying to map or read bogus memory addresses. I could try to add an option, too, that allows you to only walk kernel stacks (since those are the ones one would likely have symbol information for). Have to think of the best way to do that, though...
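To make the failure mode concrete, here is a minimal sketch of a frame-pointer walk that refuses to follow bogus addresses. It is not uniprof's code: the `walk_stack` name, the dictionary standing in for guest memory, the stack bounds, and the convention that a saved frame pointer of 0 ends the chain are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

WORD = 8  # bytes per stack slot, assuming x86-64


def walk_stack(memory: Dict[int, int], rip: int, rbp: int,
               stack_lo: int, stack_hi: int,
               max_depth: int = 64) -> Tuple[List[int], bool]:
    """Follow the saved-frame-pointer chain.

    `memory` is a hypothetical stand-in for guest memory (address -> word).
    Returns (trace, complete); complete is False if the walk hit an
    address that cannot be a valid frame.
    """
    trace = [rip]
    for _ in range(max_depth):
        if rbp == 0:
            return trace, True  # assumed end-of-chain marker
        # A frame needs two aligned slots inside the known stack range:
        # the saved frame pointer at rbp, the return address at rbp + WORD.
        if rbp % WORD != 0 or not (stack_lo <= rbp <= stack_hi - 2 * WORD):
            return trace, False
        if rbp not in memory or rbp + WORD not in memory:
            return trace, False  # unmapped: reading here is what would crash
        next_rbp = memory[rbp]
        trace.append(memory[rbp + WORD])
        # The stack grows down, so each caller's frame must sit higher.
        if next_rbp != 0 and next_rbp <= rbp:
            return trace, False
        rbp = next_rbp
    return trace, False  # depth limit reached, treat as untrustworthy


# Two well-formed frames, then the end of the chain.
memory = {0x1000: 0x1020, 0x1008: 0xAAA, 0x1020: 0, 0x1028: 0xBBB}
good = walk_stack(memory, 0x111, 0x1000, 0x1000, 0x1040)
# Code built without frame pointers may leave arbitrary data in rbp.
bad = walk_stack(memory, 0x111, 0xDEADBEEF, 0x1000, 0x1040)
```

The point of the checks is that every address is validated before it is read, so a register that holds ordinary data instead of a frame pointer ends the walk with an "incomplete" flag rather than a fault.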
So, after thinking about this:
I have a patch for this ready, which seems to work well on x86. The patch also touches the ARM version, though, and I don't have a test platform for that right now. I hope I'll have time today or tomorrow to test it on ARM before pushing it out.
Could you try commit fab6d1e? It should fix the bus error, and instead show warnings whenever it tries to walk a stack and ends up with invalid addresses.
Thanks; it doesn't crash anymore. It prints this:
and
What's the meaning of the 0 after the stack trace?
The 0 effectively invalidates the printed stack trace, because it says "this trace occurred 0 times". The idea is that, since this stack walk didn't finish successfully, it's not trustworthy and most likely contains bogus addresses. So it still prints the trace (in case the user is interested in it, for example, for debugging), but further processing or aggregation tools (like your stackcollapse.pl) should not consider it. The good thing is that it now works as intended if it can't successfully walk stacks, and doesn't crash any more. The bad thing is that it's a bit underwhelming: I would have expected it to at least pick up stack traces when its sampling hits stacks of code that was compiled with frame pointers. Are all of them invalidated with a 0?
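As a rough illustration of what "should not consider them" means for a downstream tool, the sketch below sums sample counts per stack and skips any trace whose count is 0. The `fold_samples` name and the already-parsed `(frames, count)` input are assumptions for this example, not uniprof's actual output format; the result uses the semicolon-separated folded form that flame graph tooling consumes.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def fold_samples(samples: Iterable[Tuple[Sequence[str], int]]) -> List[str]:
    """Aggregate (frames, count) pairs into folded 'a;b;c N' lines.

    Traces with a count of 0 mark stack walks that did not finish,
    so they are dropped instead of being aggregated.
    """
    totals: Counter = Counter()
    for frames, count in samples:
        if count <= 0:
            continue  # invalidated trace: keep it out of the profile
        totals[";".join(frames)] += count
    return [f"{stack} {n}" for stack, n in sorted(totals.items())]


# Hypothetical parsed samples, root frame first.
samples = [
    (["main", "work"], 1),
    (["main", "work"], 2),
    (["main", "0xdeadbeef"], 0),  # failed walk, printed only for debugging
]
folded = fold_samples(samples)
```

If every trace carries a 0, as reported in this thread, the folded output is simply empty, which is a quick way to spot that no stack walk completed.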
ok, thanks. All 0:
So it does seem to hit only two different stacks. I wonder whether both of them happen to be frame-pointer-less?
I was running a microbenchmark I'd compiled with -fno-omit-frame-pointer.
Alright, then I'll have to dig deeper into the issue. Sounds like there might be a problem with translating the addresses, or with traversing the stack correctly. The tool's really been mostly used as a unikernel profiler so far, so I have done little testing on "real" OSs. At least the crash bug seems resolved. Out of curiosity, did you test uniprof on a Linux VM because that's what you had available most conveniently? I ask because I've been wondering how useful the tool actually is for full-blown VMs, where you could profile your applications from inside the VM with the many tools that are already available for application profiling. That's one of the reasons I never looked much into it for uniprof. For unikernels, the need is much more obvious because there's a lack of tools, and you generally can't simply log into the VM and run perf.
I just happened to have a Linux VM handy, and tried it out. You're right that this is not the main use case. A number of us have speculated about what it would take for a dom0 profiler to work on all OSes, without needing to log in and run a profiler in the guest (although we would still need the guest's symbol tables). But that's secondary.
Just mucking around with an existing Xen server and Linux guests:
Of course, I also had to unpause the domain to let it continue:
If it's crashing when trying to walk frame pointers, then sure, Linux is likely running things that aren't using them...
It doesn't crash all the time. It does sometimes work.