Implement kallsyms for vmlinux only #177
This [1] patch series includes kallsyms related symbols in the vmcoreinfo note of the kernel. These symbols can be used to locate, load, and interpret the kernel symbol table. Read them into the drgn program when vmcoreinfo is parsed. [1]: https://lore.kernel.org/lkml/YoTIMEPAxLF9t2eo@MiWiFi-R3L-srv/ Signed-off-by: Stephen Brennan <[email protected]>
If kallsyms symbols are found in the vmcoreinfo, and DWARF info is not found, then we can try to parse the kallsyms to provide symbol data. This commit adds a kernel_info system which will contain any internal debuginfo, and adds a kallsyms registry as the first component of the kernel info. At initialization time, when DWARF is not available, parse the kallsyms into an array of symbols, addresses, and symbol types. Signed-off-by: Stephen Brennan <[email protected]>
Unlike type or object lookup, symbol lookup is not modular. To enable kallsyms as a source for systems, we need a pluggable Symbol Finder API. Define this symbol finder API as a flexible search which allows looking up one or more symbols, listing all symbols, as well as conditioning based on name, address, or both. Refactor the ELF symbol search to fit this API, and build the existing symbol lookups on top of it. Leave drgn_program_find_symbol_by_address_internal() alone -- at least for now. It includes a Dwfl_Module argument which can't be provided to the Symbol Finder API. Signed-off-by: Stephen Brennan <[email protected]>
With a Symbol Finder API now in place, implement the API for kallsyms, and register the finder when the kallsyms registry is created. Signed-off-by: Stephen Brennan <[email protected]>
f753cf7 to 3bf70cc
static struct drgn_error *
elf_symbols_search(const char *name, uint64_t addr, enum drgn_find_symbol_flags flags,
		   void *data, union drgn_find_symbol_result *ret);
Hm reminder for myself, this needs to go into a header. Sorry for missing that.
Over the weekend Andrew Morton moved my vmcoreinfo updates into his mm-nonmm-stable queue, which I think will be included in his pull requests for the 5.20 merge window. So hopefully this will be applicable to upstream kernels soon!
Awesome! I'm currently working on adding some BPF tests to unblock #152, and then this will be the next thing I turn my focus to.
I gave this a high-level look and it looks really nice overall. The kallsyms format is more complicated than I expected, but there's not much we can do about that :) The big thing we need to figure out is when we should load and use kallsyms. It looks like this PR loads the main vmlinux kallsyms if debug info for any module is missing, which makes sense for this proof of concept. But presumably, kallsyms is only useful if we don't have the symbols available for vmlinux, right? And similarly for kernel modules, we probably only want the kallsyms for modules whose symbols we didn't find? I'm thinking that the ideal strategy would be to lazily load kallsyms the first time you try to look up a symbol in a module (vmlinux or kernel module) that doesn't have symbols available from a file. Otherwise, we're potentially going to be wasting work and memory loading the kallsyms data. What do you think? How expensive is loading kallsyms, CPU- and memory-wise? If this is the strategy we want to go with, the mechanism to make it happen might be a little tricky. That'll need more discussion.
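For illustration, the lazy-loading strategy described in the comment above could be sketched roughly as follows. This is not drgn code: `LazyKallsyms`, `Module`, and `parse_kallsyms` are hypothetical names, and symbol tables are modeled as plain dictionaries from name to address. The point is only that the expensive parse runs on the first lookup in a module that has no symbols from a file, and never otherwise.

```python
from typing import Callable, Dict, Optional


class LazyKallsyms:
    """Defers an expensive symbol-table load until the first lookup (hypothetical)."""

    def __init__(self, loader: Callable[[], Dict[str, int]]) -> None:
        self._loader = loader
        self._table: Optional[Dict[str, int]] = None
        self.load_count = 0  # only here to make the laziness observable

    def lookup(self, name: str) -> Optional[int]:
        if self._table is None:
            # First use: pay the parsing cost once and cache the result.
            self._table = self._loader()
            self.load_count += 1
        return self._table.get(name)


class Module:
    """A vmlinux or kernel module, with or without symbols from a file."""

    def __init__(self, name: str, file_symbols: Optional[Dict[str, int]],
                 fallback: LazyKallsyms) -> None:
        self.name = name
        self.file_symbols = file_symbols
        self.fallback = fallback

    def symbol(self, name: str) -> Optional[int]:
        if self.file_symbols is not None:
            # Symbols from a file are available, so kallsyms is never touched.
            return self.file_symbols.get(name)
        return self.fallback.lookup(name)


def parse_kallsyms() -> Dict[str, int]:
    # Stand-in for the real (expensive) kallsyms parsing step.
    return {"schedule": 0x2000, "jiffies": 0x3000}


# Case 1: vmlinux has symbols from a file, so kallsyms is never loaded.
unused_kallsyms = LazyKallsyms(parse_kallsyms)
with_file = Module("vmlinux", {"schedule": 0x2000}, unused_kallsyms)
addr_from_file = with_file.symbol("schedule")

# Case 2: no symbols from a file, so the first lookup triggers one load.
used_kallsyms = LazyKallsyms(parse_kallsyms)
without_file = Module("vmlinux", None, used_kallsyms)
addr_first = without_file.symbol("schedule")
addr_second = without_file.symbol("jiffies")
```

A real mechanism would also have to decide which kallsyms data belongs to which module, which is the tricky part the comment refers to.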
Yeah, the integration with the other debuginfo sources is going to be the trickier part of this whole thing. I had envisioned this system as a binary choice - either all DWARF debuginfo with the ELF symbol table, or kallsyms with the BTF type info. I hadn't really given any thought to the option of using one to supplement the other when it's unavailable. In terms of performance overhead, did a
This gives an overhead of 24 milliseconds ((11.218 - 6.384) / 200 = 0.02417) for the kallsyms loading as currently written. In terms of memory, I loaded the same vmcore (it's 5.4 based, Oracle UEK6, but I think it's reasonably representative for an average distribution kernel) and saw the allocations were:
All told, about 3.35 MiB of allocations on this kernel, and of course this is without modules. So there's real overhead here in CPU and memory, but it's not outrageous. The module symbol loading will add more memory and a bit of CPU overhead, but should be easier on both, because there are fewer symbols and also none of the compression to undo. I don't know where this leaves us regarding lazy loading kallsyms. Compared to DWARF handling, I'd assume this is still faster, so it makes sense to fall back to it when there's no DWARF. Module kallsyms could be supported without needing to do any of the complex parsing, so it should be faster, and it wouldn't depend on vmlinux kallsyms. However, module kallsyms would still depend on type information provided by DWARF or BTF. So we'd need to be mindful of all kinds of dependencies for using kallsyms, which might get a bit ... unwieldy. I do think it could be valuable to put some of the policy into separate
To try to clarify dependencies a bit:
Great, I just wanted to note that I would like to see I noticed
where
Hi @marxin, this branch is quite old. It actually has been partially rebased and improved already. Let me try to break down what the current state is.
So to sum up, what I'd like to do for this is:
Heh, crash has a few tricks up its sleeve. If you look at Finally - do you want an invite to the Linux Kernel Debug slack? Some lower-latency drgn development discussions can happen there, and there's some community from outside the drgn world (libkdumpfile and crash folks, as well as Delphix/sdb folks, and others who are interested in this stuff). Essentially the same community as the monthly meetings (which are currently on hold for a couple months). If so, just shoot me an email via the commit author email from any of my branches/commits, and I'll get you set up :)
Hey. This sounds like a feasible solution one can use in the future. Actually, it's nothing super-important, I was only interested in where crash finds the symbol name and I realized it's something provided by kallsyms.
Thank you for the offer! However, my interest in drgn is only a side project and I'll be leaving my employer soon. On the other hand, there are my current colleagues @tehcaster and @Werkov who are interested in drgn and its possible use as a
@marxin you can check out the rebased branch, the one which contains both the symbol finder API, and the kallsyms symbol finder API, in my For this pull request, I'll close it for two reasons. First, I've rebased the branch and now this one is outdated. Second, whatever happens with the symbol API, kallsyms finders, and CTF implementation described there... it all depends on the module API rework which is nearing completion. So the final implementation won't look like the branch here. I'm closing my PRs and just describing the current status on issue #176 with links to current branches. Hopefully that will generate less noise in the issue / PR board.
Oh and @tehcaster and @Werkov, please do feel free to reach out to me (see the committer email in one of my branches) and I'll gladly add you to the slack. I strongly feel that we all should be working together on debugging tools :)
Hi Omar,
I've been reworking my very work-in-progress branch for BTF & kallsyms, and I sketched out my approach in #176. This pull request implements step 1 - kallsyms parsing for vmlinux only. Using symbols from vmcoreinfo (patches for that look to be accepted), this parses and reads the compressed kallsyms data and extracts it to some easily-queried arrays.
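To give a feel for what "compressed kallsyms data" means, here is a simplified sketch of the token-table expansion, not the code in this pull request. It assumes the classic layout: each entry in the names table is a one-byte length followed by that many token ids, each token id selects a NUL-terminated string in the token table via an offset index, and the first character of the expanded string is the symbol type. The tiny tables below are synthetic, the function names are hypothetical, and the longer length encoding that newer kernels can use for very long symbols is ignored.

```python
from typing import List, Tuple


def expand_symbol(names: bytes, offset: int, token_table: bytes,
                  token_index: List[int]) -> Tuple[str, str, int]:
    """Decode one entry; return (type_char, name, offset_of_next_entry)."""
    length = names[offset]  # assumed: a single length byte per entry
    offset += 1
    out = bytearray()
    for token_id in names[offset:offset + length]:
        start = token_index[token_id]
        end = token_table.index(b"\0", start)  # tokens are NUL-terminated
        out += token_table[start:end]
    text = out.decode("ascii")
    # The first expanded character is the symbol type, the rest is the name.
    return text[0], text[1:], offset + length


def decode_all(names: bytes, token_table: bytes, token_index: List[int],
               count: int) -> List[Tuple[str, str]]:
    """Walk the names table and return (type, name) for `count` symbols."""
    symbols = []
    offset = 0
    for _ in range(count):
        kind, name, offset = expand_symbol(names, offset, token_table, token_index)
        symbols.append((kind, name))
    return symbols


# Synthetic data: five tokens and two symbols built from them.
TOKENS = [b"T", b"do_", b"fork", b"D", b"jiffies"]
TOKEN_TABLE = b"".join(t + b"\0" for t in TOKENS)
TOKEN_INDEX = []
_pos = 0
for _t in TOKENS:
    TOKEN_INDEX.append(_pos)
    _pos += len(_t) + 1

# Entry 1: three tokens (0, 1, 2). Entry 2: two tokens (3, 4).
NAMES = bytes([3, 0, 1, 2, 2, 3, 4])

decoded = decode_all(NAMES, TOKEN_TABLE, TOKEN_INDEX, 2)
```

Decoding every entry once up front and storing the results in flat arrays is what makes later queries cheap.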
To make this patch have a functional, testable result, I've refactored the ELF symbol code which I had earlier worked on -- it is now modular. Then, I implemented the new symbol API for kallsyms. I'm not absolutely wedded to the particular API design I chose, but I wanted to start somewhere. I don't have a ton of work building off of this patch series yet, so we can iterate on the API if you have better ideas.
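As a rough picture of the modular design, the sketch below models a symbol finder as a callable taking a name, an address, and flags, loosely mirroring the shape of the `elf_symbols_search()` prototype in the diff. Everything here is hypothetical and written in Python for brevity: `FindSymbolFlags`, `Symbol`, `KallsymsFinder`, and `SymbolRegistry` are illustrative names, not drgn's API. The address lookup assumes a symbol covers everything up to the next symbol's address, since kallsyms for vmlinux carries no sizes.

```python
import bisect
import enum
from dataclasses import dataclass
from typing import List, Optional


class FindSymbolFlags(enum.Flag):
    NAME = 1  # match on name
    ADDR = 2  # match on address
    ONE = 4   # return at most one symbol


@dataclass(frozen=True)
class Symbol:
    name: str
    address: int
    kind: str


class KallsymsFinder:
    """Finder backed by arrays sorted by address (hypothetical)."""

    def __init__(self, symbols: List[Symbol]) -> None:
        self._symbols = sorted(symbols, key=lambda s: s.address)
        self._addresses = [s.address for s in self._symbols]

    def __call__(self, name: Optional[str], addr: Optional[int],
                 flags: FindSymbolFlags) -> List[Symbol]:
        if flags & FindSymbolFlags.ADDR:
            # Last symbol starting at or before addr, found by binary search.
            i = bisect.bisect_right(self._addresses, addr) - 1
            candidates = [self._symbols[i]] if i >= 0 else []
        else:
            candidates = list(self._symbols)
        if flags & FindSymbolFlags.NAME:
            candidates = [s for s in candidates if s.name == name]
        return candidates[:1] if flags & FindSymbolFlags.ONE else candidates


class SymbolRegistry:
    """Queries each registered finder in registration order."""

    def __init__(self) -> None:
        self._finders = []

    def register(self, finder) -> None:
        self._finders.append(finder)

    def search(self, name: Optional[str] = None, addr: Optional[int] = None,
               flags: FindSymbolFlags = FindSymbolFlags(0)) -> List[Symbol]:
        results: List[Symbol] = []
        for finder in self._finders:
            results.extend(finder(name, addr, flags))
            if results and flags & FindSymbolFlags.ONE:
                return results[:1]
        return results


registry = SymbolRegistry()
registry.register(KallsymsFinder([
    Symbol("do_fork", 0x1000, "T"),
    Symbol("schedule", 0x2000, "T"),
    Symbol("jiffies", 0x3000, "D"),
]))

by_name = registry.search(name="schedule",
                          flags=FindSymbolFlags.NAME | FindSymbolFlags.ONE)
by_addr = registry.search(addr=0x2010,
                          flags=FindSymbolFlags.ADDR | FindSymbolFlags.ONE)
everything = registry.search()
```

With this shape, an ELF-backed finder and a kallsyms-backed finder can sit behind the same search call, and callers do not need to know which one answered.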
So now, we're able to load a vmcore without any debuginfo or ELF symbol table, and do the following: