Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] progressive progress reporting for darwin kernel/firmware #98845

Conversation

jasonmolenda
Copy link
Collaborator

When doing firmware/kernel debugging, it is frequent that binaries and debug info need to be retrieved / downloaded, and the lack of progress reports made for a poor experience, with lldb seemingly hung while downloading things over the network. This PR adds progress reports to the critical sites for these use cases.

When doing firmware/kernel debugging, it is frequent that
binaries and debug info need to be retrieved / downloaded,
and the lack of progress reports made for a poor experience,
with lldb seemingly hung while downloading things over the
network. This PR adds progress reports to the critical sites
for these use cases.
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 15, 2024

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Changes

When doing firmware/kernel debugging, it is frequent that binaries and debug info need to be retrieved / downloaded, and the lack of progress reports made for a poor experience, with lldb seemingly hung while downloading things over the network. This PR adds progress reports to the critical sites for these use cases.


Full diff: https://github.com/llvm/llvm-project/pull/98845.diff

2 Files Affected:

  • (modified) lldb/source/Core/DynamicLoader.cpp (+24-3)
  • (modified) lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp (+25-14)
diff --git a/lldb/source/Core/DynamicLoader.cpp b/lldb/source/Core/DynamicLoader.cpp
index 7871be6fc451d..a59136381c23b 100644
--- a/lldb/source/Core/DynamicLoader.cpp
+++ b/lldb/source/Core/DynamicLoader.cpp
@@ -13,6 +13,7 @@
 #include "lldb/Core/ModuleList.h"
 #include "lldb/Core/ModuleSpec.h"
 #include "lldb/Core/PluginManager.h"
+#include "lldb/Core/Progress.h"
 #include "lldb/Core/Section.h"
 #include "lldb/Symbol/ObjectFile.h"
 #include "lldb/Target/MemoryRegionInfo.h"
@@ -195,20 +196,40 @@ ModuleSP DynamicLoader::LoadBinaryWithUUIDAndAddress(
   Target &target = process->GetTarget();
   Status error;
 
+  StreamString prog_str;
+  if (!name.empty()) {
+    prog_str << name.str() << " ";
+  }
+  if (uuid.IsValid())
+    prog_str << uuid.GetAsString();
+  if (value_is_offset == 0 && value != LLDB_INVALID_ADDRESS) {
+    prog_str << "at ";
+    prog_str.PutHex64(value);
+  }
+
   if (!uuid.IsValid() && !value_is_offset) {
+    Progress progress_memread("Reading load commands from memory",
+                              prog_str.GetString().str());
     memory_module_sp = ReadUnnamedMemoryModule(process, value, name);
 
-    if (memory_module_sp)
+    if (memory_module_sp) {
       uuid = memory_module_sp->GetUUID();
+      if (uuid.IsValid()) {
+        prog_str << " ";
+        prog_str << uuid.GetAsString();
+      }
+    }
   }
   ModuleSpec module_spec;
   module_spec.GetUUID() = uuid;
   FileSpec name_filespec(name);
-  if (FileSystem::Instance().Exists(name_filespec))
-    module_spec.GetFileSpec() = name_filespec;
 
   if (uuid.IsValid()) {
+    Progress progress("Locating external symbol file",
+                      prog_str.GetString().str());
+
     // Has lldb already seen a module with this UUID?
+    // Or have external lookup enabled in DebugSymbols on macOS.
     if (!module_sp)
       error = ModuleList::GetSharedModule(module_spec, module_sp, nullptr,
                                           nullptr, nullptr);
diff --git a/lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp b/lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
index 8d83937aab668..93eb1e7b9dea9 100644
--- a/lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
+++ b/lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
@@ -13,6 +13,7 @@
 #include "lldb/Core/Module.h"
 #include "lldb/Core/ModuleSpec.h"
 #include "lldb/Core/PluginManager.h"
+#include "lldb/Core/Progress.h"
 #include "lldb/Core/Section.h"
 #include "lldb/Interpreter/OptionValueProperties.h"
 #include "lldb/Symbol/ObjectFile.h"
@@ -757,6 +758,23 @@ bool DynamicLoaderDarwinKernel::KextImageInfo::LoadImageUsingMemoryModule(
     const ModuleList &target_images = target.GetImages();
     m_module_sp = target_images.FindModule(m_uuid);
 
+    std::unique_ptr<Progress> progress_up;
+    if (IsKernel()) {
+      StreamString prog_str;
+      // 'mach_kernel' is a fake name we make up to find kernels
+      // that were located by the local filesystem scan.
+      if (GetName() != "mach_kernel")
+        prog_str << GetName() << " ";
+      if (GetUUID().IsValid())
+        prog_str << GetUUID().GetAsString() << " ";
+      if (GetLoadAddress() != LLDB_INVALID_ADDRESS) {
+        prog_str << "at ";
+        prog_str.PutHex64(GetLoadAddress());
+      }
+      progress_up = std::make_unique<Progress>("Loading kernel",
+                                               prog_str.GetString().str());
+    }
+
     // Search for the kext on the local filesystem via the UUID
     if (!m_module_sp && m_uuid.IsValid()) {
       ModuleSpec module_spec;
@@ -1058,12 +1076,9 @@ void DynamicLoaderDarwinKernel::LoadKernelModuleIfNeeded() {
         }
       }
     }
-
-    if (m_kernel.GetLoadAddress() != LLDB_INVALID_ADDRESS) {
-      if (!m_kernel.LoadImageUsingMemoryModule(m_process)) {
+    if (m_kernel.GetLoadAddress() != LLDB_INVALID_ADDRESS)
+      if (!m_kernel.LoadImageUsingMemoryModule(m_process))
         m_kernel.LoadImageAtFileAddress(m_process);
-      }
-    }
 
     // The operating system plugin gets loaded and initialized in
     // LoadImageUsingMemoryModule when we discover the kernel dSYM.  For a core
@@ -1352,14 +1367,17 @@ bool DynamicLoaderDarwinKernel::ParseKextSummaries(
     for (uint32_t new_kext = 0; new_kext < num_of_new_kexts; new_kext++) {
       if (to_be_added[new_kext]) {
         KextImageInfo &image_info = kext_summaries[new_kext];
-        bool kext_successfully_added = true;
         if (load_kexts) {
+          std::string prog_str = kext_summaries[new_kext].GetName();
+          prog_str += " ";
+          prog_str += kext_summaries[new_kext].GetUUID().GetAsString();
+
+          Progress progress("Loading kext", prog_str);
           if (!image_info.LoadImageUsingMemoryModule(m_process)) {
             kexts_failed_to_load.push_back(std::pair<std::string, UUID>(
                 kext_summaries[new_kext].GetName(),
                 kext_summaries[new_kext].GetUUID()));
             image_info.LoadImageAtFileAddress(m_process);
-            kext_successfully_added = false;
           }
         }
 
@@ -1369,13 +1387,6 @@ bool DynamicLoaderDarwinKernel::ParseKextSummaries(
             m_process->GetStopID() == image_info.GetProcessStopId())
           loaded_module_list.AppendIfNeeded(image_info.GetModule());
 
-        if (load_kexts) {
-          if (kext_successfully_added)
-            s.Printf(".");
-          else
-            s.Printf("-");
-        }
-
         if (log)
           kext_summaries[new_kext].PutToLog(log);
       }

@jasonmolenda
Copy link
Collaborator Author

The one thing I was uncertain of is when I load kexts -- I have a known number of kexts that I will be attempting to load -- and I wasn't sure if I should log them individually as they happen (what I did), or if I should initialize a progress category with the number of kexts that lldb will be attempting to load, so the progress of the job overall can be visualized.

@medismailben
Copy link
Member

The one thing I was uncertain of is when I load kexts -- I have a known number of kexts that I will be attempting to load -- and I wasn't sure if I should log them individually as they happen (what I did), or if I should initialize a progress category with the number of kexts that lldb will be attempting to load, so the progress of the job overall can be visualized.

If we know ahead of time how many kext will be loaded, I think it would be better to provide that number to the progress object constructor, and pass the name of the kext (and other relevant information) in the Increment upgraded_details string argument. Otherwise this is great! LGTM!

@medismailben medismailben self-requested a review July 15, 2024 00:51
LoadImageUsingMemoryModule.  Remove the other progress reporting
from ParseKextSummaries.  Add 0x prefix manually before PutHex64.

In LoadImageUsingMemoryModule add the kext/kernel name to the
ModuleSpec so it will be printed by the "Loading symbol file for
..." progress report in the generic DebugSymbols printer.  Remove
the logic that was only calling the Platform GetSharedModule with
the kext/kernel filename if it was PlatformDarwinKernel.  First,
it should always be PlatformDarwinKernel already, but even if it's
not, the kext bundle ID or placeholder "mach_kernel" will not match
a real file, the restriction to only use the name for
PlatformDarwinKernel was unnecessary.

If I have a correct UUID for the kext/kernel, don't add a possibly
incorrect ArchSpec to the ModuleSpec.
@jasonmolenda
Copy link
Collaborator Author

Ah no, I misunderstood. The Increment method of Progress is intended for use where you have one progress status display that takes multiple steps to complete. It's a separate mechanism from "finding dSYM for file x" etc. I can't use an Increment style progress reporting in this PR.

if (!uuid.IsValid() && !value_is_offset) {
Progress progress_memread("Reading load commands from memory",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change this string to "Reading object file from memory" instead of "Reading load commands from memory". This function is currently only used by Mach-O core files and the GDB remote stub, but this could be used for loading ELF files from memory as well.

Maybe this Progress dialog should go into ReadUnnamedMemoryModule so anyone that calls this function gets progress events when loading an object file from memory?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. I think I'll conditionalize this progress with a if (!process->GetCoreFile()) -- reading a binary out of a local corefile will be very fast. Over a live gdb-remote connection, not so much.

Comment on lines 228 to 232
Progress progress("Locating external symbol file",
prog_str.GetString().str());

// Has lldb already seen a module with this UUID?
// Or have external lookup enabled in DebugSymbols on macOS.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go into ModuleList::GetSharedModule() or at some function that is called by ModuleList::GetSharedModule()? If the ModuleList::GetSharedModule() function is really quick, I hate to add really spammy progress notifications that aren't useful. If there is code down in ModuleList::GetSharedModule() that locates an external symbol file, then the progress should go in that function that is known to take some amount of time, but not here in a general area IMHO

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little tricky because we do call ModuleList::GetSharedModule which will return a module if it's already in the global module cache, and it will call into the DebugSymbols framework on macOS, where it might call an external program to do a slow copy of a binary to the local computer. But after that, it then goes on to call LocateExecutableSymbolFile LocateExecutableObjectFile looking in known local filesystem locations. If that fails, then it calls DownloadObjectAndSymbolFile (force_symbol_search=true) which can call out to an external program to do a slow copy of a binary to the local computer.

I wanted to log one message to cover possibly all of these being run.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern here is this will spam the progress notifications and also be a bit misleading. The title says "Locating external symbol file" when what this is actually doing is loading the module from a module_spec. It might end up locating an external symbol file. We know reading modules from memory is slow, so fine to make a progress for this, but It would be better to actually put this in the DownloadObjectAndSymbolFile() with this label, or change the label to something more like "Loading module" with the detail being the prog_str

Comment on lines +784 to +785
if (!m_uuid.IsValid())
module_spec.GetArchitecture() = target.GetArchitecture();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to check if the UUID is valid in the if statement and then set the architecture?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have a UUID, it's authoritative, whereas the ArchSpec might be heuristically determined. I don't like setting both in a ModuleSpec if the UUID is valid, it it noramlly fine but it's a little footgun waiting for some unusual combination where the heuristically determined ArchSpec is not quite the same ("compatible") with the arch of the UUID specified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I shouldn't be making changes like this at the same time as adding progress status logging, but it irked me when I saw it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to be clear, it's almost always the "environment" or "os" part of the triple, which is nearly an entire fiction with firmware style debugging, that is the problem. One binary will say "hey I'm iOS" and another binary that needs to be loaded also is like "I'm something else" and lldb will reject the module load even though the UUIDs match.

Move the logging about reading a binary from memory from
DynamicLoader::LoadBinaryWithUUIDAndAddress into
Process::ReadModuleFromMemory.  Rename the progress title
to not be mach-o specific.  Only issue the progress update
if this is a non-corefile Process; lldb will read the
binary out of the corefile so fast that logging is not
useful.
@jasonmolenda
Copy link
Collaborator Author

Thanks for the helpful feedback @clayborg pushed an update.

In `ParseKextSummaries()` I know how many kexts will be loaded,
create a Progress object there with that number, pass it in to
`LoadImageUsingMemoryModule()`, and issue progressive updates if
the caller passed in an existing Progress object.  Else do a one-off
progress update.  This give us a 1-out-of-n number for the updates,
e.g.

[30/180] Loading kext: com.apple.Libm 5D39F799-ADEA-3CAE-8A7B-830CE3EAEF72 at 0xfffffe002aa6b98..

Thanks to Chelsea for pointing out how I could use the API more
correctly.
@@ -2545,6 +2546,11 @@ ModuleSP Process::ReadModuleFromMemory(const FileSpec &file_spec,
ModuleSP module_sp(new Module(file_spec, ArchSpec()));
if (module_sp) {
Status error;
std::unique_ptr<Progress> progress_up;
if (!GetCoreFile())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to use the bool accessor function:

if (IsLiveDebugSession())

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is a little misleading, IsLiveDebugSession is only false for PostMortemProcess. Maybe we should have a Process::IsCorefileProcess or something, or I could put a comment there about what I'm doing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostMortemProcess is what all of the core file processes (elf, mach-o and minidump) inherit from. That way all core file processes don't forget to override this function. I agree the name isn't the most clear.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops, my bad on that one, thanks. I'll change to use this instead, it's clearer.

Comment on lines 228 to 232
Progress progress("Locating external symbol file",
prog_str.GetString().str());

// Has lldb already seen a module with this UUID?
// Or have external lookup enabled in DebugSymbols on macOS.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern here is this will spam the progress notifications and also be a bit misleading. The title says "Locating external symbol file" when what this is actually doing is loading the module from a module_spec. It might end up locating an external symbol file. We know reading modules from memory is slow, so fine to make a progress for this, but It would be better to actually put this in the DownloadObjectAndSymbolFile() with this label, or change the label to something more like "Loading module" with the detail being the prog_str

Copy link

github-actions bot commented Jul 16, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -2545,6 +2546,11 @@ ModuleSP Process::ReadModuleFromMemory(const FileSpec &file_spec,
ModuleSP module_sp(new Module(file_spec, ArchSpec()));
if (module_sp) {
Status error;
std::unique_ptr<Progress> progress_up;
if (!GetCoreFile())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostMortemProcess is what all of the core file processes (elf, mach-o and minidump) inherit from. That way all core file processes don't forget to override this function. I agree the name isn't the most clear.

@jasonmolenda jasonmolenda merged commit 86ef699 into llvm:main Jul 17, 2024
4 of 5 checks passed
@jasonmolenda jasonmolenda deleted the add-darwin-kernel-debugging-and-firmware-progress-updates branch July 17, 2024 17:05
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
…m#98845)

When doing firmware/kernel debugging, it is frequent that binaries and
debug info need to be retrieved / downloaded, and the lack of progress
reports made for a poor experience, with lldb seemingly hung while
downloading things over the network. This PR adds progress reports to
the critical sites for these use cases.
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
)

Summary:
When doing firmware/kernel debugging, it is frequent that binaries and
debug info need to be retrieved / downloaded, and the lack of progress
reports made for a poor experience, with lldb seemingly hung while
downloading things over the network. This PR adds progress reports to
the critical sites for these use cases.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251595
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants