Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Implement WebAssembly debugging #77949

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

paolosevMSFT
Copy link
Contributor

Add support for source-level debugging of WebAssembly code that runs in a WebAssembly engine.

The idea is to use the GDB-remote protocol to connect to a Wasm engine that implements a GDB-remote stub that offers the ability to access the engine runtime internal state. This protocol is already supported by the WebAssembly Micro Runtime and the WAMR team greatly contributed to this work.

A WebAssembly debugging session can be started using the new command:
wasm [<hostname>:]<portnum>

The implementation is almost completely contained in Plugin classes and the only Wasm-specific changes to Core classes is in class DWARFExpression that handles a new Wasm-specific DWARF opcode (DW_OP_WASM_location) added to the DWARF standard.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2024

@llvm/pr-subscribers-lldb

Author: Paolo Severini (paolosevMSFT)

Changes

Add support for source-level debugging of WebAssembly code that runs in a WebAssembly engine.

The idea is to use the GDB-remote protocol to connect to a Wasm engine that implements a GDB-remote stub that offers the ability to access the engine runtime internal state. This protocol is already supported by the WebAssembly Micro Runtime and the WAMR team greatly contributed to this work.

A WebAssembly debugging session can be started using the new command:
wasm [&lt;hostname&gt;:]&lt;portnum&gt;

The implementation is almost completely contained in Plugin classes and the only Wasm-specific changes to Core classes is in class DWARFExpression that handles a new Wasm-specific DWARF opcode (DW_OP_WASM_location) added to the DWARF standard.


Patch is 38.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77949.diff

16 Files Affected:

  • (modified) lldb/include/lldb/Target/Process.h (+40)
  • (modified) lldb/source/Core/Value.cpp (+1-1)
  • (modified) lldb/source/Expression/DWARFExpression.cpp (+41)
  • (modified) lldb/source/Interpreter/CommandInterpreter.cpp (+18)
  • (modified) lldb/source/Plugins/Process/CMakeLists.txt (+1)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp (+6-1)
  • (modified) lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h (+2)
  • (added) lldb/source/Plugins/Process/wasm/CMakeLists.txt (+12)
  • (added) lldb/source/Plugins/Process/wasm/ProcessWasm.cpp (+291)
  • (added) lldb/source/Plugins/Process/wasm/ProcessWasm.h (+129)
  • (added) lldb/source/Plugins/Process/wasm/ThreadWasm.cpp (+55)
  • (added) lldb/source/Plugins/Process/wasm/ThreadWasm.h (+44)
  • (added) lldb/source/Plugins/Process/wasm/UnwindWasm.cpp (+76)
  • (added) lldb/source/Plugins/Process/wasm/UnwindWasm.h (+55)
  • (added) lldb/source/Plugins/Process/wasm/wasmRegisterContext.cpp (+108)
  • (added) lldb/source/Plugins/Process/wasm/wasmRegisterContext.h (+75)
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index 24c599e044c78f..587ae085b479b7 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -1548,6 +1548,46 @@ class Process : public std::enable_shared_from_this<Process>,
   virtual size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
                             Status &error);
 
+  /// Read of memory from a process.
+  ///
+  /// This function will read memory from the current process's address space
+  /// and remove any traps that may have been inserted into the memory.
+  ///
+  /// This overloads accepts an ExecutionContext as additional argument. By
+  /// default, it calls the previous overload without the ExecutionContext
+  /// argument, but it can be overridden by Process subclasses.
+  ///
+  /// \param[in] vm_addr
+  ///     A virtual load address that indicates where to start reading
+  ///     memory from.
+  ///
+  /// \param[out] buf
+  ///     A byte buffer that is at least \a size bytes long that
+  ///     will receive the memory bytes.
+  ///
+  /// \param[in] size
+  ///     The number of bytes to read.
+  ///
+  /// \param[in] exe_ctx
+  ///    The current execution context, if available.
+  ///
+  /// \param[out] error
+  ///     An error that indicates the success or failure of this
+  ///     operation. If error indicates success (error.Success()),
+  ///     then the value returned can be trusted, otherwise zero
+  ///     will be returned.
+  ///
+  /// \return
+  ///     The number of bytes that were actually read into \a buf. If
+  ///     the returned number is greater than zero, yet less than \a
+  ///     size, then this function will get called again with \a
+  ///     vm_addr, \a buf, and \a size updated appropriately. Zero is
+  ///     returned in the case of an error.
+  virtual size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+                            ExecutionContext *exe_ctx, Status &error) {
+    return ReadMemory(vm_addr, buf, size, error);
+  }
+
   /// Read of memory from a process.
   ///
   /// This function has the same semantics of ReadMemory except that it
diff --git a/lldb/source/Core/Value.cpp b/lldb/source/Core/Value.cpp
index 995cc934c82044..47a5fdee773886 100644
--- a/lldb/source/Core/Value.cpp
+++ b/lldb/source/Core/Value.cpp
@@ -552,7 +552,7 @@ Status Value::GetValueAsData(ExecutionContext *exe_ctx, DataExtractor &data,
 
         if (process) {
           const size_t bytes_read =
-              process->ReadMemory(address, dst, byte_size, error);
+              process->ReadMemory(address, dst, byte_size, exe_ctx, error);
           if (bytes_read != byte_size)
             error.SetErrorStringWithFormat(
                 "read memory from 0x%" PRIx64 " failed (%u of %u bytes read)",
diff --git a/lldb/source/Expression/DWARFExpression.cpp b/lldb/source/Expression/DWARFExpression.cpp
index fe4928d4f43a43..ca24611724dc7c 100644
--- a/lldb/source/Expression/DWARFExpression.cpp
+++ b/lldb/source/Expression/DWARFExpression.cpp
@@ -346,6 +346,16 @@ static offset_t GetOpcodeDataSize(const DataExtractor &data,
     return (offset - data_offset) + subexpr_len;
   }
 
+  case DW_OP_WASM_location: {
+    uint8_t wasm_op = data.GetU8(&offset);
+    if (wasm_op == 3) {
+      data.GetU32(&offset);
+    } else {
+      data.GetULEB128(&offset);
+    }
+    return offset - data_offset;
+  }
+
   default:
     if (!dwarf_cu) {
       return LLDB_INVALID_OFFSET;
@@ -2595,6 +2605,37 @@ bool DWARFExpression::Evaluate(
       break;
     }
 
+    case DW_OP_WASM_location: {
+      uint8_t wasm_op = opcodes.GetU8(&offset);
+      uint32_t index;
+
+      /* LLDB doesn't have an address space to represents WebAssembly locals,
+       * globals and operand stacks.
+       * We encode these elements into virtual registers:
+       *   | tag: 2 bits | index: 30 bits |
+       *   where tag is:
+       *    0: Not a WebAssembly location
+       *    1: Local
+       *    2: Global
+       *    3: Operand stack value
+       */
+      if (wasm_op == 3) {
+        index = opcodes.GetU32(&offset);
+        wasm_op = 1;
+      } else {
+        index = opcodes.GetULEB128(&offset);
+      }
+
+      reg_num = (((wasm_op + 1) & 0x03) << 30) | (index & 0x3fffffff);
+
+      if (ReadRegisterValueAsScalar(reg_ctx, reg_kind, reg_num, error_ptr, tmp))
+        stack.push_back(tmp);
+      else
+        return false;
+
+      break;
+    }
+
     default:
       if (dwarf_cu) {
         if (dwarf_cu->GetSymbolFileDWARF().ParseVendorDWARFOpcode(
diff --git a/lldb/source/Interpreter/CommandInterpreter.cpp b/lldb/source/Interpreter/CommandInterpreter.cpp
index 00651df48b6224..bcacc7aabb66ca 100644
--- a/lldb/source/Interpreter/CommandInterpreter.cpp
+++ b/lldb/source/Interpreter/CommandInterpreter.cpp
@@ -794,6 +794,24 @@ void CommandInterpreter::LoadCommandDictionary() {
     }
   }
 
+  std::unique_ptr<CommandObjectRegexCommand> connect_wasm_cmd_up(
+      new CommandObjectRegexCommand(
+          *this, "wasm",
+          "Connect to a WebAssembly process via remote GDB server.  "
+          "If no host is specifed, localhost is assumed.",
+          "wasm [<hostname>:]<portnum>", 0, false));
+  if (connect_wasm_cmd_up) {
+    if (connect_wasm_cmd_up->AddRegexCommand(
+            "^([^:]+|\\[[0-9a-fA-F:]+.*\\]):([0-9]+)$",
+            "process connect --plugin wasm connect://%1:%2") &&
+        connect_wasm_cmd_up->AddRegexCommand(
+            "^([[:digit:]]+)$",
+            "process connect --plugin wasm connect://localhost:%1")) {
+      CommandObjectSP command_sp(connect_wasm_cmd_up.release());
+      m_command_dict[std::string(command_sp->GetCommandName())] = command_sp;
+    }
+  }
+
   std::unique_ptr<CommandObjectRegexCommand> connect_kdp_remote_cmd_up(
       new CommandObjectRegexCommand(
           *this, "kdp-remote",
diff --git a/lldb/source/Plugins/Process/CMakeLists.txt b/lldb/source/Plugins/Process/CMakeLists.txt
index a51d0f7afd1759..be109a303e8669 100644
--- a/lldb/source/Plugins/Process/CMakeLists.txt
+++ b/lldb/source/Plugins/Process/CMakeLists.txt
@@ -19,3 +19,4 @@ add_subdirectory(elf-core)
 add_subdirectory(mach-core)
 add_subdirectory(minidump)
 add_subdirectory(FreeBSDKernel)
+add_subdirectory(wasm)
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index 316be471df9295..674bbc9ff4fd0b 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -1628,6 +1628,11 @@ void ProcessGDBRemote::ParseExpeditedRegisters(
   }
 }
 
+std::shared_ptr<ThreadGDBRemote>
+ProcessGDBRemote::CreateThread(lldb::tid_t tid) {
+  return std::make_shared<ThreadGDBRemote>(*this, tid);
+}
+
 ThreadSP ProcessGDBRemote::SetThreadStopInfo(
     lldb::tid_t tid, ExpeditedRegisterMap &expedited_register_map,
     uint8_t signo, const std::string &thread_name, const std::string &reason,
@@ -1652,7 +1657,7 @@ ThreadSP ProcessGDBRemote::SetThreadStopInfo(
 
     if (!thread_sp) {
       // Create the thread if we need to
-      thread_sp = std::make_shared<ThreadGDBRemote>(*this, tid);
+      thread_sp = CreateThread(tid);
       m_thread_list_real.AddThread(thread_sp);
     }
   }
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
index c1ea1cc7905587..0463e39b7a63a4 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
@@ -355,6 +355,8 @@ class ProcessGDBRemote : public Process,
   MonitorDebugserverProcess(std::weak_ptr<ProcessGDBRemote> process_wp,
                             lldb::pid_t pid, int signo, int exit_status);
 
+  virtual std::shared_ptr<ThreadGDBRemote> CreateThread(lldb::tid_t tid);
+
   lldb::StateType SetThreadStopInfo(StringExtractor &stop_packet);
 
   bool
diff --git a/lldb/source/Plugins/Process/wasm/CMakeLists.txt b/lldb/source/Plugins/Process/wasm/CMakeLists.txt
new file mode 100644
index 00000000000000..ef2bfc634962d5
--- /dev/null
+++ b/lldb/source/Plugins/Process/wasm/CMakeLists.txt
@@ -0,0 +1,12 @@
+add_lldb_library(lldbPluginProcessWasm PLUGIN
+  ProcessWasm.cpp
+  ThreadWasm.cpp
+  UnwindWasm.cpp
+  wasmRegisterContext.cpp
+
+  LINK_LIBS
+    lldbCore
+    ${LLDB_PLUGINS}
+  LINK_COMPONENTS
+    Support
+  )
diff --git a/lldb/source/Plugins/Process/wasm/ProcessWasm.cpp b/lldb/source/Plugins/Process/wasm/ProcessWasm.cpp
new file mode 100644
index 00000000000000..6e633e69b9767a
--- /dev/null
+++ b/lldb/source/Plugins/Process/wasm/ProcessWasm.cpp
@@ -0,0 +1,291 @@
+//===-- ProcessWasm.cpp ---------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ProcessWasm.h"
+#include "ThreadWasm.h"
+#include "lldb/Core/Module.h"
+#include "lldb/Core/PluginManager.h"
+#include "lldb/Utility/DataBufferHeap.h"
+
+#include "lldb/Target/UnixSignals.h"
+#include "llvm/ADT/ArrayRef.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace lldb_private::process_gdb_remote;
+using namespace lldb_private::wasm;
+
+LLDB_PLUGIN_DEFINE(ProcessWasm)
+
+// ProcessGDBRemote constructor
+ProcessWasm::ProcessWasm(lldb::TargetSP target_sp, ListenerSP listener_sp)
+    : ProcessGDBRemote(target_sp, listener_sp) {
+  /* always use linux signals for wasm process */
+  m_unix_signals_sp =
+      UnixSignals::Create(ArchSpec{"wasm32-unknown-unknown-wasm"});
+}
+
+void ProcessWasm::Initialize() {
+  static llvm::once_flag g_once_flag;
+
+  llvm::call_once(g_once_flag, []() {
+    PluginManager::RegisterPlugin(GetPluginNameStatic(),
+                                  GetPluginDescriptionStatic(), CreateInstance,
+                                  DebuggerInitialize);
+  });
+}
+
+void ProcessWasm::DebuggerInitialize(Debugger &debugger) {
+  ProcessGDBRemote::DebuggerInitialize(debugger);
+}
+
+llvm::StringRef ProcessWasm::GetPluginName() { return GetPluginNameStatic(); }
+
+llvm::StringRef ProcessWasm::GetPluginNameStatic() {
+  static ConstString g_name("wasm");
+  return g_name;
+}
+
+llvm::StringRef ProcessWasm::GetPluginDescriptionStatic() {
+  return "GDB Remote protocol based WebAssembly debugging plug-in.";
+}
+
+void ProcessWasm::Terminate() {
+  PluginManager::UnregisterPlugin(ProcessWasm::CreateInstance);
+}
+
+lldb::ProcessSP ProcessWasm::CreateInstance(lldb::TargetSP target_sp,
+                                            ListenerSP listener_sp,
+                                            const FileSpec *crash_file_path,
+                                            bool can_connect) {
+  lldb::ProcessSP process_sp;
+  if (crash_file_path == nullptr)
+    process_sp = std::make_shared<ProcessWasm>(target_sp, listener_sp);
+  return process_sp;
+}
+
+bool ProcessWasm::CanDebug(lldb::TargetSP target_sp,
+                           bool plugin_specified_by_name) {
+  if (plugin_specified_by_name)
+    return true;
+
+  Module *exe_module = target_sp->GetExecutableModulePointer();
+  if (exe_module) {
+    ObjectFile *exe_objfile = exe_module->GetObjectFile();
+    return exe_objfile->GetArchitecture().GetMachine() == llvm::Triple::wasm32;
+  }
+  // However, if there is no wasm module, we return false, otherwise,
+  // we might use ProcessWasm to attach gdb remote.
+  return false;
+}
+
+std::shared_ptr<ThreadGDBRemote> ProcessWasm::CreateThread(lldb::tid_t tid) {
+  return std::make_shared<ThreadWasm>(*this, tid);
+}
+
+size_t ProcessWasm::ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+                               Status &error) {
+  wasm_addr_t wasm_addr(vm_addr);
+
+  switch (wasm_addr.GetType()) {
+  case WasmAddressType::Memory:
+  case WasmAddressType::Object:
+    return ProcessGDBRemote::ReadMemory(vm_addr, buf, size, error);
+  case WasmAddressType::Invalid:
+  default:
+    error.SetErrorStringWithFormat(
+        "Wasm read failed for invalid address 0x%" PRIx64, vm_addr);
+    return 0;
+  }
+}
+
+size_t ProcessWasm::ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+                               ExecutionContext *exe_ctx, Status &error) {
+  wasm_addr_t wasm_addr(vm_addr);
+
+  switch (wasm_addr.GetType()) {
+  case WasmAddressType::Memory: {
+    // If we don't have a valid module_id, this is actually a read from the
+    // Wasm memory space. We can calculate the module_id from the execution
+    // context.
+    if (wasm_addr.module_id == 0 && exe_ctx != nullptr) {
+      StackFrame *frame = exe_ctx->GetFramePtr();
+      assert(frame->CalculateTarget()->GetArchitecture().GetMachine() ==
+             llvm::Triple::wasm32);
+      wasm_addr.module_id = wasm_addr_t(frame->GetStackID().GetPC()).module_id;
+      wasm_addr.type = WasmAddressType::Memory;
+    }
+    if (WasmReadMemory(wasm_addr.module_id, wasm_addr.offset, buf, size))
+      return size;
+    error.SetErrorStringWithFormat("Wasm memory read failed for 0x%" PRIx64,
+                                   vm_addr);
+    return 0;
+  }
+  case WasmAddressType::Object:
+    return ProcessGDBRemote::ReadMemory(vm_addr, buf, size, error);
+  case WasmAddressType::Invalid:
+  default:
+    error.SetErrorStringWithFormat(
+        "Wasm read failed for invalid address 0x%" PRIx64, vm_addr);
+    return 0;
+  }
+}
+
+size_t ProcessWasm::WasmReadMemory(uint32_t wasm_module_id, lldb::addr_t addr,
+                                   void *buf, size_t buffer_size) {
+  char packet[64];
+  int packet_len =
+      ::snprintf(packet, sizeof(packet), "qWasmMem:%d;%" PRIx64 ";%" PRIx64,
+                 wasm_module_id, static_cast<uint64_t>(addr),
+                 static_cast<uint64_t>(buffer_size));
+  assert(packet_len + 1 < (int)sizeof(packet));
+  UNUSED_IF_ASSERT_DISABLED(packet_len);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet, response,
+                                              GetInterruptTimeout()) ==
+      GDBRemoteCommunication::PacketResult::Success) {
+    if (response.IsNormalResponse()) {
+      return response.GetHexBytes(llvm::MutableArrayRef<uint8_t>(
+                                      static_cast<uint8_t *>(buf), buffer_size),
+                                  '\xdd');
+    }
+  }
+  return 0;
+}
+
+size_t ProcessWasm::WasmReadData(uint32_t wasm_module_id, lldb::addr_t addr,
+                                 void *buf, size_t buffer_size) {
+  char packet[64];
+  int packet_len =
+      ::snprintf(packet, sizeof(packet), "qWasmData:%d;%" PRIx64 ";%" PRIx64,
+                 wasm_module_id, static_cast<uint64_t>(addr),
+                 static_cast<uint64_t>(buffer_size));
+  assert(packet_len + 1 < (int)sizeof(packet));
+  UNUSED_IF_ASSERT_DISABLED(packet_len);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet, response,
+                                              GetInterruptTimeout()) ==
+      GDBRemoteCommunication::PacketResult::Success) {
+    if (response.IsNormalResponse()) {
+      return response.GetHexBytes(llvm::MutableArrayRef<uint8_t>(
+                                      static_cast<uint8_t *>(buf), buffer_size),
+                                  '\xdd');
+    }
+  }
+  return 0;
+}
+
+bool ProcessWasm::GetWasmLocal(int frame_index, int index, void *buf,
+                               size_t buffer_size, size_t &size) {
+  StreamString packet;
+  packet.Printf("qWasmLocal:");
+  packet.Printf("%d;%d", frame_index, index);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+      GDBRemoteCommunication::PacketResult::Success) {
+    return false;
+  }
+
+  if (!response.IsNormalResponse()) {
+    return false;
+  }
+
+  WritableDataBufferSP buffer_sp(
+      new DataBufferHeap(response.GetStringRef().size() / 2, 0));
+  response.GetHexBytes(buffer_sp->GetData(), '\xcc');
+  size = buffer_sp->GetByteSize();
+  if (size <= buffer_size) {
+    memcpy(buf, buffer_sp->GetBytes(), size);
+    return true;
+  }
+
+  return false;
+}
+
+bool ProcessWasm::GetWasmGlobal(int frame_index, int index, void *buf,
+                                size_t buffer_size, size_t &size) {
+  StreamString packet;
+  packet.PutCString("qWasmGlobal:");
+  packet.Printf("%d;%d", frame_index, index);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+      GDBRemoteCommunication::PacketResult::Success) {
+    return false;
+  }
+
+  if (!response.IsNormalResponse()) {
+    return false;
+  }
+
+  WritableDataBufferSP buffer_sp(
+      new DataBufferHeap(response.GetStringRef().size() / 2, 0));
+  response.GetHexBytes(buffer_sp->GetData(), '\xcc');
+  size = buffer_sp->GetByteSize();
+  if (size <= buffer_size) {
+    memcpy(buf, buffer_sp->GetBytes(), size);
+    return true;
+  }
+
+  return false;
+}
+
+bool ProcessWasm::GetWasmStackValue(int frame_index, int index, void *buf,
+                                    size_t buffer_size, size_t &size) {
+  StreamString packet;
+  packet.PutCString("qWasmStackValue:");
+  packet.Printf("%d;%d", frame_index, index);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+      GDBRemoteCommunication::PacketResult::Success) {
+    return false;
+  }
+
+  if (!response.IsNormalResponse()) {
+    return false;
+  }
+
+  WritableDataBufferSP buffer_sp(
+      new DataBufferHeap(response.GetStringRef().size() / 2, 0));
+  response.GetHexBytes(buffer_sp->GetData(), '\xcc');
+  size = buffer_sp->GetByteSize();
+  if (size <= buffer_size) {
+    memcpy(buf, buffer_sp->GetBytes(), size);
+    return true;
+  }
+
+  return false;
+}
+
+bool ProcessWasm::GetWasmCallStack(lldb::tid_t tid,
+                                   std::vector<lldb::addr_t> &call_stack_pcs) {
+  call_stack_pcs.clear();
+  StreamString packet;
+  packet.Printf("qWasmCallStack:");
+  packet.Printf("%llx", tid);
+  StringExtractorGDBRemote response;
+  if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+      GDBRemoteCommunication::PacketResult::Success) {
+    return false;
+  }
+
+  if (!response.IsNormalResponse()) {
+    return false;
+  }
+
+  addr_t buf[1024 / sizeof(addr_t)];
+  size_t bytes = response.GetHexBytes(
+      llvm::MutableArrayRef<uint8_t>((uint8_t *)buf, sizeof(buf)), '\xdd');
+  if (bytes == 0) {
+    return false;
+  }
+
+  for (size_t i = 0; i < bytes / sizeof(addr_t); i++) {
+    call_stack_pcs.push_back(buf[i]);
+  }
+  return true;
+}
diff --git a/lldb/source/Plugins/Process/wasm/ProcessWasm.h b/lldb/source/Plugins/Process/wasm/ProcessWasm.h
new file mode 100644
index 00000000000000..fc9736ec0684d2
--- /dev/null
+++ b/lldb/source/Plugins/Process/wasm/ProcessWasm.h
@@ -0,0 +1,129 @@
+//===-- ProcessWasm.h -------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+
+#include "Plugins/Process/gdb-remote/ProcessGDBRemote.h"
+#include "lldb/Target/RegisterContext.h"
+
+namespace lldb_private {
+namespace wasm {
+
+/// Each WebAssembly module has separated address spaces for Code and Memory.
+/// A WebAssembly module also has a Data section which, when the module is
+/// loaded, gets mapped into a region in the module Memory.
+/// For the purpose of debugging, we can represent all these separated 32-bit
+/// address spaces with a single virtual 64-bit address space.
+///
+/// Struct wasm_addr_t provides this encoding using bitfields
+///
+enum WasmAddressType { Memory = 0x00, Object = 0x01, Invalid = 0x03 };
+
+struct wasm_addr_t {
+  uint64_t offset : 32;
+  uint64_t module_id : 30;
+  uint64_t type : 2;
+
+  wasm_addr_t(lldb::addr_t addr)...
[truncated]

@paolosevMSFT paolosevMSFT requested review from JDevlieghere and removed request for JDevlieghere January 12, 2024 17:21
@paolosevMSFT
Copy link
Contributor Author

@JDevlieghere I created a new PR and closed #76683 because I made a mess with a git rebase, and since the other PR had just started I thought it made sense to start anew. I apologize for the confusion. Here I have already addressed most of your comments from 76683.

What is left to do is to add tests with GdbRemoteTestCaseBase as you suggested.

I've kept the support for DW_OP_WASM_location in this PR to have it fully working, but I'll be happy to hoist it out in a separate patch with its own dedicated test, if so you prefer.

@xujuntwt95329, from WebAssembly Micro Runtime, has hugely contributed to this work, but I cannot add his name to the reviewers or assignees.

@JDevlieghere
Copy link
Member

@JDevlieghere I created a new PR and closed #76683 because I made a mess with a git rebase, and since the other PR had just started I thought it made sense to start anew. I apologize for the confusion. Here I have already addressed most of your comments from 76683.

No worries, thanks for the quick turnaround!

What is left to do is to add tests with GdbRemoteTestCaseBase as you suggested.

👍

I've kept the support for DW_OP_WASM_location in this PR to have it fully working, but I'll be happy to hoist it out in a separate patch with its own dedicated test, if so you prefer.

I'd vote to break it out, it can stand on its own and needs a test anyway. As an added benefit that probably means it gets merged sooner than this larger patch.

@xujuntwt95329, from WebAssembly Micro Runtime, has hugely contributed to this work, but I cannot add his name to the reviewers or assignees.

Yeah, I believe you must be a member of the project to be listed there. @xujuntwt95329 if reply or interact with the thread they should become a subscriber and get notifications. Maybe they already do given the mention. Regardless, thank you for your contribution!

@paolosevMSFT
Copy link
Contributor Author

@JDevlieghere Thank you!

I will remove the code that handles DW_OP_WASM_location as you suggest.

For what concerns testing, I took a very quick look at GdbRemoteTestCaseBase and if I am not wrong, it works by spawning an inferior processor attaching to an inferior process to debug. But it relies on the fact that it can use lldb-server, which implements the server-side of the gdb-remote protocol.
But here we need to attach to a Wasm engine, and obviously, we cannot implement a Wasm engine as part of the LLDB test code.
Maybe we should implement a process that mocks the gdb-stub logic for a Wasm engine? Is there anything we could reuse in the LLDB test framework?

@JDevlieghere
Copy link
Member

@JDevlieghere Thank you!

I will remove the code that handles DW_OP_WASM_location as you suggest.

Given we don't have a good solution for stacked commits, you don't necessarily need to remove it here, and rebase it once the other PR has been merged.

For what concerns testing, I took a very quick look at GdbRemoteTestCaseBase and if I am not wrong, it works by spawning an inferior processor attaching to an inferior process to debug. But it relies on the fact that it can use lldb-server, which implements the server-side of the gdb-remote protocol. But here we need to attach to a Wasm engine, and obviously, we cannot implement a Wasm engine as part of the LLDB test code. Maybe we should implement a process that mocks the gdb-stub logic for a Wasm engine? Is there anything we could reuse in the LLDB test framework?

You're right that there are tests that work like that and indeed, that wouldn't work for this. If you look for tests that use the GdbRemoteTestCaseBase you can basically spoof the other end of the GDB remote connection. You can have it respond to packets the way your real GDB stub would. You can make it as advanced or simple as you want. There's a bunch of examples in test/API/functionalities/gdb_remote_client.

What I would do is debug a simple program using the real GDB stub and look at the gdb packet log in lldb to give you an idea what the server responds. Then I would try to mimic the debug session in the tests and implement responses from the mock server. You want to keep things as small as possible so you don't want to avoid just copy/pasting full responses (they likely wouldn't work anyway) but the real packets can serve as inspiration.

Comment on lines +28 to +30
/* always use linux signals for wasm process */
m_unix_signals_sp =
UnixSignals::Create(ArchSpec{"wasm32-unknown-unknown-wasm"});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment and the creation of the signals object don't match up to me. How does wasm32-unknown-unknown-wasm yield Linux signals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was made originally by @xujuntwt95329, who can provide more context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WebAssembly doesn't define its own signal system, here we simply use UnixSignals and expect the WebAssembly runtime always use UnixSignals on all platforms.

@DavidSpickett
Copy link
Collaborator

A WebAssembly debugging session can be started using the new command:
wasm [:]

What about wasm requires a new command, given that you are connecting to a GDB server as existing targets do.

@medismailben
Copy link
Member

A WebAssembly debugging session can be started using the new command:
wasm [:]

What about wasm requires a new command, given that you are connecting to a GDB server as existing targets do.

+1! I don't think this needs a new command, it could just use gdb-remote [<hostname>:]<portnum> command.

Copy link

github-actions bot commented Jan 22, 2024

✅ With the latest revision this PR passed the Python code formatter.

@paolosevMSFT
Copy link
Contributor Author

paolosevMSFT commented Jan 22, 2024

  • Looking more into testing I realized that I had already implemented tests for loading Wasm modules in lldb/test/API/functionalities/gdb_remote_client/TestWasm.py.
    So I added a "full Wasm-debugging" test to TestWasm.py, which uses a Wasm module represented in file calc-cpp-o0.yaml, that contains DWARF symbols, and that was generated by compiling a tiny C++ app (calc.cpp) with Emscripten.

  • Created a separate [lldb] Support DW_OP_WASM_location in DWARFExpression #78977 for the changes to support DW_OP_WASM_location in DWARFExpression.

 

  • About:

A WebAssembly debugging session can be started using the new command:
wasm [:]

What about wasm requires a new command, given that you are connecting to a GDB server as existing targets do.

+1! I don't think this needs a new command, it could just use gdb-remote [<hostname>:]<portnum> command.

It is true that a new command is not really necessary. This command also works:
process connect --plugin wasm connect://localhost:8765

But currently a simple gdb-remote [<hostname>:]<portnum> does not enable the "wasm' plugin. Is there a way to detect that we are debugging WebAssembly at connect time?

Copy link

github-actions bot commented Jan 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@mh4ck-Thales
Copy link
Contributor

Seems like support for reading Wasm local and global variables is available in your code, but I don't understand how can one effectively read these variables from the lldb command line. Maybe adding commands to access these can be useful ?

@jimingham
Copy link
Collaborator

jimingham commented Jan 22, 2024 via email

@mh4ck-Thales
Copy link
Contributor

@jimingham sorry if I wasn't clear. frame var and target var are commands for viewing language local and global variables (be it C or another one). They are indeed working with this patch. These commands are using the Wasm linear memory, where most of the data about the program is stored, and is similar to classical binaries memory. This linear memory can be accessed using Wasm memory instructions.

However for the sake of debugging Wasm, I am interested in accessing Wasm local and global variables (these are the variables I meant to address in my first comment). These variables are used, for example, to manage the stack pointer. They are accessed using the Wasm variable instructions.

The current patch allows us to inspect the Wasm linear memory as one would inspect memory in a classic binary. Likewise, the Wasm code memory (containing the actual Wasm bytecode, which in Wasm is separated from the linear memory) is accessible with this patch with an offset of $2^{62}$. I guess this is a workaround that may be explained in this comment. However, I'm not sure this patch provides an access to Wasm local and global variables. I thought it was implemented as I saw mentions of WasmLocal and WasmGlobal, but I may have mixed up with language variables.

If Wasm variables support is indeed implemented in this patch as I first thought, I'm not able to see which lldb command would allow to see them. If not, I think it would be a nice addition to this patch (or another patch in the future) to allow one to debug Wasm instructions and variables directly. A way I can see to implement that would be to use the same workaround used to display the code memory with an offset, and utility commands allowing to easily access variables from this offset (something like wasm local 0 to dump the value of local variable n°0).

@DavidSpickett
Copy link
Collaborator

But currently a simple gdb-remote [:] does not enable the "wasm' plugin. Is there a way to detect that we are debugging WebAssembly at connect time?

Tracing down from platform connect <no plugin option>, I get to Process::FindPlugin, which creates a process of all the registered types then asks that process if it can debug that target.

Though, a lot of them just return true and you already have:

bool ProcessWasm::CanDebug(lldb::TargetSP target_sp,
                           bool plugin_specified_by_name) {

So I wonder if some other plugin is saying "yes I can debug this" before the wasm plugin has been asked. We must register the plugins in some order but not sure where that is done.

@xujuntwt95329
Copy link
Contributor

Thanks @paolosevMSFT for the great work! We have implemented debugger server in WebAssembly Micro Runtime (WAMR) to work with this, and it works well!
image

The experience of debugging dwarf-based WebAssembly modules is similar to the native elf, and with lldb‘s DAP tool, we can easily debug this in a UI friendly environment such as VSCode. This is really exciting and greatly improves the usability of WebAssembly :)

Previously we maintain a huge patch file (which was created from @paolosevMSFT's patches) and tell the users to patch lldb source and build the customized lldb binary. We hope this can be upstream and shipped with official build, then people can enjoy debugging WebAssembly more conveniently

@yowl
Copy link

yowl commented Jan 23, 2024

might be a noobie question, but is compatible with split dwarf ?

@jimingham
Copy link
Collaborator

jimingham commented Jan 23, 2024 via email

@mh4ck-Thales
Copy link
Contributor

I already tried to use register read to access Wasm variables without success. But it was the patch available here is part of WAMR, maybe this patch is different and will make it work. I tried using this patch without success.

It seems like compiling lldb with this patch and use it to debug Wasm with the latest version of WAMR / iwasm do not work correctly (at least for me). lldb can connect to the server embedded into iwasm but doesn't seem to be able to disassemble Wasm bytecode or set breakpoints. Just giving the info here because I'm not sure this patch is supposed to be straightaway compatible with WAMR.

@xujuntwt95329
Copy link
Contributor

I already tried to use register read to access Wasm variables without success. But it was the patch available here is part of WAMR, maybe this patch is different and will make it work. I tried using this patch without success.

It seems like compiling lldb with this patch and use it to debug Wasm with the latest version of WAMR / iwasm do not work correctly (at least for me). lldb can connect to the server embedded into iwasm but doesn't seem to be able to disassemble Wasm bytecode or set breakpoints. Just giving the info here because I'm not sure this patch is supposed to be straightaway compatible with WAMR.

Hi @mh4ck-Thales this is caused by #77949 (comment), currently we need to modify it manually.

And since this PR is not merged, you need to use the [wasm_debug_2024] branch in my forked WAMR repo for testing: https://github.com/xujuntwt95329/wasm-micro-runtime/tree/wasm_debug_2024

@mh4ck-Thales
Copy link
Contributor

Hi @mh4ck-Thales this is caused by #77949 (comment), currently we need to modify it manually.

Thanks! That did the trick for the breakpoint and disassembly problems. When using read register I can only see pc and nothing else though. I'm not sure assimilating Wasm variables to registers is the good way to go anyway, because the number of Wasm variables is not fixed in advance, and subject to the context of execution (with local variables). This is not the case at all for classic CPU registers, and I'm not sure the generic code managing registers in lldb will support that. Using a solution like @jimingham proposed with subcommands of language wasm may be a easier, less bug prone way to implement this.

@xujuntwt95329
Copy link
Contributor

Hi @mh4ck-Thales this is caused by #77949 (comment), currently we need to modify it manually.

Thanks! That did the trick for the breakpoint and disassembly problems. When using read register I can only see pc and nothing else though. I'm not sure assimilating Wasm variables to registers is the good way to go anyway, because the number of Wasm variables is not fixed in advance, and subject to the context of execution (with local variables). This is not the case at all for classic CPU registers, and I'm not sure the generic code managing registers in lldb will support that. Using a solution like @jimingham proposed with subcommands of language wasm may be a easier, less bug prone way to implement this.

Yes, the virtual register concepts is just used internally to allow us to process wasm locals/globals/operand stack values in a plugin without a huge modification to LLDB core part. This allows users to debug their source code, but doesn't support reading specific local/global values.

Currently lldb doesn't have an exist model for representing wasm's local/global/operand stack values, I think encoding them as virtual registers is an easy approach for enabling WebAssembly debugging and without a huge modification to LLDB core part.

On the other hand, as a managed language, I think most people won't need to read specific local/global of an instance, just like most of the JavaScript developer won't care about the value of a register in V8 during debugging, they just care about the value of their variables.

So I think the virtual register approach should be okay for the developer usage, with small engineering effort and limited quality influence to LLDB.

@mh4ck-Thales
Copy link
Contributor

On the other hand, as a managed language, I think most people won't need to read specific local/global of an instance, just like most of the JavaScript developer won't care about the value of a register in V8 during debugging, they just care about the value of their variables.

I agree that the average developer will not have much use of the possibility of accessing Wasm locals and globals. I also understand that Wasm locals and global are a very specific, Wasm-only concept and that adding this concept in LLVM core is not pertinent. However for people working on the inner workings of Wasm (like me), it may be very useful. I'm not sure it belongs to this PR but implementing support for accessing Wasm locals/globals in the future (using perhaps a command like language wasm) seems an interesting thing to do to complete the debugging experience on Wasm with lldb.

What I'm wondering is, does these potential added commands have an impact on this PR ? Or can we imagine commands that will send a specific, Wasm-only request to the debugging server to get the values of the variables without using the implementation of locals and globals of this PR ?

@paolosevMSFT
Copy link
Contributor Author

might be a noobie question, but is compatible with split dwarf ?

Yes, it is :-).

For that to work, it expects that the Wasm module contains a custom section named .external_debug_info that contains the path to another Wasm module that contains the DWARF symbols sections.
Emscripten does split symbols in this way with the -gseparate-dwarf flag.

@paolosevMSFT paolosevMSFT reopened this Feb 2, 2024
@paolosevMSFT
Copy link
Contributor Author

paolosevMSFT commented Feb 2, 2024

But currently a simple gdb-remote [:] does not enable the "wasm' plugin. Is there a way to detect that we are debugging WebAssembly at connect time?

Tracing down from platform connect <no plugin option>, I get to Process::FindPlugin, which creates a process of all the registered types then asks that process if it can debug that target.

Though, a lot of them just return true and you already have:

bool ProcessWasm::CanDebug(lldb::TargetSP target_sp,
                           bool plugin_specified_by_name) {

So I wonder if some other plugin is saying "yes I can debug this" before the wasm plugin has been asked. We must register the plugins in some order but not sure where that is done.

What happens now is that there is a ProcessGDBRemote that registers for gdb-remote and it is returned by Process::FindPlugin. Then there is the new ProcessWasm which might also register for the same gdb-remote maybe.
But the problem is that at the moment when we start the connection we do not know that we are connecting to a Wasm target.

Here, with the following call stack, target_sp->m_arch is still not initialized in Process::FindPlugin:

 	liblldb.dll!lldb_private::Process::FindPlugin(std::shared_ptr<lldb_private::Target> target_sp, llvm::StringRef plugin_name, std::shared_ptr<lldb_private::Listener> listener_sp, const lldb_private::FileSpec * crash_file_path, bool can_connect) Line 382	C++
 	liblldb.dll!lldb_private::Target::CreateProcess(std::shared_ptr<lldb_private::Listener> listener_sp, llvm::StringRef plugin_name, const lldb_private::FileSpec * crash_file, bool can_connect) Line 215	C++
 	liblldb.dll!lldb_private::Platform::DoConnectProcess(llvm::StringRef connect_url, llvm::StringRef plugin_name, lldb_private::Debugger & debugger, lldb_private::Stream * stream, lldb_private::Target * target, lldb_private::Status & error) Line 1948	C++

This is why I introduced a new command wasm.

@paolosevMSFT
Copy link
Contributor Author

I see, thanks for the clarification. In the patch, the WasmLocal and WasmGlobal calls are done in ReadRegister, so it seems like those are being presented as register values? register read should show them. BTW, we shouldn't make a top level wasm command, we really try hard not to occupy more of the "easy to type" parts of the lldb command set than we can help, so there is lots left free for users to customize. There's a language cplusplus command (and on the swift fork language swift, so it would make sense to have wasm commands be vended as language wasm. Then people who do a lot of WebAssembly debugging can make an alias to wasm if that helps. Jim

However, language plugins do not seem to work for the Wasm case here. When we are debugging Wasm code, WebAssembly is the architecture and the source language is the whatever was compiled to Wasm, commonly C/C++ or Rust.

If we implement a WasmLanguage plugin we should override virtual methods like bool IsTopLevelFunction(Function &function), bool IsSourceFile(StringRef &file_path) GetFormatters()` which do not really apply here.

But I agree that it is not nice to make a top level wasm command, so maybe the best way is to leave everything as it is and to just specify the plugin name at connection time:
process connect --plugin wasm connect://localhost:<PORT>

@paolosevMSFT
Copy link
Contributor Author

paolosevMSFT commented Feb 2, 2024

Seems like support for reading Wasm local and global variables is available in your code, but I don't understand how can one effectively read these variables from the lldb command line. Maybe adding commands to access these can be useful ?

The idea for this patch was only to support source-level debugging of code compiled to Wasm.
I do agree that it might also be useful to add better low-level Wasm debugging functionalities, like the ability of examining Wasm locals, globals, but this is beyond the scope of this initial patch and can certainly be added late, to be on par with Chromium DevTools, which offer this functionality when debugging Wasm code.
With this patch, it should already be possible to examine the value of Wasm memories and to show Wasm disassembly code.

When using read register I can only see pc and nothing else though. I'm not sure assimilating Wasm variables to registers is the good way to go anyway, because the number of Wasm variables is not fixed in advance, and subject to the context of execution (with local variables). This is not the case at all for classic CPU registers, and I'm not sure the generic code managing registers in lldb will support that.

As Jun says, we are using registers here just a nice technique, devised by @xujuntwt95329, to minimize Wasm-specific changes to LLDB Core classes. The idea was not really to make these registers available to the user, even though even this can certainly be improved in the future.

@paolosevMSFT
Copy link
Contributor Author

paolosevMSFT commented Feb 2, 2024

The logic to manage DW_OP_WASM_location in DWARFExpression is moved to a separate PR: #78977, which should be reviewed (and hopefully approved :-)) before we can complete work on this PR.

Please, let me and Jun know what our next steps should be to move this forward. Thanks! :)

@DavidSpickett
Copy link
Collaborator

What happens now is that there is a ProcessGDBRemote that registers for gdb-remote and it is returned by Process::FindPlugin. > Then there is the new ProcessWasm which might also register for the same gdb-remote maybe.
But the problem is that at the moment when we start the connection we do not know that we are connecting to a Wasm target.

Here, with the following call stack, target_sp->m_arch is still not initialized in Process::FindPlugin:

This feels like a bug or at least a design flaw in lldb then. ABI plugins for example clearly need to know the architecture and they work with remotes just fine, by reading the architecture from the object file (somehow).

But I'm not expecting you to address or even find that here, I would just prefer that this went ahead without the wasm top level command. If you still want to add that it can be a follow up where we can focus on that.

When using read register I can only see pc and nothing else though. I'm not sure assimilating Wasm variables to registers is > the good way to go anyway, because the number of Wasm variables is not fixed in advance, and subject to the context of > execution (with local variables). This is not the case at all for classic CPU registers, and I'm not sure the generic code managing > registers in lldb will support that.

We have some registers on Linux that could be considered part of the kernel interface not the hardware, so there is precedent for these "pseudo" registers. Examples are the memory tagging and scalable matrix control registers. However those features are enabled at kernel boot, so they don't come and go at runtime. Scalable vectors (SVE) can change size at runtime but they don't come and go either.

The one register that does that is the scalable matrix array register, which can be disabled by software. But, and this should tell you something, both I and my colleagues on GDB decided to just show it as all 0s when it's disabled. Instead of re-configuring the registers on every stop. It becomes hard to keep track when a register in the middle of the context disappears.

So I don't disagree with the concept of showing them as registers. They could be compared to the Linux kernel pseudo registers we have. However it will be more difficult than those registers, due to the dynamism. So if you're going to do it, please open a PR focused on that.

@xujuntwt95329
Copy link
Contributor

On the other hand, as a managed language, I think most people won't need to read specific local/global of an instance, just like most of the JavaScript developer won't care about the value of a register in V8 during debugging, they just care about the value of their variables.

I agree that the average developer will not have much use of the possibility of accessing Wasm locals and globals. I also understand that Wasm locals and global are a very specific, Wasm-only concept and that adding this concept in LLVM core is not pertinent. However for people working on the inner workings of Wasm (like me), it may be very useful. I'm not sure it belongs to this PR but implementing support for accessing Wasm locals/globals in the future (using perhaps a command like language wasm) seems an interesting thing to do to complete the debugging experience on Wasm with lldb.

What I'm wondering is, does these potential added commands have an impact on this PR ? Or can we imagine commands that will send a specific, Wasm-only request to the debugging server to get the values of the variables without using the implementation of locals and globals of this PR ?

I agree that implementing support for accessing Wasm locals/globals would be a very interesting and exciting thing, and as a WebAssembly runtime developer, such features will be extremely useful to me (currently I need to debug the runtime itself to retrieve the internal data structures to find the value of locals/globals). But since there isn't an existing concepts in lldb to support these, we think it should be better to focus this PR on the source level debugging support to control the complexity, and leave the local/global support as further enhancement.

What I'm wondering is, does these potential added commands have an impact on this PR ? Or can we imagine commands that will send a specific, Wasm-only request to the debugging server to get the values of the variables without using the implementation of locals and globals of this PR ?

I think a wasm-specific command which directly send message to debug server for getting local/global values would be a simple approach to support the nice features, maybe we can consider such solution in the future to improve the debugging experience.

@xujuntwt95329
Copy link
Contributor

I see, thanks for the clarification. In the patch, the WasmLocal and WasmGlobal calls are done in ReadRegister, so it seems like those are being presented as register values? register read should show them. BTW, we shouldn't make a top level wasm command, we really try hard not to occupy more of the "easy to type" parts of the lldb command set than we can help, so there is lots left free for users to customize. There's a language cplusplus command (and on the swift fork language swift, so it would make sense to have wasm commands be vended as language wasm. Then people who do a lot of WebAssembly debugging can make an alias to wasm if that helps. Jim

However, language plugins do not seem to work for the Wasm case here. When we are debugging Wasm code, WebAssembly is the architecture and the source language is the whatever was compiled to Wasm, commonly C/C++ or Rust.

If we implement a WasmLanguage plugin we should override virtual methods like bool IsTopLevelFunction(Function &function), bool IsSourceFile(StringRef &file_path) GetFormatters()` which do not really apply here.

But I agree that it is not nice to make a top level wasm command, so maybe the best way is to leave everything as it is and to just specify the plugin name at connection time: process connect --plugin wasm connect://localhost:<PORT>

Yes, the top level wasm command may be useful to WebAssembly users but may cause confusion to other users who doesn't use WebAssembly (Just like their isn't a top level arm command, wasm here should be conceptually equivalent with arm). Maybe we can remove this command from this PR, and users can add their own alias for convenience.

@paolosevMSFT
Copy link
Contributor Author

Removed the 'wasm' command and addressed all the other review comments.

@xwang98
Copy link

xwang98 commented Feb 21, 2024

There is a strong need for this capability of LLDB from the WebAssembly Micro Runtime community. It will be awesome to have it in the LLDB.

lldb/source/Expression/DWARFExpression.cpp Outdated Show resolved Hide resolved
/// Some targets might use bits in a code address to represent additional
/// information; for example WebAssembly targets have a different memory space
/// per module and have a different address space per memory and code.
virtual lldb::addr_t FixMemoryAddress(lldb::addr_t address,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above we have FixCodeAddres() and FixDataAddress(). Will this function fix any kind of address, or just a code address? If it is just code, then maybe we can just modify the above FixCodeAddress(...) and add the StackFrame to that API which can default to nullptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought to add a new function because Process::FixCodeAddress, Process::FixDataAddress, Process::FixAnyAddress by default forward the call to ABI::FixCodeAddress, ABI::FixDataAddress, ABI::FixAnyAddress, which are implemented by plugin-specific ABIs like ABISysV_arm, ABIMacOSX_arm. But here we are not really dealing with a different ABI, it is more a custom representation of Wasm memory addresses used between the debugger and the Wasm engine.

If you prefer, we could reuse Process::FixAnyAddress, making it virtual and overriding it in ProcessWasm.

lldb/source/Expression/DWARFExpression.cpp Outdated Show resolved Hide resolved
lldb/source/Expression/DWARFExpression.cpp Outdated Show resolved Hide resolved
lldb/source/Expression/DWARFExpression.cpp Outdated Show resolved Hide resolved
lldb/source/Plugins/Process/wasm/UnwindWasm.cpp Outdated Show resolved Hide resolved
lldb/source/Plugins/Process/wasm/wasmRegisterContext.cpp Outdated Show resolved Hide resolved
lldb/source/Plugins/Process/wasm/wasmRegisterContext.cpp Outdated Show resolved Hide resolved
lldb/source/Plugins/Process/wasm/wasmRegisterContext.cpp Outdated Show resolved Hide resolved
lldb/source/Plugins/Process/wasm/wasmRegisterContext.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.