-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb] Implement WebAssembly debugging #77949
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-lldb Author: Paolo Severini (paolosevMSFT) ChangesAdd support for source-level debugging of WebAssembly code that runs in a WebAssembly engine. The idea is to use the GDB-remote protocol to connect to a Wasm engine that implements a GDB-remote stub that offers the ability to access the engine runtime internal state. This protocol is already supported by the WebAssembly Micro Runtime and the WAMR team greatly contributed to this work. A WebAssembly debugging session can be started using the new command: The implementation is almost completely contained in Plugin classes and the only Wasm-specific changes to Core classes is in Patch is 38.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77949.diff 16 Files Affected:
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index 24c599e044c78f..587ae085b479b7 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -1548,6 +1548,46 @@ class Process : public std::enable_shared_from_this<Process>,
virtual size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
Status &error);
+ /// Read of memory from a process.
+ ///
+ /// This function will read memory from the current process's address space
+ /// and remove any traps that may have been inserted into the memory.
+ ///
+ /// This overloads accepts an ExecutionContext as additional argument. By
+ /// default, it calls the previous overload without the ExecutionContext
+ /// argument, but it can be overridden by Process subclasses.
+ ///
+ /// \param[in] vm_addr
+ /// A virtual load address that indicates where to start reading
+ /// memory from.
+ ///
+ /// \param[out] buf
+ /// A byte buffer that is at least \a size bytes long that
+ /// will receive the memory bytes.
+ ///
+ /// \param[in] size
+ /// The number of bytes to read.
+ ///
+ /// \param[in] exe_ctx
+ /// The current execution context, if available.
+ ///
+ /// \param[out] error
+ /// An error that indicates the success or failure of this
+ /// operation. If error indicates success (error.Success()),
+ /// then the value returned can be trusted, otherwise zero
+ /// will be returned.
+ ///
+ /// \return
+ /// The number of bytes that were actually read into \a buf. If
+ /// the returned number is greater than zero, yet less than \a
+ /// size, then this function will get called again with \a
+ /// vm_addr, \a buf, and \a size updated appropriately. Zero is
+ /// returned in the case of an error.
+ virtual size_t ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+ ExecutionContext *exe_ctx, Status &error) {
+ return ReadMemory(vm_addr, buf, size, error);
+ }
+
/// Read of memory from a process.
///
/// This function has the same semantics of ReadMemory except that it
diff --git a/lldb/source/Core/Value.cpp b/lldb/source/Core/Value.cpp
index 995cc934c82044..47a5fdee773886 100644
--- a/lldb/source/Core/Value.cpp
+++ b/lldb/source/Core/Value.cpp
@@ -552,7 +552,7 @@ Status Value::GetValueAsData(ExecutionContext *exe_ctx, DataExtractor &data,
if (process) {
const size_t bytes_read =
- process->ReadMemory(address, dst, byte_size, error);
+ process->ReadMemory(address, dst, byte_size, exe_ctx, error);
if (bytes_read != byte_size)
error.SetErrorStringWithFormat(
"read memory from 0x%" PRIx64 " failed (%u of %u bytes read)",
diff --git a/lldb/source/Expression/DWARFExpression.cpp b/lldb/source/Expression/DWARFExpression.cpp
index fe4928d4f43a43..ca24611724dc7c 100644
--- a/lldb/source/Expression/DWARFExpression.cpp
+++ b/lldb/source/Expression/DWARFExpression.cpp
@@ -346,6 +346,16 @@ static offset_t GetOpcodeDataSize(const DataExtractor &data,
return (offset - data_offset) + subexpr_len;
}
+ case DW_OP_WASM_location: {
+ uint8_t wasm_op = data.GetU8(&offset);
+ if (wasm_op == 3) {
+ data.GetU32(&offset);
+ } else {
+ data.GetULEB128(&offset);
+ }
+ return offset - data_offset;
+ }
+
default:
if (!dwarf_cu) {
return LLDB_INVALID_OFFSET;
@@ -2595,6 +2605,37 @@ bool DWARFExpression::Evaluate(
break;
}
+ case DW_OP_WASM_location: {
+ uint8_t wasm_op = opcodes.GetU8(&offset);
+ uint32_t index;
+
+ /* LLDB doesn't have an address space to represents WebAssembly locals,
+ * globals and operand stacks.
+ * We encode these elements into virtual registers:
+ * | tag: 2 bits | index: 30 bits |
+ * where tag is:
+ * 0: Not a WebAssembly location
+ * 1: Local
+ * 2: Global
+ * 3: Operand stack value
+ */
+ if (wasm_op == 3) {
+ index = opcodes.GetU32(&offset);
+ wasm_op = 1;
+ } else {
+ index = opcodes.GetULEB128(&offset);
+ }
+
+ reg_num = (((wasm_op + 1) & 0x03) << 30) | (index & 0x3fffffff);
+
+ if (ReadRegisterValueAsScalar(reg_ctx, reg_kind, reg_num, error_ptr, tmp))
+ stack.push_back(tmp);
+ else
+ return false;
+
+ break;
+ }
+
default:
if (dwarf_cu) {
if (dwarf_cu->GetSymbolFileDWARF().ParseVendorDWARFOpcode(
diff --git a/lldb/source/Interpreter/CommandInterpreter.cpp b/lldb/source/Interpreter/CommandInterpreter.cpp
index 00651df48b6224..bcacc7aabb66ca 100644
--- a/lldb/source/Interpreter/CommandInterpreter.cpp
+++ b/lldb/source/Interpreter/CommandInterpreter.cpp
@@ -794,6 +794,24 @@ void CommandInterpreter::LoadCommandDictionary() {
}
}
+ std::unique_ptr<CommandObjectRegexCommand> connect_wasm_cmd_up(
+ new CommandObjectRegexCommand(
+ *this, "wasm",
+ "Connect to a WebAssembly process via remote GDB server. "
+ "If no host is specifed, localhost is assumed.",
+ "wasm [<hostname>:]<portnum>", 0, false));
+ if (connect_wasm_cmd_up) {
+ if (connect_wasm_cmd_up->AddRegexCommand(
+ "^([^:]+|\\[[0-9a-fA-F:]+.*\\]):([0-9]+)$",
+ "process connect --plugin wasm connect://%1:%2") &&
+ connect_wasm_cmd_up->AddRegexCommand(
+ "^([[:digit:]]+)$",
+ "process connect --plugin wasm connect://localhost:%1")) {
+ CommandObjectSP command_sp(connect_wasm_cmd_up.release());
+ m_command_dict[std::string(command_sp->GetCommandName())] = command_sp;
+ }
+ }
+
std::unique_ptr<CommandObjectRegexCommand> connect_kdp_remote_cmd_up(
new CommandObjectRegexCommand(
*this, "kdp-remote",
diff --git a/lldb/source/Plugins/Process/CMakeLists.txt b/lldb/source/Plugins/Process/CMakeLists.txt
index a51d0f7afd1759..be109a303e8669 100644
--- a/lldb/source/Plugins/Process/CMakeLists.txt
+++ b/lldb/source/Plugins/Process/CMakeLists.txt
@@ -19,3 +19,4 @@ add_subdirectory(elf-core)
add_subdirectory(mach-core)
add_subdirectory(minidump)
add_subdirectory(FreeBSDKernel)
+add_subdirectory(wasm)
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
index 316be471df9295..674bbc9ff4fd0b 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
@@ -1628,6 +1628,11 @@ void ProcessGDBRemote::ParseExpeditedRegisters(
}
}
+std::shared_ptr<ThreadGDBRemote>
+ProcessGDBRemote::CreateThread(lldb::tid_t tid) {
+ return std::make_shared<ThreadGDBRemote>(*this, tid);
+}
+
ThreadSP ProcessGDBRemote::SetThreadStopInfo(
lldb::tid_t tid, ExpeditedRegisterMap &expedited_register_map,
uint8_t signo, const std::string &thread_name, const std::string &reason,
@@ -1652,7 +1657,7 @@ ThreadSP ProcessGDBRemote::SetThreadStopInfo(
if (!thread_sp) {
// Create the thread if we need to
- thread_sp = std::make_shared<ThreadGDBRemote>(*this, tid);
+ thread_sp = CreateThread(tid);
m_thread_list_real.AddThread(thread_sp);
}
}
diff --git a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
index c1ea1cc7905587..0463e39b7a63a4 100644
--- a/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
+++ b/lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.h
@@ -355,6 +355,8 @@ class ProcessGDBRemote : public Process,
MonitorDebugserverProcess(std::weak_ptr<ProcessGDBRemote> process_wp,
lldb::pid_t pid, int signo, int exit_status);
+ virtual std::shared_ptr<ThreadGDBRemote> CreateThread(lldb::tid_t tid);
+
lldb::StateType SetThreadStopInfo(StringExtractor &stop_packet);
bool
diff --git a/lldb/source/Plugins/Process/wasm/CMakeLists.txt b/lldb/source/Plugins/Process/wasm/CMakeLists.txt
new file mode 100644
index 00000000000000..ef2bfc634962d5
--- /dev/null
+++ b/lldb/source/Plugins/Process/wasm/CMakeLists.txt
@@ -0,0 +1,12 @@
+add_lldb_library(lldbPluginProcessWasm PLUGIN
+ ProcessWasm.cpp
+ ThreadWasm.cpp
+ UnwindWasm.cpp
+ wasmRegisterContext.cpp
+
+ LINK_LIBS
+ lldbCore
+ ${LLDB_PLUGINS}
+ LINK_COMPONENTS
+ Support
+ )
diff --git a/lldb/source/Plugins/Process/wasm/ProcessWasm.cpp b/lldb/source/Plugins/Process/wasm/ProcessWasm.cpp
new file mode 100644
index 00000000000000..6e633e69b9767a
--- /dev/null
+++ b/lldb/source/Plugins/Process/wasm/ProcessWasm.cpp
@@ -0,0 +1,291 @@
+//===-- ProcessWasm.cpp ---------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ProcessWasm.h"
+#include "ThreadWasm.h"
+#include "lldb/Core/Module.h"
+#include "lldb/Core/PluginManager.h"
+#include "lldb/Utility/DataBufferHeap.h"
+
+#include "lldb/Target/UnixSignals.h"
+#include "llvm/ADT/ArrayRef.h"
+
+using namespace lldb;
+using namespace lldb_private;
+using namespace lldb_private::process_gdb_remote;
+using namespace lldb_private::wasm;
+
+LLDB_PLUGIN_DEFINE(ProcessWasm)
+
+// ProcessGDBRemote constructor
+ProcessWasm::ProcessWasm(lldb::TargetSP target_sp, ListenerSP listener_sp)
+ : ProcessGDBRemote(target_sp, listener_sp) {
+ /* always use linux signals for wasm process */
+ m_unix_signals_sp =
+ UnixSignals::Create(ArchSpec{"wasm32-unknown-unknown-wasm"});
+}
+
+void ProcessWasm::Initialize() {
+ static llvm::once_flag g_once_flag;
+
+ llvm::call_once(g_once_flag, []() {
+ PluginManager::RegisterPlugin(GetPluginNameStatic(),
+ GetPluginDescriptionStatic(), CreateInstance,
+ DebuggerInitialize);
+ });
+}
+
+void ProcessWasm::DebuggerInitialize(Debugger &debugger) {
+ ProcessGDBRemote::DebuggerInitialize(debugger);
+}
+
+llvm::StringRef ProcessWasm::GetPluginName() { return GetPluginNameStatic(); }
+
+llvm::StringRef ProcessWasm::GetPluginNameStatic() {
+ static ConstString g_name("wasm");
+ return g_name;
+}
+
+llvm::StringRef ProcessWasm::GetPluginDescriptionStatic() {
+ return "GDB Remote protocol based WebAssembly debugging plug-in.";
+}
+
+void ProcessWasm::Terminate() {
+ PluginManager::UnregisterPlugin(ProcessWasm::CreateInstance);
+}
+
+lldb::ProcessSP ProcessWasm::CreateInstance(lldb::TargetSP target_sp,
+ ListenerSP listener_sp,
+ const FileSpec *crash_file_path,
+ bool can_connect) {
+ lldb::ProcessSP process_sp;
+ if (crash_file_path == nullptr)
+ process_sp = std::make_shared<ProcessWasm>(target_sp, listener_sp);
+ return process_sp;
+}
+
+bool ProcessWasm::CanDebug(lldb::TargetSP target_sp,
+ bool plugin_specified_by_name) {
+ if (plugin_specified_by_name)
+ return true;
+
+ Module *exe_module = target_sp->GetExecutableModulePointer();
+ if (exe_module) {
+ ObjectFile *exe_objfile = exe_module->GetObjectFile();
+ return exe_objfile->GetArchitecture().GetMachine() == llvm::Triple::wasm32;
+ }
+ // However, if there is no wasm module, we return false, otherwise,
+ // we might use ProcessWasm to attach gdb remote.
+ return false;
+}
+
+std::shared_ptr<ThreadGDBRemote> ProcessWasm::CreateThread(lldb::tid_t tid) {
+ return std::make_shared<ThreadWasm>(*this, tid);
+}
+
+size_t ProcessWasm::ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+ Status &error) {
+ wasm_addr_t wasm_addr(vm_addr);
+
+ switch (wasm_addr.GetType()) {
+ case WasmAddressType::Memory:
+ case WasmAddressType::Object:
+ return ProcessGDBRemote::ReadMemory(vm_addr, buf, size, error);
+ case WasmAddressType::Invalid:
+ default:
+ error.SetErrorStringWithFormat(
+ "Wasm read failed for invalid address 0x%" PRIx64, vm_addr);
+ return 0;
+ }
+}
+
+size_t ProcessWasm::ReadMemory(lldb::addr_t vm_addr, void *buf, size_t size,
+ ExecutionContext *exe_ctx, Status &error) {
+ wasm_addr_t wasm_addr(vm_addr);
+
+ switch (wasm_addr.GetType()) {
+ case WasmAddressType::Memory: {
+ // If we don't have a valid module_id, this is actually a read from the
+ // Wasm memory space. We can calculate the module_id from the execution
+ // context.
+ if (wasm_addr.module_id == 0 && exe_ctx != nullptr) {
+ StackFrame *frame = exe_ctx->GetFramePtr();
+ assert(frame->CalculateTarget()->GetArchitecture().GetMachine() ==
+ llvm::Triple::wasm32);
+ wasm_addr.module_id = wasm_addr_t(frame->GetStackID().GetPC()).module_id;
+ wasm_addr.type = WasmAddressType::Memory;
+ }
+ if (WasmReadMemory(wasm_addr.module_id, wasm_addr.offset, buf, size))
+ return size;
+ error.SetErrorStringWithFormat("Wasm memory read failed for 0x%" PRIx64,
+ vm_addr);
+ return 0;
+ }
+ case WasmAddressType::Object:
+ return ProcessGDBRemote::ReadMemory(vm_addr, buf, size, error);
+ case WasmAddressType::Invalid:
+ default:
+ error.SetErrorStringWithFormat(
+ "Wasm read failed for invalid address 0x%" PRIx64, vm_addr);
+ return 0;
+ }
+}
+
+size_t ProcessWasm::WasmReadMemory(uint32_t wasm_module_id, lldb::addr_t addr,
+ void *buf, size_t buffer_size) {
+ char packet[64];
+ int packet_len =
+ ::snprintf(packet, sizeof(packet), "qWasmMem:%d;%" PRIx64 ";%" PRIx64,
+ wasm_module_id, static_cast<uint64_t>(addr),
+ static_cast<uint64_t>(buffer_size));
+ assert(packet_len + 1 < (int)sizeof(packet));
+ UNUSED_IF_ASSERT_DISABLED(packet_len);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet, response,
+ GetInterruptTimeout()) ==
+ GDBRemoteCommunication::PacketResult::Success) {
+ if (response.IsNormalResponse()) {
+ return response.GetHexBytes(llvm::MutableArrayRef<uint8_t>(
+ static_cast<uint8_t *>(buf), buffer_size),
+ '\xdd');
+ }
+ }
+ return 0;
+}
+
+size_t ProcessWasm::WasmReadData(uint32_t wasm_module_id, lldb::addr_t addr,
+ void *buf, size_t buffer_size) {
+ char packet[64];
+ int packet_len =
+ ::snprintf(packet, sizeof(packet), "qWasmData:%d;%" PRIx64 ";%" PRIx64,
+ wasm_module_id, static_cast<uint64_t>(addr),
+ static_cast<uint64_t>(buffer_size));
+ assert(packet_len + 1 < (int)sizeof(packet));
+ UNUSED_IF_ASSERT_DISABLED(packet_len);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet, response,
+ GetInterruptTimeout()) ==
+ GDBRemoteCommunication::PacketResult::Success) {
+ if (response.IsNormalResponse()) {
+ return response.GetHexBytes(llvm::MutableArrayRef<uint8_t>(
+ static_cast<uint8_t *>(buf), buffer_size),
+ '\xdd');
+ }
+ }
+ return 0;
+}
+
+bool ProcessWasm::GetWasmLocal(int frame_index, int index, void *buf,
+ size_t buffer_size, size_t &size) {
+ StreamString packet;
+ packet.Printf("qWasmLocal:");
+ packet.Printf("%d;%d", frame_index, index);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+ GDBRemoteCommunication::PacketResult::Success) {
+ return false;
+ }
+
+ if (!response.IsNormalResponse()) {
+ return false;
+ }
+
+ WritableDataBufferSP buffer_sp(
+ new DataBufferHeap(response.GetStringRef().size() / 2, 0));
+ response.GetHexBytes(buffer_sp->GetData(), '\xcc');
+ size = buffer_sp->GetByteSize();
+ if (size <= buffer_size) {
+ memcpy(buf, buffer_sp->GetBytes(), size);
+ return true;
+ }
+
+ return false;
+}
+
+bool ProcessWasm::GetWasmGlobal(int frame_index, int index, void *buf,
+ size_t buffer_size, size_t &size) {
+ StreamString packet;
+ packet.PutCString("qWasmGlobal:");
+ packet.Printf("%d;%d", frame_index, index);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+ GDBRemoteCommunication::PacketResult::Success) {
+ return false;
+ }
+
+ if (!response.IsNormalResponse()) {
+ return false;
+ }
+
+ WritableDataBufferSP buffer_sp(
+ new DataBufferHeap(response.GetStringRef().size() / 2, 0));
+ response.GetHexBytes(buffer_sp->GetData(), '\xcc');
+ size = buffer_sp->GetByteSize();
+ if (size <= buffer_size) {
+ memcpy(buf, buffer_sp->GetBytes(), size);
+ return true;
+ }
+
+ return false;
+}
+
+bool ProcessWasm::GetWasmStackValue(int frame_index, int index, void *buf,
+ size_t buffer_size, size_t &size) {
+ StreamString packet;
+ packet.PutCString("qWasmStackValue:");
+ packet.Printf("%d;%d", frame_index, index);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+ GDBRemoteCommunication::PacketResult::Success) {
+ return false;
+ }
+
+ if (!response.IsNormalResponse()) {
+ return false;
+ }
+
+ WritableDataBufferSP buffer_sp(
+ new DataBufferHeap(response.GetStringRef().size() / 2, 0));
+ response.GetHexBytes(buffer_sp->GetData(), '\xcc');
+ size = buffer_sp->GetByteSize();
+ if (size <= buffer_size) {
+ memcpy(buf, buffer_sp->GetBytes(), size);
+ return true;
+ }
+
+ return false;
+}
+
+bool ProcessWasm::GetWasmCallStack(lldb::tid_t tid,
+ std::vector<lldb::addr_t> &call_stack_pcs) {
+ call_stack_pcs.clear();
+ StreamString packet;
+ packet.Printf("qWasmCallStack:");
+ packet.Printf("%llx", tid);
+ StringExtractorGDBRemote response;
+ if (m_gdb_comm.SendPacketAndWaitForResponse(packet.GetString(), response) !=
+ GDBRemoteCommunication::PacketResult::Success) {
+ return false;
+ }
+
+ if (!response.IsNormalResponse()) {
+ return false;
+ }
+
+ addr_t buf[1024 / sizeof(addr_t)];
+ size_t bytes = response.GetHexBytes(
+ llvm::MutableArrayRef<uint8_t>((uint8_t *)buf, sizeof(buf)), '\xdd');
+ if (bytes == 0) {
+ return false;
+ }
+
+ for (size_t i = 0; i < bytes / sizeof(addr_t); i++) {
+ call_stack_pcs.push_back(buf[i]);
+ }
+ return true;
+}
diff --git a/lldb/source/Plugins/Process/wasm/ProcessWasm.h b/lldb/source/Plugins/Process/wasm/ProcessWasm.h
new file mode 100644
index 00000000000000..fc9736ec0684d2
--- /dev/null
+++ b/lldb/source/Plugins/Process/wasm/ProcessWasm.h
@@ -0,0 +1,129 @@
+//===-- ProcessWasm.h -------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+#define LLDB_SOURCE_PLUGINS_PROCESS_WASM_PROCESSWASM_H
+
+#include "Plugins/Process/gdb-remote/ProcessGDBRemote.h"
+#include "lldb/Target/RegisterContext.h"
+
+namespace lldb_private {
+namespace wasm {
+
+/// Each WebAssembly module has separated address spaces for Code and Memory.
+/// A WebAssembly module also has a Data section which, when the module is
+/// loaded, gets mapped into a region in the module Memory.
+/// For the purpose of debugging, we can represent all these separated 32-bit
+/// address spaces with a single virtual 64-bit address space.
+///
+/// Struct wasm_addr_t provides this encoding using bitfields
+///
+enum WasmAddressType { Memory = 0x00, Object = 0x01, Invalid = 0x03 };
+
+struct wasm_addr_t {
+ uint64_t offset : 32;
+ uint64_t module_id : 30;
+ uint64_t type : 2;
+
+ wasm_addr_t(lldb::addr_t addr)...
[truncated]
|
@JDevlieghere I created a new PR and closed #76683 because I made a mess with a git rebase, and since the other PR had just started I thought it made sense to start anew. I apologize for the confusion. Here I have already addressed most of your comments from 76683. What is left to do is to add tests with I've kept the support for @xujuntwt95329, from WebAssembly Micro Runtime, has hugely contributed to this work, but I cannot add his name to the reviewers or assignees. |
No worries, thanks for the quick turnaround!
👍
I'd vote to break it out, it can stand on its own and needs a test anyway. As an added benefit that probably means it gets merged sooner than this larger patch.
Yeah, I believe you must be a member of the project to be listed there. @xujuntwt95329 if reply or interact with the thread they should become a subscriber and get notifications. Maybe they already do given the mention. Regardless, thank you for your contribution! |
@JDevlieghere Thank you! I will remove the code that handles DW_OP_WASM_location as you suggest. For what concerns testing, I took a very quick look at |
Given we don't have a good solution for stacked commits, you don't necessarily need to remove it here, and rebase it once the other PR has been merged.
You're right that there are tests that work like that and indeed, that wouldn't work for this. If you look for tests that use the What I would do is debug a simple program using the real GDB stub and look at the gdb packet log in lldb to give you an idea what the server responds. Then I would try to mimic the debug session in the tests and implement responses from the mock server. You want to keep things as small as possible so you don't want to avoid just copy/pasting full responses (they likely wouldn't work anyway) but the real packets can serve as inspiration. |
/* always use linux signals for wasm process */ | ||
m_unix_signals_sp = | ||
UnixSignals::Create(ArchSpec{"wasm32-unknown-unknown-wasm"}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment and the creation of the signals object don't match up to me. How does wasm32-unknown-unknown-wasm
yield Linux signals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was made originally by @xujuntwt95329, who can provide more context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WebAssembly doesn't define its own signal system, here we simply use UnixSignals and expect the WebAssembly runtime always use UnixSignals on all platforms.
What about wasm requires a new command, given that you are connecting to a GDB server as existing targets do. |
+1! I don't think this needs a new command, it could just use |
✅ With the latest revision this PR passed the Python code formatter. |
It is true that a new command is not really necessary. This command also works: But currently a simple |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Seems like support for reading Wasm local and global variables is available in your code, but I don't understand how can one effectively read these variables from the lldb command line. Maybe adding commands to access these can be useful ? |
Not sure what you mean by this. "frame var" and "target var" are the lldb command line commands for viewing local and global variables respectively. Why are these not appropriate for Wasm?
Jim
… On Jan 22, 2024, at 6:37 AM, Quentin Michaud ***@***.***> wrote:
Seems like support for reading Wasm local and global variables is available in your code, but I don't understand how can one effectively read these variables from the lldb command line. Maybe adding commands to access these can be useful ?
—
Reply to this email directly, view it on GitHub <#77949 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADUPVW4PNRYUFSHUC4GPP4DYPZ2RDAVCNFSM6AAAAABBYOS6L2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMBUGEZTOMRQGE>.
You are receiving this because you are on a team that was mentioned.
|
@jimingham sorry if I wasn't clear. However for the sake of debugging Wasm, I am interested in accessing Wasm local and global variables (these are the variables I meant to address in my first comment). These variables are used, for example, to manage the stack pointer. They are accessed using the Wasm variable instructions. The current patch allows us to inspect the Wasm linear memory as one would inspect memory in a classic binary. Likewise, the Wasm code memory (containing the actual Wasm bytecode, which in Wasm is separated from the linear memory) is accessible with this patch with an offset of If Wasm variables support is indeed implemented in this patch as I first thought, I'm not able to see which lldb command would allow to see them. If not, I think it would be a nice addition to this patch (or another patch in the future) to allow one to debug Wasm instructions and variables directly. A way I can see to implement that would be to use the same workaround used to display the code memory with an offset, and utility commands allowing to easily access variables from this offset (something like |
Tracing down from Though, a lot of them just return true and you already have:
So I wonder if some other plugin is saying "yes I can debug this" before the wasm plugin has been asked. We must register the plugins in some order but not sure where that is done. |
Thanks @paolosevMSFT for the great work! We have implemented debugger server in WebAssembly Micro Runtime (WAMR) to work with this, and it works well! The experience of debugging dwarf-based WebAssembly modules is similar to the native elf, and with lldb‘s DAP tool, we can easily debug this in a UI friendly environment such as VSCode. This is really exciting and greatly improves the usability of WebAssembly :) Previously we maintain a huge patch file (which was created from @paolosevMSFT's patches) and tell the users to patch lldb source and build the customized lldb binary. We hope this can be upstream and shipped with official build, then people can enjoy debugging WebAssembly more conveniently |
might be a noobie question, but is compatible with split dwarf ? |
I see, thanks for the clarification.
In the patch, the WasmLocal and WasmGlobal calls are done in ReadRegister, so it seems like those are being presented as register values? `register read` should show them.
BTW, we shouldn't make a top level `wasm` command, we really try hard not to occupy more of the "easy to type" parts of the lldb command set than we can help, so there is lots left free for users to customize. There's a `language cplusplus` command (and on the swift fork `language swift`, so it would make sense to have wasm commands be vended as `language wasm`. Then people who do a lot of WebAssembly debugging can make an alias to wasm if that helps.
Jim
… On Jan 23, 2024, at 5:51 AM, Quentin Michaud ***@***.***> wrote:
@jimingham <https://github.com/jimingham> sorry if I wasn't clear. frame var and target var are commands for viewing language local and global variables (be it C or another one). They are indeed working with this patch. These commands are using the Wasm linear memory, where most of the data about the program is stored, and is similar to classical binaries memory. This linear memory can be accessed using Wasm memory instructions <https://webassembly.github.io/spec/core/syntax/instructions.html#memory-instructions>.
However for the sake of debugging Wasm, I am interested in accessing Wasm local and global variables (these are the variables I meant to address in my first comment). These variables are used, for example, to manage the stack pointer. They are accessed using the Wasm variable instructions <https://webassembly.github.io/spec/core/syntax/instructions.html#variable-instructions>.
The current patch allows us to inspect the Wasm linear memory as one would inspect memory in a classic binary. Likewise, the Wasm code memory (containing the actual Wasm bytecode, which in Wasm is separated from the linear memory) is accessible with this patch with an offset of $2^{62}$. I guess this is a workaround that may be explained in this comment <https://github.com/llvm/llvm-project/pull/77949/files#diff-ab1275eefb2cafbfa3bb7a9093caf11615867bd7ff76db85949db6811b616f3cR18>. However, I'm not sure this patch provides an access to Wasm local and global variables. I thought it was implemented as I saw mentions of WasmLocal and WasmGlobal, but I may have mixed up with language variables.
If Wasm variables support is indeed implemented in this patch as I first thought, I'm not able to see which lldb command would allow to see them. If not, I think it would be a nice addition to this patch (or another patch in the future) to allow one to debug Wasm instructions and variables directly. A way I can see to implement that would be to use the same workaround used to display the code memory with an offset, and utility commands allowing to easily access variables from this offset (something like wasm local 0 to dump the value of local variable n°0).
—
Reply to this email directly, view it on GitHub <#77949 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADUPVW2HLQMCXODX2S7W3H3YP657NAVCNFSM6AAAAABBYOS6L2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMBWGEYDGOJSHA>.
You are receiving this because you were mentioned.
|
I already tried to use It seems like compiling lldb with this patch and use it to debug Wasm with the latest version of WAMR / iwasm do not work correctly (at least for me). lldb can connect to the server embedded into iwasm but doesn't seem to be able to disassemble Wasm bytecode or set breakpoints. Just giving the info here because I'm not sure this patch is supposed to be straightaway compatible with WAMR. |
Hi @mh4ck-Thales this is caused by #77949 (comment), currently we need to modify it manually. And since this PR is not merged, you need to use the [wasm_debug_2024] branch in my forked WAMR repo for testing: https://github.com/xujuntwt95329/wasm-micro-runtime/tree/wasm_debug_2024 |
Thanks! That did the trick for the breakpoint and disassembly problems. When using |
Yes, the Currently lldb doesn't have an exist model for representing wasm's local/global/operand stack values, I think encoding them as virtual registers is an easy approach for enabling WebAssembly debugging and without a huge modification to LLDB core part. On the other hand, as a managed language, I think most people won't need to read specific local/global of an instance, just like most of the JavaScript developer won't care about the value of a register in V8 during debugging, they just care about the value of their variables. So I think the virtual register approach should be okay for the developer usage, with small engineering effort and limited quality influence to LLDB. |
I agree that the average developer will not have much use of the possibility of accessing Wasm locals and globals. I also understand that Wasm locals and global are a very specific, Wasm-only concept and that adding this concept in LLVM core is not pertinent. However for people working on the inner workings of Wasm (like me), it may be very useful. I'm not sure it belongs to this PR but implementing support for accessing Wasm locals/globals in the future (using perhaps a command like What I'm wondering is, does these potential added commands have an impact on this PR ? Or can we imagine commands that will send a specific, Wasm-only request to the debugging server to get the values of the variables without using the implementation of locals and globals of this PR ? |
Yes, it is :-). For that to work, it expects that the Wasm module contains a custom section named |
What happens now is that there is a Here, with the following call stack,
This is why I introduced a new command |
However, If we implement a But I agree that it is not nice to make a top level |
The idea for this patch was only to support source-level debugging of code compiled to Wasm.
As Jun says, we are using registers here just a nice technique, devised by @xujuntwt95329, to minimize Wasm-specific changes to LLDB Core classes. The idea was not really to make these registers available to the user, even though even this can certainly be improved in the future. |
The logic to manage Please, let me and Jun know what our next steps should be to move this forward. Thanks! :) |
This feels like a bug or at least a design flaw in lldb then. ABI plugins for example clearly need to know the architecture and they work with remotes just fine, by reading the architecture from the object file (somehow). But I'm not expecting you to address or even find that here, I would just prefer that this went ahead without the
We have some registers on Linux that could be considered part of the kernel interface not the hardware, so there is precedent for these "pseudo" registers. Examples are the memory tagging and scalable matrix control registers. However those features are enabled at kernel boot, so they don't come and go at runtime. Scalable vectors (SVE) can change size at runtime but they don't come and go either. The one register that does that is the scalable matrix array register, which can be disabled by software. But, and this should tell you something, both I and my colleagues on GDB decided to just show it as all 0s when it's disabled. Instead of re-configuring the registers on every stop. It becomes hard to keep track when a register in the middle of the context disappears. So I don't disagree with the concept of showing them as registers. They could be compared to the Linux kernel pseudo registers we have. However it will be more difficult than those registers, due to the dynamism. So if you're going to do it, please open a PR focused on that. |
I agree that implementing support for accessing Wasm locals/globals would be a very interesting and exciting thing, and as a WebAssembly runtime developer, such features will be extremely useful to me (currently I need to debug the runtime itself to retrieve the internal data structures to find the value of locals/globals). But since there isn't an existing concepts in lldb to support these, we think it should be better to focus this PR on the source level debugging support to control the complexity, and leave the local/global support as further enhancement.
I think a wasm-specific command which directly send message to debug server for getting local/global values would be a simple approach to support the nice features, maybe we can consider such solution in the future to improve the debugging experience. |
Yes, the top level |
Removed the 'wasm' command and addressed all the other review comments. |
There is a strong need for this capability of LLDB from the WebAssembly Micro Runtime community. It will be awesome to have it in the LLDB. |
/// Some targets might use bits in a code address to represent additional | ||
/// information; for example WebAssembly targets have a different memory space | ||
/// per module and have a different address space per memory and code. | ||
virtual lldb::addr_t FixMemoryAddress(lldb::addr_t address, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above we have FixCodeAddres() and FixDataAddress(). Will this function fix any kind of address, or just a code address? If it is just code, then maybe we can just modify the above FixCodeAddress(...)
and add the StackFrame to that API which can default to nullptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought to add a new function because Process::FixCodeAddress
, Process::FixDataAddress
, Process::FixAnyAddress
by default forward the call to ABI::FixCodeAddress
, ABI::FixDataAddress
, ABI::FixAnyAddress
, which are implemented by plugin-specific ABIs like ABISysV_arm
, ABIMacOSX_arm
. But here we are not really dealing with a different ABI, it is more a custom representation of Wasm memory addresses used between the debugger and the Wasm engine.
If you prefer, we could reuse Process::FixAnyAddress
, making it virtual and overriding it in ProcessWasm
.
Add support for source-level debugging of WebAssembly code that runs in a WebAssembly engine.
The idea is to use the GDB-remote protocol to connect to a Wasm engine that implements a GDB-remote stub that offers the ability to access the engine runtime internal state. This protocol is already supported by the WebAssembly Micro Runtime and the WAMR team greatly contributed to this work.
A WebAssembly debugging session can be started using the new command:
wasm [<hostname>:]<portnum>
The implementation is almost completely contained in Plugin classes and the only Wasm-specific changes to Core classes is in
class DWARFExpression
that handles a new Wasm-specific DWARF opcode (DW_OP_WASM_location
) added to the DWARF standard.