Commit

WIP
neon60 committed May 25, 2024
1 parent 6746b27 commit a195567
Showing 2 changed files with 4 additions and 3 deletions.
1 change: 1 addition & 0 deletions .wordlist.txt
@@ -37,6 +37,7 @@ libstdc
linearizing
LOC
LUID
+ltrace
Malloc
malloc
multicore
6 changes: 3 additions & 3 deletions docs/how-to/debugging.rst
@@ -94,10 +94,10 @@ Debugging
You can use ROCgdb for debugging and profiling.

ROCgdb is the ROCm source-level debugger for Linux and is based on GNU Project debugger (GDB).
-the GNU source-level debugger, equivalent of cuda-gdb, can be used with debugger frontends, such as eclipse, vscode, or gdb-dashboard.
+the GNU source-level debugger, equivalent of cuda-gdb, can be used with debugger frontends, such as Eclipse, Visual Studio Code, or GDB dashboard.
For details, see (https://github.com/ROCm/ROCgdb).

-Below is a sample how to use ROCgdb run and debug HIP application, rocgdb is installed with ROCM package in the folder /opt/rocm/bin.
+Below is a sample how to use ROCgdb run and debug HIP application, ROCgdb is installed with ROCM package in the folder /opt/rocm/bin.

.. code-block:: console
@@ -379,7 +379,7 @@ General debugging tips
This ``gdb`` command does not use an equal (=) sign.

* The GDB backtrace shows a path in the runtime. This is because a fault is caught by the runtime, but it is generated by an asynchronous command running on the GPU.
-* To determine the true location of a fault, you can force the kernels to run synchronously by setting the environment variables ``AMD_SERIALIZE_KERNEL=3`` and ``AMD_SERIALIZE_COPY=3``. This forces HIP runtime to wait for the kernel to finish running before retuning. If the fault occurs when a kernel is running, you can see the code that launched the kernel inside the backtrace. The thread that's causing the issue is typically the one inside ``libhsa-runtime64.so``.
+* To determine the true location of a fault, you can force the kernels to run synchronously by setting the environment variables ``AMD_SERIALIZE_KERNEL=3`` and ``AMD_SERIALIZE_COPY=3``. This forces HIP runtime to wait for the kernel to finish running before returning. If the fault occurs when a kernel is running, you can see the code that launched the kernel inside the backtrace. The thread that's causing the issue is typically the one inside ``libhsa-runtime64.so``.
* VM faults inside kernels can be caused by:

* Incorrect code (e.g., a for loop that extends past array boundaries)
