Fix the documentation build

neon60 committed Aug 19, 2024
1 parent 1b94970 commit 845316e

Showing 5 changed files with 9 additions and 5 deletions.
2 changes: 2 additions & 0 deletions docs/how-to/virtual_memory.rst
@@ -16,6 +16,7 @@ Global memory allocation in HIP uses the C language style allocation function. T
Virtual memory management solves these memory management problems. It helps to reduce memory usage and unnecessary ``memcpy`` calls.

.. _memory_allocation_virtual_memory:

Memory allocation
=================

@@ -75,6 +76,7 @@ To free the memory allocated in this manner, use the corresponding free function
hipMemAddressFree(ptr, size);
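
For orientation, a minimal sketch of the reserve/create/map/access/free flow this page documents is shown below. It assumes a single device (id 0), omits error checking, and the exact property and flag choices are illustrative rather than the page's own example; see the HIP virtual memory management reference for the precise signatures.

.. code-block:: C++

   #include <hip/hip_runtime.h>

   // Hedged sketch: reserve a virtual address range, back it with physical
   // memory, map it, enable access from device 0, then tear everything down.
   // Error checking is omitted for brevity.
   void virtual_memory_roundtrip()
   {
       hipMemAllocationProp prop = {};
       prop.type = hipMemAllocationTypePinned;
       prop.location.type = hipMemLocationTypeDevice;
       prop.location.id = 0;

       size_t granularity = 0;
       hipMemGetAllocationGranularity(&granularity, &prop,
                                      hipMemAllocationGranularityMinimum);
       const size_t size = granularity; // sizes must be granularity multiples

       void* ptr = nullptr;
       hipMemGenericAllocationHandle_t handle;
       hipMemAddressReserve(&ptr, size, 0, nullptr, 0); // virtual range only
       hipMemCreate(&handle, size, &prop, 0);           // physical backing
       hipMemMap(ptr, size, 0, handle, 0);              // bind range to backing

       hipMemAccessDesc access = {};
       access.location = prop.location;
       access.flags = hipMemAccessFlagsProtReadWrite;
       hipMemSetAccess(ptr, size, &access, 1);          // allow device access

       // ... the mapped range can now be used by kernels ...

       hipMemUnmap(ptr, size);
       hipMemRelease(handle);
       hipMemAddressFree(ptr, size);
   }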
.. _usage_virtual_memory:

Memory usage
============

2 changes: 1 addition & 1 deletion docs/index.md
@@ -61,7 +61,7 @@ On non-AMD platforms, like NVIDIA, HIP provides header files required to support
* [HSA runtime API for ROCm](./reference/virtual_rocr)
* [HIP managed memory allocation API](./reference/unified_memory_reference)
* [HIP virtual memory management API](./reference/virtual_memory_reference)
- * [HIP Cooperative groups API](./reference/cooperative_groups)
+ * [HIP Cooperative groups API](./reference/cooperative_groups_reference)
* [List of deprecated APIs](./reference/deprecated_api_list)

:::
4 changes: 3 additions & 1 deletion docs/reference/cpp_language_support.rst
@@ -1,6 +1,6 @@
.. meta::
:description: This chapter describes the C++ support of the HIP ecosystem
- ROCm software.
+ ROCm software.
:keywords: AMD, ROCm, HIP, C++

*******************************************************************************
@@ -20,6 +20,7 @@ code. This is due to the fact that by default a function is considered to run on
except for ``constexpr`` functions, which can run on host and device as well.
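
A small illustration of that default, as a hedged sketch (the function and kernel names are made up for this example): an unannotated function stays host-only, an explicitly qualified one is available on the device, and a ``constexpr`` function can be used from both sides.

.. code-block:: C++

   #include <hip/hip_runtime.h>

   // constexpr functions are usable from host and device code by default.
   constexpr int square(int x) { return x * x; }

   // An unannotated function is host-only by default.
   int host_only(int x) { return x + 1; }

   // Explicit qualifiers make a function available on both sides.
   __host__ __device__ int cube(int x) { return x * x * x; }

   __global__ void example_kernel(int* out)
   {
       out[0] = square(3); // OK: constexpr
       out[1] = cube(3);   // OK: __host__ __device__
       // out[2] = host_only(3); // error: host-only function called from device
   }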

.. _language_modern_cpp_support:

Modern C++ support
===============================================================================

@@ -69,6 +70,7 @@ The three-way comparison operator (spaceship operator ``<=>``) works with host a
code.

.. _language_restrictions:

Extensions and restrictions
===============================================================================

4 changes: 2 additions & 2 deletions docs/tutorial/reduction.rst
@@ -26,7 +26,7 @@ There are multiple variations of reduction that allow parallel processing. The a
Reduction on GPUs
=================

- Implementing reductions on GPUs requires a basic understanding of the :doc:`/understand/programming_model_reference`. The document explores aspects of low-level optimization best discussed through the :ref:`inherent_thread_model`, and refrains from using cooperative groups.
+ Implementing reductions on GPUs requires a basic understanding of the :doc:`/understand/programming_model`. The document explores aspects of low-level optimization best discussed through the :ref:`inherent_thread_model`, and refrains from using cooperative groups.

Synchronizing parallel threads of execution across a GPU is crucial for correctness, as partial results can't be combined before they have been produced. Synchronizing all the threads running on a GPU at any given time is possible; however, it is a costly and intricate operation. If synchronization is not absolutely necessary, map the parallel algorithm so that multiprocessors and blocks can make independent progress and don't need to synchronize frequently.
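
One way to follow that advice, sketched below for a simple sum reduction (this is illustrative, not the tutorial's actual kernel): each block reduces its own tile in shared memory using only block-level ``__syncthreads()`` barriers and writes a single partial result, so no grid-wide synchronization is needed until a later pass combines the partials.

.. code-block:: C++

   #include <hip/hip_runtime.h>

   // Each block reduces BlockSize elements of `in` to a single partial sum.
   // Blocks never synchronize with each other; a second kernel launch (or a
   // host-side pass) combines the per-block partials.
   template<unsigned int BlockSize>
   __global__ void block_reduce(const float* in, float* partials, size_t n)
   {
       __shared__ float scratch[BlockSize];
       const size_t gid = blockIdx.x * BlockSize + threadIdx.x;
       scratch[threadIdx.x] = gid < n ? in[gid] : 0.0f;
       __syncthreads(); // block-level barrier only

       // Tree reduction within the block.
       for (unsigned int stride = BlockSize / 2; stride > 0; stride /= 2) {
           if (threadIdx.x < stride)
               scratch[threadIdx.x] += scratch[threadIdx.x + stride];
           __syncthreads();
       }

       if (threadIdx.x == 0)
           partials[blockIdx.x] = scratch[0];
   }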

@@ -602,7 +602,7 @@ This kernel variant utilizes another generally applicable utility known as ``hip
template<uint32_t BlockSize, uint32_t WarpSize, uint32_t ItemsPerThread>
__global__ static __launch_bounds__(BlockSize) void kernel(...)

- The kernel now has three compile-time configurable parameters. The only part of the kernel that changes depends on how you load data from global and perform the binary operation on those loaded values. So, the following step to read input from front buffer to global is now split into two steps: :ref:`reading ``ItemsPerThread`` <reading-items>`and :ref:`processing ``ItemsPerThread`` <processing-items>`.
+ The kernel now has three compile-time configurable parameters. The only part of the kernel that changes depends on how you load data from global and perform the binary operation on those loaded values. So, the following step to read input from front buffer to global is now split into two steps: :ref:`reading-items` and :ref:`processing-items`.

.. code-block:: C++

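A hedged sketch of those two steps, with illustrative names and a simple blocked indexing scheme that may differ from the tutorial's actual code: each thread first reads ``ItemsPerThread`` values from global memory into registers, then folds them with the binary operator before any shared-memory traffic.

.. code-block:: C++

   #include <hip/hip_runtime.h>
   #include <cstdint>

   // Illustrative only: per-thread read of ItemsPerThread values followed by
   // a register-level reduction with a generic binary operator `op`.
   template<uint32_t BlockSize, uint32_t ItemsPerThread, typename T, typename Op>
   __device__ T read_and_reduce(const T* front, size_t n, T identity, Op op)
   {
       const size_t base =
           (static_cast<size_t>(blockIdx.x) * BlockSize + threadIdx.x) * ItemsPerThread;

       // Step 1: read ItemsPerThread items from global memory.
       T items[ItemsPerThread];
       for (uint32_t i = 0; i < ItemsPerThread; ++i)
           items[i] = (base + i) < n ? front[base + i] : identity;

       // Step 2: process ItemsPerThread items in registers.
       T result = items[0];
       for (uint32_t i = 1; i < ItemsPerThread; ++i)
           result = op(result, items[i]);
       return result;
   }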
2 changes: 1 addition & 1 deletion docs/understand/programming_model.rst
@@ -288,7 +288,7 @@ HIP programs consist of two distinct scopes:
towards implementing abstractions atop, such as the runtime API itself.
Offers two additional pieces of functionality not provided by the Runtime
API: ``hipModule`` and ``hipCtx`` APIs. For further details, check
- :doc:`HIP driver API </how-to/driver_api>`.
+ :doc:`HIP driver API </how-to/hip_porting_driver_api>`.

* The device-side kernels running on GPUs. Both the host and the device-side
APIs have synchronous and asynchronous functions in them.
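To make the ``hipModule`` functionality mentioned above concrete, here is a hedged sketch of loading a pre-built code object and launching one of its kernels through the module API; the file name ``kernels.hsaco``, the kernel name ``my_kernel``, and the launch geometry are placeholders, and error checking is omitted.

.. code-block:: C++

   #include <hip/hip_runtime.h>

   // Hedged sketch of the module ("driver-style") API. The code object name,
   // kernel name, and launch geometry are placeholders; error checks omitted.
   void launch_from_module()
   {
       hipModule_t module;
       hipFunction_t function;
       hipModuleLoad(&module, "kernels.hsaco");              // placeholder file
       hipModuleGetFunction(&function, module, "my_kernel"); // placeholder name

       float* data = nullptr;
       hipMalloc(reinterpret_cast<void**>(&data), 1024 * sizeof(float));

       struct { float* ptr; } args{data};
       size_t args_size = sizeof(args);
       void* config[] = {HIP_LAUNCH_PARAM_BUFFER_POINTER, &args,
                         HIP_LAUNCH_PARAM_BUFFER_SIZE, &args_size,
                         HIP_LAUNCH_PARAM_END};

       hipModuleLaunchKernel(function,
                             4, 1, 1,    // grid dimensions
                             256, 1, 1,  // block dimensions
                             0, nullptr, // shared memory bytes, stream
                             nullptr, config);

       hipFree(data);
       hipModuleUnload(module);
   }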
