Commit

Update docs/tutorials/cooperative_groups_tutorial.rst
Co-authored-by: Leo Paoletti <[email protected]>
neon60 and lpaoletti authored Jun 7, 2024
1 parent 3519879 commit a03e943
Showing 1 changed file with 4 additions and 8 deletions.
12 changes: 4 additions & 8 deletions docs/tutorials/cooperative_groups_tutorial.rst
@@ -61,15 +61,11 @@ Device side code

 .. TODO: Add link here to reduction tutorial and subsections
-To be able to calculate the sum of the sets of numbers, the tutorial using the
-shared memory based reduction at device side. In this tutorial the warp level
-intrinsics usage not covered like in the reduction tutorial. The ``x`` input
-variable is a shared pointer, which needs to be synchronized after every value
-changes. The ``thread_group`` input parameter can be ``thread_block_tile`` or
-``thread_block`` also, because the ``thread_group`` is the parent class of these
-types. The ``val`` is the numbers, which we wants to calculate the sum of. The
+To calculate the sum of the sets of numbers, the tutorial uses the shared memory-based reduction on the device side. The warp level intrinsics usage is not covered in this tutorial, unlike in the Reduction tutorial. The x input variable is a shared pointer, which needs to be synchronized after every value changes. The ``thread_group`` input parameter can be ``thread_block_tile`` or
+``thread_block`` because the ``thread_group`` is the parent class of these
+types. The ``val`` are the numbers to calculate the sum of. The
 returned results of this function return the final results of the reduction on
-thread ID 0 of the ``thread_group`` and at every other threads the function
+thread ID 0 of the ``thread_group``, and at every other thread, the function
 results are 0.

 .. code-block:: cpp
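
The hunk above cuts off before the device code, so the behaviour the paragraph describes may be easier to follow with a host-only sketch. The block below is not the tutorial's HIP kernel. It is a plain C++ emulation, and ``emulate_reduce_sum`` is a hypothetical helper invented for illustration. Each vector element stands in for one thread of the group, the ``x`` vector stands in for the shared buffer, and the two inner loops mark the points where a real group would have to synchronize. It assumes the group size is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only emulation (hypothetical helper, not part of HIP).
// vals[lane] plays the role of the ``val`` argument seen by thread `lane`.
// Returns what each emulated thread would get back: the total on lane 0,
// and 0 on every other lane, as the paragraph above describes.
std::vector<int> emulate_reduce_sum(const std::vector<int>& vals) {
    const std::size_t n = vals.size();  // group size, assumed a power of two
    std::vector<int> val = vals;        // per-thread private running value
    std::vector<int> x(n);              // stands in for the shared buffer

    for (std::size_t i = n / 2; i > 0; i /= 2) {
        // Step 1: every thread publishes its value to the shared buffer.
        // On a device, a group-wide sync would be required after this step.
        for (std::size_t lane = 0; lane < n; ++lane) {
            x[lane] = val[lane];
        }
        // Step 2: the lower half of the threads adds its partner's value.
        // On a device, a second sync would be required before the next round.
        for (std::size_t lane = 0; lane < i; ++lane) {
            val[lane] += x[lane + i];
        }
    }

    std::vector<int> result(n, 0);      // every lane other than 0 reports 0
    if (n > 0) {
        result[0] = val[0];             // lane 0 holds the final sum
    }
    return result;
}
```

Calling ``emulate_reduce_sum({1, 2, 3, 4, 5, 6, 7, 8})`` leaves the full sum in element 0 and zeros elsewhere. Keeping the publish step and the accumulate step in separate loops mirrors why the shared buffer has to be synchronized after every change: a thread must not read its partner's slot until that slot has been written for the current round.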
