-
Notifications
You must be signed in to change notification settings - Fork 538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add expected behavior to unified memory #3617
base: hip_runtime_api_how-to
Are you sure you want to change the base?
Conversation
da1233f
to
267b2f2
Compare
98f9599
to
73814d1
Compare
7e27ba9
to
84f9097
Compare
0049c2b
to
e7486be
Compare
934e4d3
to
4baf272
Compare
546b8d9
to
6244fc2
Compare
f4ab47d
to
74ebf79
Compare
docs/how-to/hip_runtime_api/memory_management/unified_memory.rst
Outdated
Show resolved
Hide resolved
38d2787
to
8d5b771
Compare
6879e02
to
ab44e41
Compare
docs/how-to/hip_runtime_api.rst
Outdated
******************************************************************************** | ||
|
||
The HIP runtime API provides C and C++ functionalities to manage event, stream, | ||
and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Common |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Common | |
and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Compute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently CLR is supposed to be Compute Language Runtime
The runtime offers functions for allocating, freeing, and copying device memory, | ||
along with transferring data between host and device memory. | ||
|
||
Here are the various memory management techniques: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like these techniques could use a brief description?
.. figure:: ../../../data/how-to/hip_runtime_api/memory_management/pageable_pinned.svg | ||
|
||
The pageable and pinned memory allow you to exercise direct control over | ||
memory operations, which is known as explicit memory management. When using the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In memory_management.rst we list four memory management techniques, none of which is "explicit memory management". Is it possible to relate this explicit memory management to one of the four techniques? Or should it be listed as one of the techniques?
// Run the kernel | ||
// ... | ||
|
||
HIP_CHECK(hipMemcpy(device_input, host_input, element_number * sizeof(int), hipMemcpyHostToDevice)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right to me. We have device_output and host_output, shouldn't those come into play here?
Pinned memory | ||
================================================================================ | ||
|
||
Pinned memory or page-locked memory is stored in pages that are locked in specific sectors in RAM and can't be migrated. The pointer can be used on both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should note that pinned memory is allocated by hipMalloc() and hipHostMalloc().
// Run the kernel | ||
// ... | ||
|
||
HIP_CHECK(hipMemcpy(device_input, host_input, element_number * sizeof(int), hipMemcpyHostToDevice)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be DeviceToHost?
as HBM2e. Device memory can be allocated as global memory, constant, texture or | ||
surface memory. | ||
|
||
Global memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest making this a bulleted list rather than four separate subheads.
docs/how-to/hip_runtime_api/memory_management/device_memory.rst
Outdated
Show resolved
Hide resolved
:sup:`1` The :cpp:func:`hipHostMalloc` memory allocation coherence mode can be | ||
affected by the ``HIP_HOST_COHERENT`` environment variable, if the | ||
``hipHostMallocCoherent``, ``hipHostMallocNonCoherent``, and | ||
``hipHostMallocMapped`` are unset. If neither these flags nor the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the footnote to the table references hipHostMallocMapped flag, but the table above does not show this flag. This also is mentioned in the following note.
docs/how-to/hip_runtime_api/memory_management/unified_memory.rst
Outdated
Show resolved
Hide resolved
Unified memory can simplify the complexities of memory management in GPU | ||
computing, by not requiring explicit copies between the host and the devices. It | ||
can be particularly useful in use cases with sparse memory accesses from both | ||
the CPU and the GPU, as not the whole memory region needs to be transferred to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the CPU and the GPU, as not the whole memory region needs to be transferred to | |
the CPU and the GPU, as the whole memory region does not need to be transferred to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a logical difference between "not the whole memory region needs to be transferred" and "the whole memory region does not need to be transferred"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the meaning of the sentence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the sentence to make it easier to understand.
docs/how-to/hip_runtime_api/memory_management/unified_memory.rst
Outdated
Show resolved
Hide resolved
docs/how-to/hip_runtime_api/memory_management/unified_memory.rst
Outdated
Show resolved
Hide resolved
docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst
Outdated
Show resolved
Hide resolved
docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst
Outdated
Show resolved
Hide resolved
docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst
Outdated
Show resolved
Hide resolved
docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst
Outdated
Show resolved
Hide resolved
functions. To unmap the memory, use :cpp:func:`hipMemUnmap`. To release the | ||
virtual address range, use :cpp:func:`hipMemAddressFree`. Finally, to release | ||
the physical memory, use :cpp:func:`hipMemRelease`. A side effect of these | ||
functions is the lack of synchronization when memory is released. If you call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you call :cpp:func:hipFree
when you have multiple streams running in parallel, it
synchronizes the device. This causes worse resource usage and performance.
This seems like it could be an important note?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On line 24: - Learning curve: Requires additional effort to understand and utilize SOMA effectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comments
78c8a5a
to
aa3584c
Compare
aa3584c
to
7c6949e
Compare
eb41cb1
to
7cb026c
Compare
7c6949e
to
0c4abec
Compare
0c4abec
to
07a7fdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
No description provided.