[Feature Request]: atomicAdd() to support half2 #3573

Open
ZJLi2013 opened this issue Aug 14, 2024 · 5 comments

Comments

@ZJLi2013

Suggestion Description

Hi HIP team,

Here is the CUDA version:

// Atomically adds both half lanes of `in` to *addr in global memory.
__device__ void atomic_add_gmem_h2(half2* addr, half2 in) {
	atomicAdd(addr, in);
}

It looks like there is no HIP alternative yet. When built with hipcc, it gives:

/opt/rocm-6.1.3/include/hip/amd_detail/amd_hip_atomic.h:216:5: note: candidate function not viable: no known conversion from 'half2 *' (aka '__half2 *') to 'int *' for 1st argument
  216 | int atomicAdd(int* address, int val) {
      |     ^         ~~~~~~~~~~~~
/opt/rocm-6.1.3/include/hip/amd_detail/amd_hip_atomic.h:228:14: note: candidate function not viable: no known conversion from 'half2 *' (aka '__half2 *') to 'unsigned int *' for 1st argument
  228 | unsigned int atomicAdd(unsigned int* address, unsigned int val) {
      |              ^         ~~~~~~~~~~~~~~~~~~~~~
/opt/rocm-6.1.3/include/hip/amd_detail/amd_hip_atomic.h:240:15: note: candidate function not viable: no known conversion from 'half2 *' (aka '__half2 *') to 'unsigned long *' for 1st argument
  240 | unsigned long atomicAdd(unsigned long* address, unsigned long val) {
      |               ^         ~~~~~~~~~~~~~~~~~~~~~~
/opt/rocm-6.1.3/include/hip/amd_detail/amd_hip_atomic.h:252:20: note: candidate function not viable: no known conversion from 'half2 *' (aka '__half2 *') to 'unsigned long long *' for 1st argument
  252 | unsigned long long atomicAdd(unsigned long long* address, unsigned long long val) {
      |                    ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/rocm-6.1.3/include/hip/amd_detail/amd_hip_atomic.h:264:7: note: candidate function not viable: no known conversion from 'half2 *' (aka '__half2 *') to 'float *' for 1st argument
  264 | float atomicAdd(float* address, float val) {
      |       ^         ~~~~~~~~~~~~~~
/opt/rocm-6.1.3/include/hip/amd_detail/amd_hip_atomic.h:290:8: note: candidate function not viable: no known conversion from 'half2 *' (aka '__half2 *') to 'double *' for 1st argument
  290 | double atomicAdd(double* address, double val) {

Operating System

Ubuntu 22.04

GPU

MI300

ROCm Component

ROCm 6.1.3 + rocBLAS + rocWMMA

@ZJLi2013 ZJLi2013 changed the title atomicAdd() to support half2 [Feature Request]: atomicAdd() to support half2 Aug 14, 2024
@cjatin
Contributor

cjatin commented Aug 14, 2024

For half, we have unsafeAtomicAdd instead of atomicAdd.

https://github.com/ROCm/clr/blob/aa6d07518fdb211c49fd617ee9f69408f1acddfd/hipamd/include/hip/amd_detail/amd_hip_fp16.h#L1511
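As a rough illustration of that suggestion, the CUDA snippet from the issue could be ported as below. This is a minimal sketch, assuming the installed ROCm headers actually declare an overload of the form `__half2 unsafeAtomicAdd(__half2*, __half2)` at the linked location. The exact signature is an assumption here, and a later comment in this thread reports that the ROCm 6.2.0 copy of `amd_hip_fp16.h` does not contain it, so check your own headers first.

```cpp
#include <hip/hip_runtime.h>
#include <hip/hip_fp16.h>

// Assumption: the ROCm headers in use declare something equivalent to
//   __device__ __half2 unsafeAtomicAdd(__half2* address, __half2 value);
// If they do not, this call fails to compile in the same way atomicAdd does.
__device__ void atomic_add_gmem_h2(__half2* addr, __half2 in) {
    // Adds both half lanes of `in` to *addr; the return value is ignored here.
    unsafeAtomicAdd(addr, in);
}
```

The function name `atomic_add_gmem_h2` is kept from the original report so that existing call sites would not need to change.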

@ZJLi2013 ZJLi2013 reopened this Aug 15, 2024
@ZJLi2013
Author

For half, we have unsafeAtomicAdd instead of atomicAdd.

https://github.com/ROCm/clr/blob/aa6d07518fdb211c49fd617ee9f69408f1acddfd/hipamd/include/hip/amd_detail/amd_hip_fp16.h#L1511

Is there any risk with unsafeAtomicAdd? I'm just wondering in what way it is unsafe.

@b-sumner
Contributor

It's unsafe because it causes the fast hardware instruction to be generated, but those instructions don't work if they act on memory that is not cached, e.g. across a PCIe bus. The developer needs to assert that they are willing to take that risk.
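For code that does not want to rely on the fast floating-point atomic instruction, one common alternative is a compare-and-swap loop over the 32-bit word that holds the two halves. The sketch below is not something proposed by the maintainers in this thread. The helper name `atomic_add_h2_cas` is hypothetical, it is expected to be slower under contention, and whether integer CAS is acceptable for a given memory type is something to verify on the target system.

```cpp
#include <hip/hip_runtime.h>
#include <hip/hip_fp16.h>

// Hypothetical helper (not a HIP API): emulates a half2 atomic add with a
// 32-bit integer compare-and-swap loop. Returns the value previously stored.
__device__ inline __half2 atomic_add_h2_cas(__half2* addr, __half2 in) {
    static_assert(sizeof(__half2) == sizeof(unsigned int),
                  "__half2 is expected to occupy one 32-bit word");
    unsigned int* word = reinterpret_cast<unsigned int*>(addr);
    unsigned int old = *word;  // initial guess; validated by the CAS below
    unsigned int assumed;
    do {
        assumed = old;
        __half2 cur;
        memcpy(&cur, &assumed, sizeof(cur));      // view the bits as two halves
        __half2 sum = __hadd2(cur, in);           // lane-wise half addition
        unsigned int desired;
        memcpy(&desired, &sum, sizeof(desired));  // back to raw bits
        // Succeeds only if *word still equals `assumed`; returns what it saw.
        old = atomicCAS(word, assumed, desired);
    } while (old != assumed);  // integer compare, so NaN lanes cannot spin forever
    __half2 prev;
    memcpy(&prev, &old, sizeof(prev));
    return prev;
}
```

The loop compares raw bit patterns rather than floating-point values, which is why it terminates even when a lane holds NaN.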

@jinz2014

jinz2014 commented Sep 5, 2024

Does ROCm 6.2 support it?

/opt/rocm-6.2.0/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp16.h does not contain the function.

@jinz2014

jinz2014 commented Sep 5, 2024

Do you think it is better to have two types of atomic add functions rather than a single function, as in CUDA?
