For a 2-D grid kernel launch, does blockIdx.y change faster than blockIdx.x? #1058
In the above case, I only know that thread blocks can execute independently, but are they scheduled to run in a random order? Does |
Replies: 0 comments 1 reply
Hello, @zhaolianshuizls! There isn't a guarantee on the order. The thread blocks are scheduled independently. I think due to the current `num_items` restriction, we haven't actually tested the scenario where `blockIdx.y` is larger than 1. We should take a look once we get to supporting 64-bit offsets in select / partition. If this is critical to you, please, feel free to open an issue against https://github.com/NVIDIA/cccl.
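
Since the order is not guaranteed, one way to see what a particular GPU and driver happen to do is to have each block draw a ticket from a global atomic counter as it starts. The following is only a minimal sketch of that idea: the names `record_start_order` and `observe_block_order` are hypothetical and not part of CUB or CCCL, the block size of 32 is arbitrary, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical kernel (not a CUB/CCCL API): each block records the order in
// which it reached this point by drawing a ticket from a global counter.
__global__ void record_start_order(unsigned int* counter, unsigned int* ticket_of_block)
{
  if (threadIdx.x == 0)
  {
    // Row-major linear index: blockIdx.x is the fastest-varying coordinate here.
    const unsigned int linear_block = blockIdx.y * gridDim.x + blockIdx.x;
    ticket_of_block[linear_block]   = atomicAdd(counter, 1u);
  }
}

// Hypothetical host helper: launches the kernel on a 2-D grid and returns the
// ticket drawn by each block, indexed by blockIdx.y * gridDim.x + blockIdx.x.
// CUDA error checking is omitted to keep the sketch short.
std::vector<unsigned int> observe_block_order(dim3 grid)
{
  const std::size_t num_blocks = static_cast<std::size_t>(grid.x) * grid.y;

  unsigned int* d_counter = nullptr;
  unsigned int* d_tickets = nullptr;
  cudaMalloc(&d_counter, sizeof(unsigned int));
  cudaMalloc(&d_tickets, num_blocks * sizeof(unsigned int));
  cudaMemset(d_counter, 0, sizeof(unsigned int));

  record_start_order<<<grid, 32>>>(d_counter, d_tickets);

  // This blocking copy on the default stream waits for the kernel to finish.
  std::vector<unsigned int> tickets(num_blocks);
  cudaMemcpy(tickets.data(), d_tickets, num_blocks * sizeof(unsigned int), cudaMemcpyDeviceToHost);

  cudaFree(d_counter);
  cudaFree(d_tickets);
  return tickets;
}
```

Calling `observe_block_order(dim3(4, 3))` returns 12 tickets, one per block. Whatever pattern shows up (for example, tickets increasing along `blockIdx.x` first) is only an observation on that device and run, not something the kernel can rely on for correctness.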