Limit z-threads in columnwise_partition
charleskawczynski committed Sep 16, 2024
1 parent 4e7c44d commit 2c7b404
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion ext/cuda/data_layouts_threadblock.jl
@@ -176,7 +176,11 @@ end
     n_max_threads::Integer,
 )
     (Nij, _, _, _, Nh) = DataLayouts.universal_size(us)
-    Nh_thread = min(Int(fld(n_max_threads, Nij * Nij)), Nh)
+    Nh_thread = min(
+        Int(fld(n_max_threads, Nij * Nij)),
+        maximum_allowable_threads()[3],
+        Nh,
+    )
     Nh_blocks = cld(Nh, Nh_thread)
     @assert prod((Nij, Nij, Nh_thread)) ≤ n_max_threads "threads,n_max_threads=($(prod((Nij, Nij, Nh_thread))),$n_max_threads)"
     return (; threads = (Nij, Nij, Nh_thread), blocks = (Nh_blocks,))
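
For readers following the change, here is a minimal standalone sketch of the arithmetic in this hunk. It is not the repository's code: `columnwise_partition_sketch` and `MAX_THREADS_XYZ` are hypothetical names, and the tuple `(1024, 1024, 64)` is an assumed stand-in for whatever `maximum_allowable_threads()` returns (here taken to be the per-dimension block limits commonly reported by CUDA devices). The point it illustrates is that, when `Nij` is small, `fld(n_max_threads, Nij * Nij)` alone can exceed the z-dimension limit, so the extra `min` argument is what keeps the launch configuration valid.

```julia
# Hypothetical stand-in for `maximum_allowable_threads()`: assumed per-dimension
# (x, y, z) block limits, hard-coded here rather than queried from a device.
const MAX_THREADS_XYZ = (1024, 1024, 64)

# Sketch of the partition arithmetic from the diff, taking sizes directly
# instead of a `universal_size` tuple.
function columnwise_partition_sketch(
    Nij::Integer,
    Nh::Integer,
    n_max_threads::Integer,
)
    # Threads along z: bounded by the total thread budget, by the assumed
    # z-dimension limit, and by the number of horizontal elements.
    Nh_thread = min(
        Int(fld(n_max_threads, Nij * Nij)),
        MAX_THREADS_XYZ[3],
        Nh,
    )
    # Enough blocks to cover all Nh elements.
    Nh_blocks = cld(Nh, Nh_thread)
    @assert Nij * Nij * Nh_thread ≤ n_max_threads
    return (; threads = (Nij, Nij, Nh_thread), blocks = (Nh_blocks,))
end

# With Nij = 1 the old formula would have requested min(1024, 5400) = 1024
# z-threads; the extra bound caps this at 64.
p = columnwise_partition_sketch(1, 5400, 1024)
@show p.threads p.blocks
```

Under these assumed limits, the `Nij = 1` case yields 64 z-threads per block, with the remaining work spread over more blocks via `cld`.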