[BugFix] Fix a bug in Static shape build (#53)
* improve e4m3 decoding.

* append fp16xint1

* Update submodule commit reference

* chore: Update shared memory scope for float32 output dtype

* BUGFIX: UINT8/INT8 Decoding

* feat: Add rasterization options for roller module

* Refactor tensorcore_legalization method to optimize tensor core usage

* feat: Add function to collect variables from expression, improve for splitk

* chore: Update typing import in __init__.py

* chore: Refactor CPU execution of operators

* Refactor matmul implementation for splitk layout

* Refactor matmul implementation for splitk layout

* Refactor matmul implementation for splitk layout

* chore: Update version to 0.0.1.dev8

* chore: Enable debug output in bitblas.set_debug_level()

* Refactor Linear module matmul implementation for splitk layout

* Refactor matmul implementation for splitk layout

* Refactor CUDA kernel launch string for dynamic symbolic set

* Bump version to v0.0.1.dev9

* Refactor CUDA kernel launch string for dynamic symbolic set

* Bump version to v0.0.1.dev10

* Refactor CUDA kernel launch string for dynamic symbolic set

---------

Co-authored-by: LeiWang199 <leiwang199>
LeiWang1999 authored Jun 6, 2024
1 parent 1057b07 commit 857732b
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion python/bitblas/wrapper/general.py
@@ -287,7 +287,10 @@ def legalize_c(p):
         # Determine the shared memory size, defaulting to 0 if not specified
         smem_str = 0 if self.dynamic_smem_buf is None else self.dynamic_smem_buf
         # Format the CUDA kernel launch string
-        call_str = "if ({} == 0) return; \n\t\t".format(list(dynamic_symbolic_set)[0])
+        if len(dynamic_symbolic_set) != 0:
+            call_str = "if ({} == 0) return; \n\t\t".format(list(dynamic_symbolic_set)[0])
+        else:
+            call_str = ""
         call_str += "{}<<<{}, {}, {}, stream>>>({});".format(function_name, grid_str, block_str, smem_str,
                                                              call_args)
         # Create the host function wrapper for the CUDA kernel
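
The change can be illustrated outside the wrapper class. The sketch below is not the library's code: `build_launch_string` is a hypothetical standalone function whose parameters simply mirror the local variables visible in the diff, and the kernel name and launch dimensions in the usage lines are made-up values. It assumes that a static-shape build leaves `dynamic_symbolic_set` empty, in which case the original unconditional `list(dynamic_symbolic_set)[0]` would raise `IndexError`.

```python
def build_launch_string(function_name, grid_str, block_str, smem_str,
                        call_args, dynamic_symbolic_set):
    """Hypothetical standalone version of the patched launch-string logic."""
    if len(dynamic_symbolic_set) != 0:
        # Dynamic shape: return early when the first symbolic dimension is 0.
        # Note that a set has no defined order, so "first" is arbitrary
        # when more than one symbol is present.
        call_str = "if ({} == 0) return; \n\t\t".format(
            list(dynamic_symbolic_set)[0])
    else:
        # Static shape: no symbolic dimension, so no guard is emitted.
        call_str = ""
    call_str += "{}<<<{}, {}, {}, stream>>>({});".format(
        function_name, grid_str, block_str, smem_str, call_args)
    return call_str


# Illustrative values only; none of these names come from the commit.
static_call = build_launch_string(
    "main_kernel", "dim3(1, 1, 1)", "dim3(128, 1, 1)", 0, "A, B, C", set())
dynamic_call = build_launch_string(
    "main_kernel", "dim3(1, 1, 1)", "dim3(128, 1, 1)", 0, "A, B, C, m", {"m"})

print(static_call)
print(dynamic_call)
```

With an empty set the function returns only the kernel launch statement; with a non-empty set it prefixes the launch with the zero-size early return, as before the patch.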