Commit
[Dev] Transform 3rdparty tvm from bitblas into bitblas_tl (#95)
* Refactor BatchMatMulEmitter and BatchMatMulSelector for improved readability and maintainability

* Refactor import statements for improved readability and maintainability

* Refactor import statements for improved readability and maintainability

* disable failure email for ci

* remove email notifications.

* move relax pass from testing to mlc_llm

* Refactor scripts with se check_eual_ref_scripts_with_emitter function

* Lint Fix

* Refactor scripts with se check_eual_ref_scripts_with_emitter function

* bug fix in test

* lint fix.

* test cuda i4 kernel

* Refactor copyright notice in i4matmul.hpp

* Refactor BitBLASLinear test module for improved readability and maintainability

* refactor test as version below python 3.9 cannot handle int32 overflow.

* format lint for test

* Refactor test_int4b_fp16_convert.py for improved readability and maintainability

* remove unused design file

* move tile device from package to base

* dummy impl for codegen

* Refactor file structure for ladder_permutate module

* Refactor backend class and fix typos in comments

* Deep refactor Lib related code.

* remove ci pull.

* LintFix

* refactor builder for whl build

* Refactor TIRWrapper.wrap() method to include an assertion for the optimized module

* Refactor lib_generator to set library and source paths

* lint fix

* BitNet vllm integration

* chore: update codespell to version 2.3.0

* Lintfix

* Bump version to 0.0.1.dev13

* lint fix

* disable fast decoding [u]int4xint8 by default.

* optimize from dict design in Hint

* Implement SplitK

* bitnet benchmark generation.

* Add benchmark script for BitNet integration

* AtomicAdd Support

* LintFix

* ci fix when 3rdparty tvm is initialized.

* bug fix for setup

* fix a bug in block reduce

* typo fix

* BUG Fix for block reduce.

* Lint fix

* Refactor block reduce schedule template

* transform branch from bitblas to bitblas_tl

* Fix subproject commit reference in 3rdparty/tvm

* chore: update submodule branch from bitblas to bitblas_tl

* force update config.cmake

* Bug fix
LeiWang1999 authored Jul 22, 2024
1 parent 9bdcbf8 commit e1e30d7
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,4 +1,4 @@
[submodule "3rdparty/tvm"]
path = 3rdparty/tvm
url = https://github.com/LeiWang1999/tvm
- branch = bitblas
+ branch = bitblas_tl
2 changes: 1 addition & 1 deletion 3rdparty/tvm
Submodule tvm updated 88 files
+10 −0 CMakeLists.txt
+63 −35 README.md
+3 −31 cmake/config.cmake
+1 −0 include/tvm/runtime/c_runtime_api.h
+2 −0 include/tvm/runtime/data_type.h
+1 −1 python/tvm/script/parser/core/evaluator.py
+20 −0 python/tvm/tl/__init__.py
+21 −0 python/tvm/tl/_ffi_api.py
+99 −0 python/tvm/tl/autotuner.py
+142 −0 python/tvm/tl/engine.py
+281 −0 python/tvm/tl/language.py
+94 −0 python/tvm/tl/layout.py
+108 −0 python/tvm/tl/transform.py
+249 −0 python/tvm/tl/utils.py
+9 −1 src/runtime/pack_args.h
+18 −2 src/tir/analysis/block_access_region_detector.cc
+1 −1 src/tir/transforms/lower_device_kernel_launch.cc
+7 −3 src/tir/transforms/merge_shared_memory_allocations.cc
+1 −1 src/tir/transforms/storage_access.h
+52 −7 src/tir/transforms/thread_storage_sync.cc
+135 −0 src/tl/ir.cc
+348 −0 src/tl/layout/gemm_layouts.cc
+412 −0 src/tl/layout/layout.cc
+167 −0 src/tl/layout/layout.h
+116 −0 src/tl/layout/swizzle.cc
+91 −0 src/tl/layout/swizzle.h
+262 −0 src/tl/layout/utils.cc
+76 −0 src/tl/layout/utils.h
+98 −0 src/tl/op/builtin.cc
+152 −0 src/tl/op/builtin.h
+393 −0 src/tl/op/bulk_copy.cc
+82 −0 src/tl/op/bulk_copy.h
+355 −0 src/tl/op/elem.cc
+82 −0 src/tl/op/elem.h
+207 −0 src/tl/op/gemm.cc
+62 −0 src/tl/op/gemm.h
+102 −0 src/tl/op/op.cc
+113 −0 src/tl/op/op.h
+190 −0 src/tl/op/parallel.cc
+88 −0 src/tl/op/parallel.h
+222 −0 src/tl/op/reduce.cc
+61 −0 src/tl/op/reduce.h
+203 −0 src/tl/runtime/runtime.cc
+37 −0 src/tl/runtime/runtime.h
+1,058 −0 src/tl/target/codegen.cc
+88 −0 src/tl/target/codegen.h
+104 −0 src/tl/target/rt_mod.cc
+85 −0 src/tl/target/utils.cc
+48 −0 src/tl/target/utils.h
+41 −0 src/tl/tl_templates/common.h
+73 −0 src/tl/tl_templates/copy.h
+217 −0 src/tl/tl_templates/copy_sm90.h
+10 −0 src/tl/tl_templates/gemm.h
+160 −0 src/tl/tl_templates/gemm_sm70.h
+314 −0 src/tl/tl_templates/gemm_sm80.h
+147 −0 src/tl/tl_templates/gemm_sm90.h
+100 −0 src/tl/tl_templates/ldsm.h
+53 −0 src/tl/tl_templates/reduce.h
+39 −0 src/tl/tl_templates/threadblock_swizzle.h
+133 −0 src/tl/transform/cluster_planning.cc
+94 −0 src/tl/transform/frontend_legalize.cc
+934 −0 src/tl/transform/inject_pipeline.cc
+291 −0 src/tl/transform/layout_inference.cc
+164 −0 src/tl/transform/loop_partition.cc
+46 −0 src/tl/transform/loop_partition.h
+451 −0 src/tl/transform/loop_vectorize.cc
+45 −0 src/tl/transform/loop_vectorize.h
+157 −0 src/tl/transform/lower_hopper_intrin.cc
+185 −0 src/tl/transform/lower_tile_op.cc
+242 −0 src/tl/transform/pipeline_planning.cc
+842 −0 src/tl/transform/warp_specialized_pipeline.cc
+25 −0 tl_doc/flash_perf.md
+61 −0 tl_doc/language_ref.md
+82 −0 tl_scripts/conv_example.py
+86 −0 tl_scripts/gemm_example.py
+48 −0 tl_scripts/layout_anno_example.py
+103 −0 tl_scripts/mamba_example.py
+321 −0 tl_scripts/mha_bwd_example.py
+120 −0 tl_scripts/mha_example.py
+61 −0 tl_scripts/profile.py
+41 −0 tl_scripts/reduce_example.py
+103 −0 tl_scripts/retnet_example.py
+75 −0 tl_scripts/rms_norm.py
+59 −0 tl_scripts/splitk_example.py
+54 −0 tl_scripts/test.py
+191 −0 tl_scripts/test_gemm.py
+255 −0 tl_scripts/triton_gemm.py
+673 −0 tl_scripts/triton_mha.py
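
The `.gitmodules` change above only switches the tracked branch of `3rdparty/tvm` from `bitblas` to `bitblas_tl`. As a minimal sketch (not part of this commit), a script could check which branch a checkout's `.gitmodules` currently records, for example in CI. The helper name `submodule_branch` and the inline `EXAMPLE` text are assumptions made for illustration; the sketch relies only on Python's standard `configparser`, which can read this simple git config layout.

```python
import configparser
from typing import Optional


def submodule_branch(gitmodules_text: str, name: str) -> Optional[str]:
    """Return the `branch` recorded for submodule `name`, or None if absent.

    Hypothetical helper for illustration; it parses .gitmodules-style text
    with the standard-library INI parser rather than calling git itself.
    """
    # Interpolation is disabled so a literal '%' in a URL cannot raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(gitmodules_text)
    section = f'submodule "{name}"'  # git writes the section as: submodule "<name>"
    if not parser.has_section(section):
        return None
    return parser.get(section, "branch", fallback=None)


# Contents of .gitmodules after this commit, as shown in the diff above.
EXAMPLE = (
    '[submodule "3rdparty/tvm"]\n'
    "\tpath = 3rdparty/tvm\n"
    "\turl = https://github.com/LeiWang1999/tvm\n"
    "\tbranch = bitblas_tl\n"
)

if __name__ == "__main__":
    print(submodule_branch(EXAMPLE, "3rdparty/tvm"))
```

In an existing checkout, the new branch setting usually takes effect after running `git submodule sync` followed by `git submodule update --init --recursive`.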