Mlas int4 int8 with avx2/512 #20687
Commits on May 3, 2024
- 293f121: quick adapt llama.cpp to experiment performance. Only works with blklen32, symmetric1 hasBias0 Int8
  Signed-off-by: Liqun Fu
Commits on May 6, 2024
- 04c2e56
Commits on May 7, 2024
- cdfda6f: tile 2x4 SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1542487160 ns 1539062500 ns
  Signed-off-by: Liqun Fu
Commits on May 8, 2024
- 92dad97: use one_16_epi16 and accumulate_2blk_dot: SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1434872720 ns
  Signed-off-by: Liqun Fu
Commits on May 9, 2024
- 5418e9c: apply to M1, BQuant layout pack block (subblk) larger than blklen: SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1265060620 ns 1265625000 ns
  Signed-off-by: Liqun Fu
Commits on May 10, 2024
- 0401f72: use new AQuant layout (not work if total M is not RangeCountM): SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 1214042220 ns
  Signed-off-by: Liqun Fu
Commits on May 13, 2024
- a57eeba: apply blksum to blklen32 and 64: SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 784668090 ns; SQNBITGEMM<4>/BlkLen:64/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 754939430 ns
  Signed-off-by: Liqun Fu
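
The commit message does not say what "blksum" refers to. One common trick in block-quantized GEMM, given here only as an assumed reading and not as the MLAS kernel, is to fold the fixed 4-bit offset into a per-block sum of the int8 activations, so the inner loop multiplies raw unsigned nibbles without a per-element subtraction. The function names, the offset of 8, and the sample data below are all hypothetical.

```python
# Sketch only: an assumed reading of "blksum", not the MLAS implementation.
ZERO_POINT = 8  # assumed symmetric 4-bit offset: a stored nibble q encodes (q - 8)


def dot_with_offset(a_blk, b_blk, a_scale, b_scale):
    """Reference: subtract the offset from every 4-bit weight before multiplying."""
    acc = sum(a * (b - ZERO_POINT) for a, b in zip(a_blk, b_blk))
    return a_scale * b_scale * acc


def dot_with_blksum(a_blk, b_blk, a_scale, b_scale):
    """Same value, with the offset folded into one per-block sum of activations."""
    raw = sum(a * b for a, b in zip(a_blk, b_blk))  # int8 x uint4 products, no subtraction
    a_sum = sum(a_blk)  # could be computed once, when the A block is quantized
    return a_scale * b_scale * (raw - ZERO_POINT * a_sum)


BLK_LEN = 32
a_blk = [((i * 37) % 255) - 127 for i in range(BLK_LEN)]  # int8 activations in [-127, 127]
b_blk = [(i * 5) % 16 for i in range(BLK_LEN)]            # 4-bit weights in [0, 15]

ref = dot_with_offset(a_blk, b_blk, 0.5, 0.25)
fast = dot_with_blksum(a_blk, b_blk, 0.5, 0.25)
print(ref, fast)
```

Both functions compute the same quantity because sum(a * (b - 8)) equals sum(a * b) - 8 * sum(a); the second form moves the correction out of the inner loop.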
Commits on May 15, 2024
- f2c33af
Commits on May 26, 2024
- 0ca24f4: impl avx512: SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4/real_time_mean 664029830 ns
  Signed-off-by: liqunfu
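
The BlkLen:32 timings quoted in the commit messages up to this point can be compared directly. The short script below is a convenience sketch: the numbers are copied from the messages, and it assumes all runs were taken on the same machine under comparable conditions, which the log does not state.

```python
# real_time_mean values (ns) copied from the commit messages, all for
# SQNBITGEMM<4>/BlkLen:32/M:2048/N:4096/K:4096/Threads:1/Symmetric:1/ComputeType:4.
timings_ns = [
    ("cdfda6f", 1542487160),  # tile 2x4
    ("92dad97", 1434872720),  # one_16_epi16 and accumulate_2blk_dot
    ("5418e9c", 1265060620),  # BQuant layout packing
    ("0401f72", 1214042220),  # new AQuant layout
    ("a57eeba", 784668090),   # blksum
    ("0ca24f4", 664029830),   # avx512
]

baseline = timings_ns[0][1]
# Speedup of each commit relative to the first measured commit.
speedups = {sha: baseline / ns for sha, ns in timings_ns}

for sha, ns in timings_ns:
    print(f"{sha}  {ns / 1e6:9.1f} ms  {speedups[sha]:.2f}x")
```

Under that assumption, the last measured commit runs roughly 2.3 times faster than the first, with the largest single step coming from the blksum change.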
Commits on Jun 1, 2024
- 7f89d5f: matmul_nbit & fix alignment for sgemm
  Signed-off-by: Liqun Fu
Commits on Jun 4, 2024
- ed0e666
Commits on Jun 10, 2024
- 35d02a6: fix mlas benchmark not using multi threads
  Signed-off-by: Liqun Fu
- b9493ad
- c443eb5: Merge branch 'liqun/mlas-q4-tile-avx' of https://github.com/microsoft/onnxruntime into liqun/mlas-q4-tile-avx
Commits on Jun 16, 2024
- ac66951
  Signed-off-by: Liqun Fu
Commits on Jun 17, 2024
- 42a1305
Commits on Jun 27, 2024
- 740031a: layout to follow compute, M1 separate with M > 1
  Signed-off-by: Liqun Fu
Commits on Jun 28, 2024
- 1a6031e
- 283fd2d
Commits on Jul 4, 2024
- d035939
Commits on Jul 5, 2024
- f329d2d
- 27cfd9c: pass avx512 blklen 16, 128, 256
  Signed-off-by: liqunfu
Commits on Jul 11, 2024
- edee319: pass fp32, refactor sqnbitgemm
  Signed-off-by: Liqun Fu
Commits on Jul 12, 2024
- fb9221a
Commits on Jul 18, 2024
- c109b4b
- 6654d22
Commits on Jul 20, 2024
- 4b91bed
Commits on Jul 23, 2024
- 8674b9f: rm unused ComputeParallelTasksSGemm
  Signed-off-by: liqunfu
Commits on Jul 24, 2024
- e26e29e: avoid _mm256_dpbusds_avx_epi32 in avx512vnni
  Signed-off-by: liqunfu
- 2b0307e
Commits on Jul 26, 2024
- 40df782
- 51e97c8
- 48e8639
Commits on Jul 29, 2024
- 705aa1f
- 012e9c4
Commits on Jul 30, 2024
- 21b9138
- 1fb1c83: CMAKE_CXX_COMPILER_VERSION VERSION_GREATER 10
  Signed-off-by: liqunfu
- 85918e9: missed 2 files from (__GNUC__ > 10)
  Signed-off-by: liqunfu
- 9530ac5: missed _mm256_dpbusds_avx_epi32 and print out cmake msgs
  Signed-off-by: liqunfu
- f77cffd
- a6fd378
- c875e5c
- 3b56710
- 746562f
- 52fc7fa
- 0933a6b
Commits on Jul 31, 2024
- 2b35c82
  Signed-off-by: liqunfu
Commits on Aug 1, 2024
- caeb35e