
[FP8] Support Weight Dequantize FP16xFP8_E4M3 #42

Merged: 3 commits into microsoft:main on May 20, 2024

Conversation

LeiWang1999
Contributor

This pull request expands the existing codebase to support new FP8 formats and simplifies existing code. The most significant changes are the addition of the FP8_E4M3 and FP8_E5M2 formats to the check_weight_decode_info function, code simplification in general_matmul.py, and new conversion functions in quantization.py.

Addition of new formats: FP8_E4M3 and FP8_E5M2 are now accepted by check_weight_decode_info.

Code simplification: in general_matmul.py.

Addition of new conversion functions: in quantization.py, for dequantizing FP8 weights to FP16 (see the sketch below).
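
For readers unfamiliar with the format, here is a minimal NumPy sketch of what decoding FP8_E4M3 weights to FP16 involves. It assumes the common E4M3FN layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, with 0x7F/0xFF reserved for NaN); the helper name fp8_e4m3_to_fp16 is illustrative and this is not the PR's actual kernel code, which targets the matmul pipeline directly.

```python
# Minimal NumPy sketch of FP8 E4M3 -> FP16 decoding, assuming the common
# E4M3FN layout: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits,
# no infinities, and 0x7F / 0xFF reserved for NaN. Illustrative only;
# `fp8_e4m3_to_fp16` is a hypothetical helper, not this PR's kernel code.
import numpy as np

def fp8_e4m3_to_fp16(raw: np.ndarray) -> np.ndarray:
    """Decode a uint8 array of FP8 E4M3 bit patterns to float16."""
    raw = raw.astype(np.uint8)
    sign = np.where(raw & 0x80, -1.0, 1.0)          # bit 7
    exp = ((raw >> 3) & 0x0F).astype(np.float32)    # bits 6..3
    man = (raw & 0x07).astype(np.float32)           # bits 2..0

    # Normal numbers:    (1 + mantissa/8) * 2^(exp - 7)
    normal = (1.0 + man / 8.0) * np.exp2(exp - 7.0)
    # Subnormal numbers (exp == 0): (mantissa/8) * 2^(1 - 7)
    subnormal = (man / 8.0) * np.exp2(-6.0)

    out = sign * np.where(exp == 0, subnormal, normal)
    # E4M3FN reserves exponent 0b1111 with mantissa 0b111 for NaN.
    out = np.where((raw & 0x7F) == 0x7F, np.nan, out)
    return out.astype(np.float16)

# 0x38 -> 1.0, 0xB8 -> -1.0, 0x7E -> 448.0 (largest normal)
print(fp8_e4m3_to_fp16(np.array([0x38, 0xB8, 0x7E], dtype=np.uint8)))
```

In a real FP16xFP8_E4M3 matmul path this decode runs per weight element before (or fused into) the multiply-accumulate, which is why only the weight operand needs a conversion function while activations stay in FP16.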

@LeiWang1999 LeiWang1999 merged commit 42f379c into microsoft:main May 20, 2024
3 checks passed
@LeiWang1999 LeiWang1999 deleted the dev/fp8 branch July 6, 2024 08:14