
[Typo] Fix missing links in the bitnet integration's docs (#136)
* fix install with absolute path

* efficient inference with torch compile

* update vllm ckpt tutorial for bitnet

* ReadME Fix.
LeiWang1999 authored Aug 9, 2024
1 parent 7c6bccf commit d52f93d
Showing 1 changed file with 4 additions and 3 deletions: integration/BitNet/README.md
@@ -2,12 +2,13 @@
 license: mit
 ---

-## Latest News
-
-- 08/09/2024 ✨: We provide a more efficient implementation for bitnet with vLLM, which should use special model checkpoints, to make the ckpt, please reach [].

 This is a BitBLAS implementation of the reproduced 1.58-bit model from [1bitLLM/bitnet_b1_58-3B](https://huggingface.co/1bitLLM/bitnet_b1_58-3B). We replaced the original simulated Int8x3bit quantized inference kernel with the BitBLAS INT8xINT2 kernel. We also evaluated the model's correctness and performance through `eval_correctness.py` and `benchmark_inference_latency.py`.

+## Latest News
+
+- 08/09/2024 ✨: We provide a more efficient implementation of BitNet with vLLM, which requires special model checkpoints; to build the checkpoints and learn how to deploy them, please check out [Make Checkpoints for vLLM](#make-checkpoints-for-vllm).
+
 ## Make Checkpoints for vLLM

 We provide two scripts to make the checkpoints for vLLM. The first script, `generate_bitnet_model_native_format.sh`, creates a checkpoint with uncompressed fp16 metadata; the main difference from the original checkpoint is the `quant_config.json`, which allows vLLM to load the model and execute it with a quant extension.
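For readers following the README excerpt above, here is a minimal sketch of how the two evaluation scripts it names might be invoked. The argument-free command lines are an assumption, since this commit does not show their CLI flags.

```bash
# Hypothetical invocations of the evaluation scripts named in the README.
# The scripts may require extra flags (e.g. a model path) not shown here.
cd integration/BitNet

# Check that the BitBLAS INT8xINT2 kernel reproduces the expected outputs
python eval_correctness.py

# Measure end-to-end inference latency with the BitBLAS kernel
python benchmark_inference_latency.py
```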
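Similarly, a hedged sketch of the checkpoint workflow described in the last paragraph: run `generate_bitnet_model_native_format.sh` to produce an fp16 checkpoint plus `quant_config.json`, then point a quant-aware vLLM build at the output directory. The output path and the flag-free invocation below are hypothetical, not taken from this commit.

```bash
# Generate the fp16 checkpoint with quant_config.json
# (the output directory name used below is hypothetical)
bash generate_bitnet_model_native_format.sh

# Serve the generated checkpoint with vLLM's OpenAI-compatible server;
# this assumes a vLLM build that understands the quant_config.json extension.
python -m vllm.entrypoints.openai.api_server --model ./bitnet_b1_58-3B-vllm
```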
