I have another question, since you're an expert in the field. I used the standard streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True) with model.generate() for Llama-2-7B in Colab. Streaming worked well, but generation often stopped partway through, or stopped when I ran a second request. I also hit this issue when running inference on AWS large instances (ml.g5.48xlarge) with DeepSpeed. Can you give me a hint about the causes? I googled but haven't found a satisfactory answer.
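For reference, here is a minimal sketch of the streaming setup described above. The checkpoint name, prompt, and generation parameters are illustrative assumptions, not taken from the original setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Assumed checkpoint; the question only says "llama2-7b"
model_id = "meta-llama/Llama-2-7b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Streamer setup from the question: prints tokens as they are generated,
# skipping the prompt and special tokens
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "Explain what a text streamer does."  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# If max_new_tokens is left unset, generate() falls back to a small default
# length, which can look like the stream stopping in the middle; setting it
# explicitly is one thing worth checking.
output_ids = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
```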