RuntimeError: CUDA error: device-side assert triggered #220

Open
hessaAlawwad opened this issue Nov 14, 2024 · 2 comments

hessaAlawwad commented Nov 14, 2024

Hello,
I am trying the following code to test sending multiple images:

import requests
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/0052a70beed5bf71b92610a43a52df6d286cd5f3/diffusers/rabbit.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Two image placeholders, each followed by its own text prompt
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "},
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "}
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(
    images=[image, image],
    text=input_text,
    add_special_tokens=False,
    return_tensors="pt"
).to(model.device)

output = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output[0]))

and got the error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-23-5e73b30f8d1d> in <cell line: 34>()
     32 ).to(model.device)
     33 
---> 34 output = model.generate(**inputs, max_new_tokens=30)
     35 print(processor.decode(output[0]))

3 frames
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py in _has_unfinished_sequences(self, this_peer_finished, synced_gpus, device, cur_len, max_length)
   2411                 if this_peer_finished_flag.item() == 0.0:
   2412                     return False
-> 2413             elif this_peer_finished:
   2414                 return False
   2415             return True

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
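
As the message notes, CUDA asserts are reported asynchronously, so the stack trace above may point at the wrong call. One way to localize the real failing kernel is to force synchronous launches before rerunning (a minimal sketch; the variable must be set before CUDA is initialized, so it goes at the top of a fresh process or notebook):

# Force synchronous CUDA kernel launches so the assert surfaces at the
# call that actually triggered it. Set this before torch touches the GPU.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the variable, on purpose
# ... then rerun the generation code above; the traceback should now
# stop at the op that fired the device-side assert.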

How can I solve it?

ashwinb (Contributor) commented Nov 14, 2024

cc @init27, this is a Hugging Face-specific issue.

init27 commented Nov 14, 2024

Thanks Ashwin!
@hessaAlawwad, this is by design: for the current model, we recommend chatting with only one image per session.
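
For reference, a single-image version of the snippet above (a sketch, assuming model, processor, and image are loaded exactly as in the original post):

# One {"type": "image"} placeholder in the message, and one PIL image
# passed to the processor, per the recommendation above.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "}
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(
    images=image,
    text=input_text,
    add_special_tokens=False,
    return_tensors="pt"
).to(model.device)

output = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output[0]))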
