question about the image understanding #25

df2046df · 2024-07-17T13:45:13Z

Does this model support multiple image inputs?

JoyBoy-Su · 2024-07-17T16:07:06Z

Hi, thanks for your interest!
Anole can support multiple input images. To do this, adjust the structure of input.json and then follow the instructions to run inference. Here's an example:

[
    {
        "type": "image",
        "content": "image1.png"
    },
    {
        "type": "image",
        "content": "image2.png"
    },
    {
        "type": "text",
        "content": "your instruction"
    }
]

Note that Anole's performance with multiple input images depends on the task, so results may differ from one task to another.
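
For convenience, a file with this layout can also be generated from a script. The following is a minimal sketch, not code from the Anole repository: the helper name `build_input` and the file names are hypothetical, and it assumes only the `type`/`content` schema shown in the example above.

```python
import json


def build_input(image_paths, instruction):
    """Return one image segment per path, followed by a single text segment."""
    segments = [{"type": "image", "content": str(p)} for p in image_paths]
    segments.append({"type": "text", "content": instruction})
    return segments


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own images.
    segments = build_input(["image1.png", "image2.png"], "your instruction")
    # Print the JSON; redirect it to input.json to use it for inference.
    print(json.dumps(segments, indent=4))
```

The images are listed first and the instruction last, matching the order used in the example.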

df2046df · 2024-07-18T02:23:38Z

Thank you for your reply! However, I ran into a problem with multiple images: when there are four or more input images, the following error occurs:

Traceback (most recent call last):
  File "/opt/data/private/code/anole/inference.py", line 133, in <module>
    main(args)
  File "/opt/data/private/code/anole/inference.py", line 107, in main
    segments = split_token_sequence(tokens, boi, eoi)
  File "/opt/data/private/code/anole/inference.py", line 32, in split_token_sequence
    batch_size, _ = tokens.shape
ValueError: not enough values to unpack (expected 2, got 1)

I printed the shape of tokens and found that it is torch.Size([0]). What is the reason for this?

JoyBoy-Su · 2024-07-18T04:24:11Z

This is probably because Anole's default context length is 4096 tokens and each image takes 1026 tokens (1024 + boi + eoi). Four images alone need 4 × 1026 = 4104 tokens, which exceeds the context length, so the model does not work properly with four or more input images.
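
The arithmetic behind this limit can be checked with a short sketch. The values 4096 and 1026 come from this reply; the function names `max_images` and `fits` and the `text_tokens` argument are hypothetical, and the real tokenizer may add other special tokens that are not counted here.

```python
CONTEXT_LENGTH = 4096        # default Anole context length, as stated above
TOKENS_PER_IMAGE = 1024 + 2  # 1024 image tokens + boi + eoi


def max_images(text_tokens=0, context_length=CONTEXT_LENGTH):
    """Largest number of images that fit in the context next to the text."""
    remaining = context_length - text_tokens
    return max(remaining // TOKENS_PER_IMAGE, 0)


def fits(num_images, text_tokens=0, context_length=CONTEXT_LENGTH):
    """True if the image tokens plus text tokens stay within the context."""
    return num_images * TOKENS_PER_IMAGE + text_tokens <= context_length


print(max_images())  # 3
print(fits(4))       # False: 4 * 1026 = 4104 > 4096
```

Under these assumptions at most three images fit, and that budget is shared with the instruction text and any tokens the model generates.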

YiFang99 · 2024-07-18T05:01:58Z

Is the number of tokens per image a parameter that users can set, or is it fixed?

JoyBoy-Su · 2024-07-18T07:42:46Z

I'm sorry, it's fixed.

df2046df · 2024-07-18T09:03:32Z

I have another question. When I use the model for batch image understanding, the output is empty.

What could be the reason for this?

JoyBoy-Su added the "question" (Further information is requested) and "inference" (Something about inference) labels on Jul 17, 2024
