Question about the training data #37

Open
Epiphqny opened this issue Aug 9, 2024 · 1 comment

Comments

@Epiphqny

Epiphqny commented Aug 9, 2024

Dear authors,

Thank you for your excellent work. I have a question about your training methodology, specifically how the training data is used. Looking at the code in your GitHub repository, I found this line:

self.tokenized_data.append(torch.tensor(obj['image_tokens'], dtype=torch.long))

From this, it appears that only image tokens are fed into the network. Could you please confirm whether my understanding is correct?
If so, how does the model learn to generate images that correspond to different text inputs?

@irexyc

irexyc commented Aug 29, 2024

It seems the training is built upon the official Chameleon checkpoint.

I think the training doc is very clear: they build the dataset from (text, image_tokens) pairs and train only the output layer that outputs the special image tokens (4, 8196).
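
To make the reply above concrete, here is a minimal pure-Python sketch of what "train only the output layer rows for the image tokens" could look like. It is not the repository's code. It assumes that "(4, 8196)" is a half-open range of token IDs (IDs 4 through 8195), and the names `IMAGE_TOKEN_RANGE`, `trainable_row_mask`, and `masked_sgd_step` are hypothetical, introduced only for illustration.

```python
# Hypothetical sketch: update only the output-layer rows for image tokens.
# Assumption: "(4, 8196)" is a half-open range of token IDs, i.e. 4..8195.
IMAGE_TOKEN_RANGE = (4, 8196)


def trainable_row_mask(vocab_size, token_range=IMAGE_TOKEN_RANGE):
    """Return one bool per vocabulary row: True if that row may be updated."""
    start, stop = token_range
    return [start <= token_id < stop for token_id in range(vocab_size)]


def masked_sgd_step(weights, grads, mask, lr=0.1):
    """Apply plain SGD to rows where mask is True; copy other rows unchanged."""
    updated = []
    for row, grad_row, trainable in zip(weights, grads, mask):
        if trainable:
            updated.append([w - lr * g for w, g in zip(row, grad_row)])
        else:
            updated.append(list(row))
    return updated


# Toy example: an 8-row "output layer" with hidden size 2,
# where only IDs 4 and 5 are treated as image tokens.
toy_mask = trainable_row_mask(vocab_size=8, token_range=(4, 6))
toy_weights = [[1.0, 1.0] for _ in range(8)]
toy_grads = [[0.5, -0.5] for _ in range(8)]
toy_updated = masked_sgd_step(toy_weights, toy_grads, toy_mask, lr=0.1)
```

In this toy run, only rows 4 and 5 change and every other row keeps its original values. Under this reading, the text side of each (text, image_tokens) pair would still be handled by the pretrained checkpoint's frozen weights, which may be why the dataset code stores only image tokens as training targets; that interpretation is an assumption, not something the thread confirms.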
