Conditioning on image + text embedding #7
Comments
I think you can try concatenating the image directly to the video frames along the channel dimension.
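A minimal sketch of that suggestion, using NumPy so the shapes are easy to check. It assumes the video is laid out as (batch, channels, frames, height, width) and the conditioning image as (batch, channels, height, width); the helper name `concat_image_condition` is hypothetical and not part of the repository. The image is repeated across the frame axis and then stacked onto the video channels.

```python
import numpy as np

def concat_image_condition(video, image):
    """Stack a conditioning image onto every frame of a video along the channel axis.

    video: array of shape (batch, channels, frames, height, width)  [assumed layout]
    image: array of shape (batch, image_channels, height, width)    [assumed layout]
    """
    b, c, f, h, w = video.shape
    if image.shape[0] != b or image.shape[2:] != (h, w):
        raise ValueError("image must match the video's batch size and spatial size")
    # Insert a frame axis, then repeat the image once per video frame.
    image_frames = np.repeat(image[:, :, None, :, :], f, axis=2)
    # Concatenate on axis 1, the channel dimension.
    return np.concatenate([video, image_frames], axis=1)

video = np.random.randn(2, 3, 5, 32, 32)
image = np.random.randn(2, 3, 32, 32)
conditioned = concat_image_condition(video, image)
print(conditioned.shape)  # (2, 6, 5, 32, 32)
```

With this approach the denoising network's first layer has to accept the enlarged channel count (6 instead of 3 here). The same operation in PyTorch would use `torch.cat` on dimension 1.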
Thanks @zkx06111, I checked it out, and that makes a lot of sense. Shouldn't it be along the If
@ChintanTrivedi Did you have any success with that?
How do you condition on a custom input (image/GIF + text)? The model should be loaded from the already saved milestones/checkpoints in the "./results/" folder. Thank you.
Looking for pointers to get started on modifying the conditioning code below to include conditioning on an image along with text.
So far I am trying to condition on CLIP embeddings.
However, is there a better way to condition on images in pixel space rather than on latent representations? This might also make it possible to use the model autoregressively, with the last frame of one diffusion sample serving as the input condition for the next sample.
PS: Thanks Phil for the quick implementation of an interesting paper that doesn't have the official code out yet!
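To make the autoregressive idea concrete, here is a rough sketch of the rollout loop only. Everything in it is an assumption for illustration: `rollout` and `dummy_sampler` are hypothetical names, and the sampler is a stand-in for an image-conditioned diffusion sampler that returns a clip of shape (channels, frames, height, width).

```python
import numpy as np

def rollout(sample_fn, first_image, num_chunks):
    """Chain clips by feeding each clip's last frame back in as the next condition.

    sample_fn: hypothetical callable mapping an image (c, h, w) to a clip (c, f, h, w)
    """
    cond = first_image
    chunks = []
    for _ in range(num_chunks):
        clip = sample_fn(cond)   # one sampled clip, conditioned on `cond`
        chunks.append(clip)
        cond = clip[:, -1]       # last frame becomes the next condition
    # Join all clips along the frame axis.
    return np.concatenate(chunks, axis=1)

def dummy_sampler(cond, frames=4):
    # Stand-in for a real sampler: each frame is the condition plus a growing offset.
    return np.stack([cond + i + 1 for i in range(frames)], axis=1)

start = np.zeros((3, 8, 8))
video = rollout(dummy_sampler, start, num_chunks=3)
print(video.shape)  # (3, 12, 8, 8)
```

Because each clip sees only a single previous frame, motion may not stay consistent across clip boundaries; conditioning on several trailing frames instead would be a natural variation.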