Set up another image generator model that is trained on Rivers and Weezer #23

Open
riverscuomo opened this issue Jan 24, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@riverscuomo
Owner

No description provided.

@riverscuomo riverscuomo changed the title from "Set up another image generator that is trained on Rivers and Weezer" to "Set up another image generator model that is trained on Rivers and Weezer" Jan 24, 2024
@riverscuomo riverscuomo added the enhancement New feature or request label Jan 24, 2024
@ArthurEmidio

ArthurEmidio commented Jan 25, 2024

I can give this another try (and hopefully this time also help out with something else in the repository).

Since we last did this, last year, the most interesting options I see now are:

  1. Stable Diffusion XL (SDXL): seems to be the best open-source model right now that does well with faces. Replicate has supported fine-tuning it since August 2023.
  2. DreamBooth: a fine-tuning technique from Google, but Replicate says that SDXL has been consistently giving better results than it.

Aside from that, there's a fine-tuning technique called LoRA, which produces lighter models (faster to train and faster to run). Replicate says that with it we wouldn't have the "warm-up" problem we used to face with the previous model. The disadvantage is that it doesn't do very well on faces, but we could try. See: https://replicate.com/docs/guides/fine-tune-an-image-model
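To make the "lighter models" point concrete, here is a rough sketch of the parameter arithmetic behind LoRA: instead of updating a full weight matrix, it trains two low-rank factors. The layer size and rank below are illustrative assumptions, not the actual dimensions or settings of SDXL or of Replicate's trainer.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 1280, 1280, 8

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full fine-tune: {full:,} params")
print(f"LoRA (rank {rank}): {lora:,} params")
print(f"reduction: {full // lora}x")
```

Fewer trainable parameters means a smaller artifact to store and load, which is likely part of why start-up is faster; whether quality on faces holds up is a separate question.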

Do you feel that the time it took to generate an image annoyed people? I believe quality should still be the top priority, but it would be good to hear your thoughts.

@ArthurEmidio

ArthurEmidio commented Jan 25, 2024

I ran an SDXL fine-tune through Replicate using 9 recent images found on Google. So far it's performing worse than the old model, so I'll need to play with the parameters.

Prompt: A painting of Rivers Cuomo playing the acoustic guitar near the lake

[Image: out-3]

Once we get satisfactory results with you, we can look into how we can introduce multiple people's faces in the same model without degrading the end result.

I'll give this a try over the weekend.
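For anyone who wants to reproduce a run like this, below is a minimal sketch of how an SDXL fine-tune could be started with the `replicate` Python client. It is not the exact setup used for the run above. The zip URL, destination, and version string are hypothetical placeholders, and the input names (`input_images`, `is_lora`, `max_train_steps`, `use_face_detection_instead`) are as I recall them from the SDXL trainer, so check them against the model's training schema before relying on them.

```python
def build_training_request(images_zip_url, destination, sdxl_version,
                           use_lora=False, max_train_steps=1000):
    """Assemble keyword arguments for replicate.trainings.create().

    All values passed in are supplied by the caller; nothing here is
    taken from the actual run discussed in this thread.
    """
    return {
        "version": sdxl_version,        # e.g. "stability-ai/sdxl:<version id>"
        "destination": destination,     # model that will receive the fine-tune
        "input": {
            "input_images": images_zip_url,      # zip of training photos
            "is_lora": use_lora,                 # LoRA vs. full fine-tune
            "max_train_steps": max_train_steps,
            "use_face_detection_instead": True,  # assumed helpful for portraits
        },
    }


def start_training(request):
    """Submit the request. Needs the third-party `replicate` package
    and a REPLICATE_API_TOKEN environment variable."""
    import replicate  # imported lazily so the builder works without it
    return replicate.trainings.create(**request)


# Hypothetical placeholder values; replace before running start_training().
request = build_training_request(
    images_zip_url="https://example.com/training-images.zip",
    destination="your-username/your-model",
    sdxl_version="stability-ai/sdxl:<version id>",
    use_lora=True,
)
print(request["input"])
```

Keeping the request builder separate from the network call makes it easy to compare parameter sets (for example LoRA on and off) before spending a training run on them.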

@riverscuomo
Owner Author

Exactly. The main issue for me was that the model was too slow to start up. Ideally it would be as fast as DALL-E, which I think is what I'm using now.

The image you posted here looks like an improvement on DALL-E.

@ArthurEmidio

Fair. I'll try LoRA to see how much it helps with speed and how it affects output quality.

I'll post updates in this thread over the next 2-3 days.
