Add Stable Diffusion ControlNet #359

Merged
merged 44 commits into from
Apr 8, 2024

joelpaulkoch
Contributor

I want to share my work on using ControlNet with Stable Diffusion. These are the three parts and notes on current limitations:

  1. ControlNet model
  • there is a conditioning scale parameter in the diffusers implementation, at the moment I have a constant of 1 for this. Would you add it as an (optional) input?
  • regarding testing: there is "hf-internal-testing/tiny-controlnet", but this only returned zeros for me (in diffusers and bumblebee), so I kept the test with "lllyasviel/sd-controlnet-scribble".
  2. UNet
  • I added a new model architecture :with_additional_residuals and separate input and core functions. But in the end the only difference is that additional residuals are passed in and added. So alternatively, this could go into the :base architecture as optional inputs and add layers, I guess.
  3. Stable Diffusion with ControlNet
  • Similarly, I've copied the existing stable diffusion implementation and adapted it to support the control net. It might be better to have it in the existing StableDiffusion module.
  • The current implementation accepts a u8 tensor of the correct size as the conditioning image. Preprocessing is converting the tensor to f32. It might make sense to be more lenient, e.g. resize the conditioning image as part of the preprocessing (?)

I've tried all the control nets listed here with the corresponding example and got sensible results for all but the normal map one. I'm not sure what the issue with the normal map is, but I could imagine it's because of the preprocessing, or I simply did not run enough steps.

@jonatanklosko
Member

jonatanklosko commented Mar 6, 2024

Hey @joelpaulkoch, thanks for the PR! I will have a more detailed look later, for now a couple high-level comments :)

there is a conditioning scale parameter in the diffusers implementation, at the moment I have a constant of 1 for this.

Having it as an optional serving input sounds good (similar to how we have :seed, for example).
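To make the conditioning scale concrete, here is a minimal sketch in plain Nx, assuming the ControlNet outputs are a list of down-block residuals plus one mid-block residual, and that the scale multiplies each of them before they reach the UNet (which is how diffusers applies it). The module name, the function name and the `%{down: ..., mid: ...}` map layout are hypothetical and only illustrate the idea; they are not Bumblebee's API.

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule ConditioningScaleSketch do
  # Hypothetical helper (not part of Bumblebee): multiplies every
  # ControlNet residual by the same factor. The default of 1.0 mirrors
  # the constant mentioned in the PR description.
  def scale_residuals(%{down: down, mid: mid}, scale \\ 1.0) do
    %{
      down: Enum.map(down, &Nx.multiply(&1, scale)),
      mid: Nx.multiply(mid, scale)
    }
  end
end

# Dummy residuals standing in for real ControlNet outputs.
residuals = %{
  down: [Nx.broadcast(2.0, {1, 4, 4, 8}), Nx.broadcast(4.0, {1, 2, 2, 16})],
  mid: Nx.broadcast(1.0, {1, 2, 2, 16})
}

# With the default scale the residuals are unchanged.
unscaled = ConditioningScaleSketch.scale_residuals(residuals)

# A scale of 0.5 halves the influence of the conditioning image.
halved = ConditioningScaleSketch.scale_residuals(residuals, 0.5)
```

Treating the scale as an optional value with a default keeps existing callers working, which is the same trade-off as an optional serving input.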

I added a new model architecture :with_additional_residuals and separate input and core functions.

Since the difference is only in the inputs, I would totally just have them as optional inputs in the :base architecture, yeah. This also more closely matches what diffusers does.
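The "optional inputs" idea can be sketched outside of any model definition: when no additional residuals are given the skip connections pass through untouched, otherwise each residual is added to the matching skip connection. This is plain Elixir and Nx with hypothetical names, not the actual UNet code from the PR.

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule OptionalResidualsSketch do
  # Hypothetical helper: `nil` means "no ControlNet", so the UNet's own
  # skip connections are returned as they are.
  def add_residuals(skip_states, nil), do: skip_states

  # Otherwise every additional residual is added element-wise to the
  # skip connection at the same position.
  def add_residuals(skip_states, additional)
      when is_list(additional) and length(skip_states) == length(additional) do
    Enum.zip_with(skip_states, additional, &Nx.add/2)
  end
end

skips = [Nx.tensor([1.0, 2.0]), Nx.tensor([3.0])]
extra = [Nx.tensor([0.5, 0.5]), Nx.tensor([1.0])]

plain = OptionalResidualsSketch.add_residuals(skips, nil)
controlled = OptionalResidualsSketch.add_residuals(skips, extra)
```

Because the only difference is the extra addition, a single code path can serve both the plain and the ControlNet-guided case.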

Similarly, I've copied the existing stable diffusion implementation and adapted it to support the control net. It might be better to have it in the existing StableDiffusion module.

I think a separate module makes sense. In general, I would have one module per diffusion type (SD, SD control net, SD XL, ...) and then a serving function per task (currently only text_to_image, but could be image_to_image, and so on). This would roughly correspond to diffusers, such that a serving function corresponds to a pipeline class and a module to the pipeline grouping directory they have.

Preprocessing is converting the tensor to f32.

I see in diffusers they have VaeImageProcessor, though if in this case it always comes down to converting into f32, then it's probably fine to just be a function.
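As a rough illustration of preprocessing as "just a function", the sketch below checks that the conditioning image is a u8 tensor of the expected size and converts it to f32. The module, the function name and the `{height, width}` argument are assumptions made for this example; whether the real implementation also rescales the values is not stated in the thread, so the sketch does not.

```elixir
Mix.install([{:nx, "~> 0.7"}])

defmodule ConditioningPreprocessSketch do
  # Hypothetical preprocessing step: validates the type and shape of the
  # conditioning image, then converts it to f32.
  def preprocess(image, {height, width}) do
    expected = {height, width, 3}

    cond do
      Nx.type(image) != {:u, 8} ->
        raise ArgumentError, "expected a u8 tensor, got: #{inspect(Nx.type(image))}"

      Nx.shape(image) != expected ->
        raise ArgumentError,
              "expected shape #{inspect(expected)}, got: #{inspect(Nx.shape(image))}"

      true ->
        Nx.as_type(image, :f32)
    end
  end
end

# A tiny all-white image stands in for a real conditioning image.
image = Nx.broadcast(Nx.tensor(255, type: :u8), {4, 4, 3})
processed = ConditioningPreprocessSketch.preprocess(image, {4, 4})
```

A more lenient version could resize the image here instead of raising, as suggested in the PR description.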

Member

@jonatanklosko jonatanklosko left a comment

Thanks for the updates, some more comments :)

lib/bumblebee/diffusion/stable_diffusion/control_net.ex (outdated)
)
|> Nx.tile([256, 8, 3])
|> Nx.pad(0, [{192, 64, 0}, {192, 64, 0}, {0, 0, 0}])
|> Nx.transpose(axes: [1, 0, 2])
Contributor Author

@joelpaulkoch joelpaulkoch Apr 5, 2024

This is an example that controls the stripes of the numbat (but doesn't always work that well). We might want to rewrite this or replace it with an actual scribble image.

Member

Could be a nice exercise to add a notebook that detects edges with Evision and uses it as edges control (like here), for the example here any tensor is fine though, just to give an idea. I also added a comment as a clarification :)
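For readers who want to see what the tile, pad and transpose pipeline from the reviewed snippet produces, here is a self-contained sketch. The starting tensor is an assumption (the beginning of the original expression is not shown in the thread): a `{1, 32, 1}` stripe of 16 white and 16 black pixels, chosen so that the shapes work out to a 512x512 RGB image.

```elixir
Mix.install([{:nx, "~> 0.7"}])

# Assumed starting pattern (not taken from the PR): 16 white pixels
# followed by 16 black ones, shaped {1, 32, 1}.
pattern = List.duplicate([255], 16) ++ List.duplicate([0], 16)
stripe = Nx.tensor([pattern], type: :u8)

conditioning =
  stripe
  # Repeat to a {256, 256, 3} block of vertical stripes.
  |> Nx.tile([256, 8, 3])
  # Surround the block with black so the image becomes {512, 512, 3}.
  |> Nx.pad(0, [{192, 64, 0}, {192, 64, 0}, {0, 0, 0}])
  # Swap height and width, turning the vertical stripes into horizontal ones.
  |> Nx.transpose(axes: [1, 0, 2])

IO.inspect(Nx.shape(conditioning), label: "conditioning shape")
```

As noted in the reply above, any tensor of the right size and type works for the example; this one only gives an idea of how a synthetic conditioning image can be built.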

@jonatanklosko
Member

We could share more logic between the servings, but it's fine for now, we can refactor once there are more :)

@jonatanklosko
Member

Btw. I updated the tests to use tiny checkpoints and generated reference values using hf/diffusers :)

Member

@jonatanklosko jonatanklosko left a comment

@joelpaulkoch it's good to go, thanks for working on this!

@jonatanklosko jonatanklosko merged commit be8e710 into elixir-nx:main Apr 8, 2024
2 checks passed