improve text-to-image task page #889

Merged
merged 32 commits into from
Sep 9, 2024
Changes from 14 commits
Commits
6e0a3bc
little change to task description
linoytsaban Sep 3, 2024
cbb3969
Merge branch 'huggingface:main' into text2img-task
linoytsaban Sep 3, 2024
0858b96
add initial task variants
linoytsaban Sep 3, 2024
23d9d44
Merge branch 'main' into text2img-task
linoytsaban Sep 4, 2024
017ff04
add image editing task
linoytsaban Sep 4, 2024
baa2c37
add personalization
linoytsaban Sep 5, 2024
d9ef465
add links
linoytsaban Sep 5, 2024
6ab90ed
Merge remote-tracking branch 'origin/text2img-task' into text2img-task
linoytsaban Sep 5, 2024
19463e4
Merge branch 'main' into text2img-task
linoytsaban Sep 5, 2024
47801f2
format
linoytsaban Sep 5, 2024
b5f8250
format
linoytsaban Sep 5, 2024
2c8b759
format
linoytsaban Sep 5, 2024
812f8e2
format
linoytsaban Sep 5, 2024
8aa6daa
Merge branch 'main' into text2img-task
linoytsaban Sep 6, 2024
9b792c1
Update packages/tasks/src/tasks/text-to-image/about.md
linoytsaban Sep 6, 2024
39bdbe1
Update packages/tasks/src/tasks/text-to-image/about.md
linoytsaban Sep 6, 2024
6a71578
simplify real image editing
linoytsaban Sep 6, 2024
532c779
Merge remote-tracking branch 'origin/text2img-task' into text2img-task
linoytsaban Sep 6, 2024
8c292c3
format
linoytsaban Sep 6, 2024
e15a303
format
linoytsaban Sep 6, 2024
36a1e3d
Merge branch 'main' into text2img-task
linoytsaban Sep 6, 2024
1f9d0af
Update packages/tasks/src/tasks/text-to-image/data.ts
linoytsaban Sep 7, 2024
dbbfd4f
changes to personalization variant
linoytsaban Sep 7, 2024
8bc84cc
fix images
linoytsaban Sep 7, 2024
4d7fef7
Merge branch 'main' into text2img-task
linoytsaban Sep 7, 2024
9754f20
format
linoytsaban Sep 7, 2024
1a6c542
Merge remote-tracking branch 'origin/text2img-task' into text2img-task
linoytsaban Sep 7, 2024
4be7f12
change to image editing description
linoytsaban Sep 8, 2024
a858026
add figure refs
linoytsaban Sep 8, 2024
144c4ca
format
linoytsaban Sep 8, 2024
43c297c
format
linoytsaban Sep 8, 2024
9c306b6
add back missing title
linoytsaban Sep 8, 2024
23 changes: 20 additions & 3 deletions packages/tasks/src/tasks/text-to-image/about.md
@@ -2,7 +2,7 @@

### Data Generation

Businesses can generate data for their their use cases by inputting text and getting image outputs.
Businesses can generate data for their use cases by inputting text and getting image outputs.

### Immersive Conversational Chatbots

@@ -16,9 +16,23 @@ Different patterns can be generated to obtain unique pieces of fashion. Text-to-

Architects can utilise the models to construct an environment based out on the requirements of the floor plan. This can also include the furniture that has to be placed in that environment.

## Task Variants
##Task Variants

You can contribute variants of this task [here](https://github.com/huggingface/hub-docs/blob/main/tasks/src/text-to-image/about.md).
### Image Editing

Image editing with text-to-image models involves using text prompts to describe the wanted changes in an image and then follow them.

- Synthetic image editing: using text-to-image models to make adjustments to images that were initially created using an input prompt, while preserving the overall meaning or context of the original image.

![Examples](https://datasets-server.huggingface.co/assets/diffusers/diffusers-images-docs/--/b20ecaa3f61372174c854e09fc856fdcce6f8494/--/default/train/0/image/image.png?Expires=1725455983&Signature=ykj3EnAENI6goXc7qI2Toq~P8P5IdS1DqNbSfH8vhgrdwaJoGH2cUbXWRgVAndhrHvRjrTTcU3YOyoExnot7zEhauyUEcqr-evRHDmGgfar52uEmfLbLCtNAcRK9Q85QOifupIH-X9x3rBUM03B0RIkHuto6wwRBAHireqr7QcD8hYRaNzACXrTbt-U7wHosZS8R1pdc3FDt7fDc3Qwh8XL0YoJqAoK8X8JnZEXIWTfGnCpygPBDbseDlYEzegGKzClAUgigQbomUk733VNtB3ol396uYkHCcjqjtgdhtEfAWQz-xM4eAhHpI~YEn7RQqRjB0RD0bPd1nHRU0wGUqA__&Key-Pair-Id=K3EI6M078Z3AC3)

- Real image editing: similar to synthetic image editing, except we're using real photos/images. This task is usually more complex, as it involves first obtaining a latent representation of the image, in the latent domain of the model that it can then manipulate.

![Examples](https://datasets-server.huggingface.co/assets/diffusers/diffusers-images-docs/--/default/train/1/image/image.jpg?Expires=1725453082&Signature=MOCeELTChydgLRZT9ws8owCraSVrdcm6c7Vlnsi23rJ1Ocigl6gjRtXwmjVDCKuG2fB6Hw0Tmn8ZR0M7FPiA2fXpSuPEW4iJMoeQNiNCtkSSjjDisDXbBSRXW1TXJ-Z2c~VoJ4lmmeUdFpyFZ9W~BlI6r2xQLltfU400XKPe~UgE-vJ~xr9ni8zZmyYt1kVtV9Et~EBzWCQkKc2DO9gI9HnEg9z2hxDHp8Bak0HBRARM4ObhRYxieWqO4hOg1HVk4LSt2E8emIuDmhPUU4v8L097yFcI4D6JeoyNNn0q6nKQZqAZIzwP8iiLqqhSv~mJsO7YGnQck1-bzA~gAiVMpg__&Key-Pair-Id=K3EI6M078Z3AC3)

### Personalization

Personalization refers to techniques used to customize text-to-image models, where we introduce new subjects/concepts to the model so that we can then use the model to generate new images of those subjects with a text prompt. For example, one can use these techniques to generate images of themselves, using as little as one reference image. These include teaching the model a new concept both in training free manner or through fine-tuning.
Contributor

Suggested change
Personalization refers to techniques used to customize text-to-image models, where we introduce new subjects/concepts to the model so that we can then use the model to generate new images of those subjects with a text prompt. For example, one can use these techniques to generate images of themselves, using as little as one reference image. These include teaching the model a new concept both in training free manner or through fine-tuning.
Personalization refers to techniques used to customize text-to-image models. In this technique, we introduce new subjects/concepts to the model and then use the model to generate new images of those subjects with a text prompt.
For example, one can use these techniques to generate images of themselves, using as little as one reference image. These include teaching the model a new concept both in a training-free manner or through fine-tuning.

I am actually irritated by personal DreamBooths, to be honest, especially with the recent Korean cases where it's weaponized against women. Can you change the example?

Contributor Author

That's a good point. But I also feel like it's such a major use case that it's a bit odd not to mention it, because mentioning style LoRAs can also be problematic from the artists' angle.

Contributor Author

@linoytsaban linoytsaban Sep 6, 2024

Perhaps mention both briefly?

Member

These are good points. A couple of alternatives:

  • Use "your dog" as an example (I did this in my suggestion). I agree with Linoy that this somewhat sweeps a major use case under the rug.
  • Be open about it and include a comment about using these features responsibly, respectfully, and ethically.


## Inference

@@ -65,11 +79,14 @@ await inference.textToImage({
- [Introducing Würstchen: Fast Diffusion for Image Generation](https://huggingface.co/blog/wuerstchen)
- [Efficient Controllable Generation for SDXL with T2I-Adapters](https://huggingface.co/blog/t2i-sdxl-adapters)
- [Welcome aMUSEd: Efficient Text-to-Image Generation](https://huggingface.co/blog/amused)
- Image Editing Demos: [LEDITS++](https://huggingface.co/spaces/editing-images/leditsplusplus), [Turbo Edit](https://huggingface.co/spaces/turboedit/turbo_edit), [InstructPix2Pix](https://huggingface.co/spaces/timbrooks/instruct-pix2pix), [CosXL](https://huggingface.co/spaces/multimodalart/cosxl)
- Training free Personalization Demos: [Face-to-All](https://huggingface.co/spaces/multimodalart/face-to-all), [InstantStyle](https://huggingface.co/spaces/InstantX/InstantStyle), [RB-modulation](https://huggingface.co/spaces/fffiloni/RB-Modulation), [Photomaker v2](https://huggingface.co/spaces/TencentARC/PhotoMaker-V2)

### Model Fine-tuning

- [Finetune Stable Diffusion Models with DDPO via TRL](https://huggingface.co/blog/pref-tuning)
- [LoRA training scripts of the world, unite!](https://huggingface.co/blog/sdxl_lora_advanced_script)
- [Using LoRA for Efficient Stable Diffusion Fine-Tuning](https://huggingface.co/blog/lora)
- LoRA fine tuning Spaces: [FLUX.1 finetuning](https://huggingface.co/spaces/autotrain-projects/train-flux-lora-ease), [SDXL finetuning](https://huggingface.co/spaces/multimodalart/lora-ease)

This page was made possible thanks to the efforts of [Ishan Dutta](https://huggingface.co/ishandutta), [Enrique Elias Ubaldo](https://huggingface.co/herrius) and [Oğuz Akif](https://huggingface.co/oguzakif).
2 changes: 1 addition & 1 deletion packages/tasks/src/tasks/text-to-image/data.ts
@@ -92,7 +92,7 @@ const taskData: TaskDataCustom = {
},
],
summary:
"Generates images from input text. These models can be used to generate and modify images based on text prompts.",
"Text-to-image is the task of generating images from input text. These pipelines can also be used modify and edit images based on text prompts.",
widgetModels: ["black-forest-labs/FLUX.1-dev"],
youtubeId: "",
};