Migrate documentation of Stable Diffusion and add notebooks (#312)
* init

* fix doc build?

* fix doc build?

* migration

* migrate the doc

* update notebooks

* delete log

* update notebooks

* update notebook readme

* Update docs/source/tutorials/notebooks.mdx

Co-authored-by: David Corvoysier <[email protected]>

* Update docs/source/tutorials/stable_diffusion.mdx

Co-authored-by: David Corvoysier <[email protected]>

* Update docs/source/tutorials/overview.mdx

Co-authored-by: Philipp Schmid <[email protected]>

* Update docs/source/tutorials/stable_diffusion.mdx

Co-authored-by: Philipp Schmid <[email protected]>

* add mem usage

* fix typo

---------

Co-authored-by: JingyaHuang <[email protected]>
Co-authored-by: David Corvoysier <[email protected]>
Co-authored-by: Philipp Schmid <[email protected]>
4 people authored Nov 14, 2023
1 parent d47741f commit b2c6814
Showing 9 changed files with 1,848 additions and 273 deletions.
4 changes: 4 additions & 0 deletions docs/source/_toctree.yml
@@ -10,6 +10,10 @@
title: Overview
- local: tutorials/fine_tune_bert
title: Fine-tune BERT for Text Classification on AWS Trainium
- local: tutorials/stable_diffusion
title: Generate images with Stable Diffusion models on AWS Inferentia
- local: tutorials/notebooks
title: Notebooks
title: Tutorials
- sections:
- local: guides/overview
269 changes: 0 additions & 269 deletions docs/source/guides/models.mdx
@@ -213,274 +213,5 @@ Please be aware that:
- the generation parameters can be stored in a `generation_config.json` file. When such a file is present in the model directory,
it will be parsed to set the default parameters (the values passed to the `generate` method still take precedence).
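
As an illustration, such a file is typically produced with 🤗 Transformers' `GenerationConfig`; a minimal sketch (the directory and parameter values below are arbitrary):

```python
>>> from transformers import GenerationConfig

>>> # Arbitrary example values: any parameter accepted by `generate` can be stored.
>>> gen_config = GenerationConfig(max_new_tokens=128, do_sample=True, temperature=0.7)
>>> gen_config.save_pretrained("my_model_dir/")  # writes my_model_dir/generation_config.json
```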

## Stable Diffusion

Optimum extends 🤗`Diffusers` to support inference on Neuron. To get started, make sure you have installed Diffusers:

```bash
pip install "optimum[neuronx, diffusers]"
```

You can also accelerate Stable Diffusion inference on Neuron devices (inf2 / trn1). There are four components which need to be exported to the `.neuron` format to boost performance:

* Text encoder
* U-Net
* VAE encoder
* VAE decoder
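
Once exported (as shown in the sections below), the saved folder contains one compiled artifact per component. An illustrative layout, assuming the file names used by current versions of Optimum Neuron:

```
sd_neuron/
├── model_index.json
├── text_encoder/
│   └── model.neuron
├── unet/
│   └── model.neuron
├── vae_encoder/
│   └── model.neuron
└── vae_decoder/
    └── model.neuron
```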

### Text-to-Image

The `NeuronStableDiffusionPipeline` class allows you to generate images from a text prompt on Neuron devices, similar to the experience with 🤗 `diffusers`.

As with other tasks, you need to compile the models before running inference. The export can be done either via the CLI or via the `NeuronStableDiffusionPipeline` API. Here is an example of exporting the Stable Diffusion components with `NeuronStableDiffusionPipeline`:

<Tip>

To apply the optimized attention-score computation in the U-Net, set the environment variable with `export NEURON_FUSE_SOFTMAX=1`.

Besides, don't hesitate to tweak the compilation configuration to find the best tradeoff between performance and accuracy for your use case. By default, we suggest casting FP32 matrix multiplication operations to BF16, which offers good performance with a moderate sacrifice in accuracy. Check out the guide from the [AWS Neuron documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/appnotes/neuronx-cc/neuronx-cc-training-mixed-precision.html#neuronx-cc-training-mixed-precision) to better understand the options for your compilation.

</Tip>

```python
>>> from optimum.neuron import NeuronStableDiffusionPipeline

>>> model_id = "runwayml/stable-diffusion-v1-5"
>>> compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
>>> input_shapes = {"batch_size": 1, "height": 512, "width": 512}

>>> stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained(model_id, export=True, **compiler_args, **input_shapes, device_ids=[0, 1])

# Save locally or upload to the HuggingFace Hub
>>> save_directory = "sd_neuron/"
>>> stable_diffusion.save_pretrained(save_directory)
>>> stable_diffusion.push_to_hub(
... save_directory, repository_id="my-neuron-repo", use_auth_token=True
... )
```
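
The same export can also be run from the command line; a minimal sketch, assuming the `optimum-cli export neuron` options below match the ones available in your installed version:

```bash
# Hypothetical invocation: check `optimum-cli export neuron --help` for the exact flags.
optimum-cli export neuron \
  --model runwayml/stable-diffusion-v1-5 \
  --task stable-diffusion \
  --batch_size 1 --height 512 --width 512 \
  --auto_cast matmul --auto_cast_type bf16 \
  sd_neuron/
```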

Now generate an image with a prompt on Neuron:

```python
>>> prompt = "a photo of an astronaut riding a horse on mars"
>>> image = stable_diffusion(prompt).images[0]
```

<img
src="https://raw.githubusercontent.com/huggingface/optimum-neuron/main/docs/assets/guides/models/01-sd-image.png"
width="256"
height="256"
alt="stable diffusion generated image"
/>
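
Once saved, the compiled pipeline can be reloaded from the directory without re-exporting (the SDXL examples below use the same pattern):

```python
>>> from optimum.neuron import NeuronStableDiffusionPipeline

>>> # Reload the precompiled artifacts saved above; no `export=True` needed.
>>> stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained("sd_neuron/")
>>> image = stable_diffusion("a photo of an astronaut riding a horse on mars").images[0]
```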

### Image-to-Image

With the `NeuronStableDiffusionImg2ImgPipeline` class, you can generate a new image conditioned on a text prompt and an initial image.

```python
import requests
from PIL import Image
from io import BytesIO
from optimum.neuron import NeuronStableDiffusionImg2ImgPipeline

model_id = "nitrosocke/Ghibli-Diffusion"
input_shapes = {"batch_size": 1, "height": 512, "width": 512}
pipeline = NeuronStableDiffusionImg2ImgPipeline.from_pretrained(model_id, export=True, **input_shapes, device_ids=[0, 1])
pipeline.save_pretrained("sd_img2img/")

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"

response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((512, 512))

prompt = "ghibli style, a fantasy landscape with snowcapped mountains, trees, lake with detailed reflection. sunlight and cloud in the sky, warm colors, 8K"

image = pipeline(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images[0]
image.save("fantasy_landscape.png")
```

`image` | `prompt` | output |
:-------------------------:|:-------------------------:|-------------------------:|
<img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/03-sd-img2img-init.png" alt="landscape photo" width="256" height="256"/> | ***ghibli style, a fantasy landscape with snowcapped mountains, trees, lake with detailed reflection. warm colors, 8K*** | <img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/04-sd-img2img.png" alt="drawing" width="250"/> |

### Inpaint

With the `NeuronStableDiffusionInpaintPipeline` class, you can edit specific parts of an image by providing a mask and a text prompt.

```python
import requests
from PIL import Image
from io import BytesIO
from optimum.neuron import NeuronStableDiffusionInpaintPipeline

model_id = "runwayml/stable-diffusion-inpainting"
input_shapes = {"batch_size": 1, "height": 512, "width": 512}
pipeline = NeuronStableDiffusionInpaintPipeline.from_pretrained(model_id, export=True, **input_shapes, device_ids=[0, 1])
pipeline.save_pretrained("sd_inpaint/")

def download_image(url):
response = requests.get(url)
return Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
image.save("cat_on_bench.png")
```

`image` | `mask_image` | `prompt` | output |
:-------------------------:|:-------------------------:|:-------------------------:|-------------------------:|
<img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="250"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="250"/> | ***Face of a yellow cat, high resolution, sitting on a park bench*** | <img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/05-sd-inpaint.png" alt="drawing" width="250"/> |

## Stable Diffusion XL

### Text-to-Image

Similar to Stable Diffusion, you can use the `NeuronStableDiffusionXLPipeline` API to export and run inference on Neuron devices with SDXL models.

```python
>>> from optimum.neuron import NeuronStableDiffusionXLPipeline

>>> model_id = "stabilityai/stable-diffusion-xl-base-1.0"
>>> compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
>>> input_shapes = {"batch_size": 1, "height": 1024, "width": 1024}

>>> stable_diffusion_xl = NeuronStableDiffusionXLPipeline.from_pretrained(model_id, export=True, **compiler_args, **input_shapes, device_ids=[0, 1])

# Save locally or upload to the HuggingFace Hub
>>> save_directory = "sd_neuron_xl/"
>>> stable_diffusion_xl.save_pretrained(save_directory)
>>> stable_diffusion_xl.push_to_hub(
... save_directory, repository_id="my-neuron-repo", use_auth_token=True
... )
```

Now generate an image with a text prompt on Neuron:

```python
>>> prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
>>> image = stable_diffusion_xl(prompt).images[0]
```

<img
src="https://raw.githubusercontent.com/huggingface/optimum-neuron/main/docs/assets/guides/models/02-sdxl-image.jpeg"
width="256"
height="256"
alt="sdxl generated image"
/>

### Image-to-Image

With `NeuronStableDiffusionXLImg2ImgPipeline`, you can pass an initial image and a text prompt to condition the generated images:

```python
from optimum.neuron import NeuronStableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

prompt = "a dog running, lake, moat"
url = "https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/sd_xl/castle_friedrich.png"
init_image = load_image(url).convert("RGB")

pipe = NeuronStableDiffusionXLImg2ImgPipeline.from_pretrained("sd_neuron_xl/", device_ids=[0, 1])
image = pipe(prompt=prompt, image=init_image).images[0]
```

`image` | `prompt` | output |
:-------------------------:|:-------------------------:|-------------------------:|
<img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/sd_xl/castle_friedrich.png" alt="castle photo" width="256" height="256"/> | ***a dog running, lake, moat*** | <img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/06-sdxl-img2img.png" alt="castle with dog" width="250"/> |

### Inpaint

With `NeuronStableDiffusionXLInpaintPipeline`, pass the original image and a mask of the area you want to replace; the masked area is then filled with content described by the prompt.

```python
from optimum.neuron import NeuronStableDiffusionXLInpaintPipeline
from diffusers.utils import load_image

img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-text2img.png"
mask_url = (
"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-inpaint-mask.png"
)

init_image = load_image(img_url).convert("RGB")
mask_image = load_image(mask_url).convert("RGB")
prompt = "A deep sea diver floating"

pipe = NeuronStableDiffusionXLInpaintPipeline.from_pretrained("sd_neuron_xl/", device_ids=[0, 1])
image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, strength=0.85, guidance_scale=12.5).images[0]
```

`image` | `mask_image` | `prompt` | output |
:-------------------------:|:-------------------------:|:-------------------------:|-------------------------:|
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-text2img.png" alt="drawing" width="250"/> | <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-inpaint-mask.png" alt="drawing" width="250"/> | ***A deep sea diver floating*** | <img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/07-sdxl-inpaint.png" alt="drawing" width="250"/> |

### Refine Image Quality

SDXL includes a [refiner model](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0) specialized in denoising the low-noise stage images generated by the base model. There are two ways to use the refiner (an export sketch for the refiner itself follows the list):

1. Use the base and refiner models together to produce a refined image.
2. Use the base model to produce an image, then use the refiner model to add more detail to it.
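
Both workflows below load a compiled refiner from `sd_neuron_xl_refiner/`. Here is a minimal sketch of how it could be exported, assuming the refiner checkpoint goes through the same image-to-image pipeline API used above:

```python
>>> from optimum.neuron import NeuronStableDiffusionXLImg2ImgPipeline

>>> # Assumption: the refiner is compiled like the base model, via the Img2Img pipeline.
>>> model_id = "stabilityai/stable-diffusion-xl-refiner-1.0"
>>> compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}
>>> input_shapes = {"batch_size": 1, "height": 1024, "width": 1024}

>>> refiner = NeuronStableDiffusionXLImg2ImgPipeline.from_pretrained(
...     model_id, export=True, **compiler_args, **input_shapes, device_ids=[0, 1]
... )
>>> refiner.save_pretrained("sd_neuron_xl_refiner/")
```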

#### Base + refiner model

```python
from optimum.neuron import NeuronStableDiffusionXLPipeline, NeuronStableDiffusionXLImg2ImgPipeline

prompt = "A majestic lion jumping from a big stone at night"

base = NeuronStableDiffusionXLPipeline.from_pretrained("sd_neuron_xl/", device_ids=[0, 1])
# The base model denoises the first 80% of the steps and returns latents.
image = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images[0]
del base  # To avoid Neuron device OOM

# The refiner takes over at 80% and denoises the remaining steps.
refiner = NeuronStableDiffusionXLImg2ImgPipeline.from_pretrained("sd_neuron_xl_refiner/", device_ids=[0, 1])
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=image,
).images[0]
```

<img
src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/08-sdxl-base-refine.png"
width="256"
height="256"
alt="sdxl base + refiner"
/>

#### Base to refiner model

```python
from optimum.neuron import NeuronStableDiffusionXLPipeline, NeuronStableDiffusionXLImg2ImgPipeline

prompt = "A majestic lion jumping from a big stone at night"
base = NeuronStableDiffusionXLPipeline.from_pretrained("sd_neuron_xl/", device_ids=[0, 1])
image = base(prompt=prompt, output_type="latent").images[0]
del base  # To avoid Neuron device OOM

refiner = NeuronStableDiffusionXLImg2ImgPipeline.from_pretrained("sd_neuron_xl_refiner/", device_ids=[0, 1])
image = refiner(prompt=prompt, image=image[None, :]).images[0]  # Add a batch dimension to the latent
```

`Base Image` | `Refined Image` |
:-------------------------:|-------------------------:|
<img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/09-sdxl-base-full.png" alt="drawing" width="250"/> | <img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/neuron/models/010-sdxl-refiner-detailed.png" alt="drawing" width="250"/> |

<Tip>

To avoid Neuron device out-of-memory errors, it is suggested to finish all base inference and release the device memory before running the refiner.

</Tip>

Happy inference with Neuron! 🚀
3 changes: 2 additions & 1 deletion docs/source/guides/overview.mdx
@@ -17,7 +17,8 @@ limitations under the License.
# Overview

Welcome to the 🤗 Optimum Neuron how-to guides!
These guides tackle more advanced topics and will show you how to easily get the best from HPUs:

These guides tackle more advanced topics and will show you how to easily get the best from AWS Trainium / Inferentia:

- [How to setup AWS Trainium instance](./setup_aws_instance)
- [How to fine-tune a Transformers model with AWS Trainium](./fine_tune)