`ORTDiffusionPipeline`s with IO Binding #2056

IlyasMoutawwakil · 2024-10-13T01:06:33Z

What does this PR do?

This is also my attempt to create a generalizable IO binding framework. The idea is to always have output_shapes = fn(input_shapes, known_shapes), where known_shapes is mostly information found in the config. We then use this information at runtime with a simple symbolic resolver, which keeps shape inference time minimal, to create output tensors in torch and thus accelerate inference without going through ORT values, CuPy, or NumPy.
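To make the output_shapes = fn(input_shapes, known_shapes) idea concrete, here is a minimal sketch of what a symbolic resolver could look like. It is not the code from this PR: the names `resolve_dim` and `infer_output_shapes`, the string syntax for symbolic dimensions, and the VAE-decoder-like example shapes are all assumptions made for illustration.

```python
import ast
import operator
from typing import Dict, Mapping, Sequence, Tuple, Union

Dim = Union[int, str]  # a literal size or a symbolic expression such as "height * 8"

# Only a small, safe subset of integer arithmetic is allowed (hypothetical choice).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.FloorDiv: operator.floordiv}


def resolve_dim(dim: Dim, known: Mapping[str, int]) -> int:
    """Evaluate one symbolic dimension against the known sizes."""
    if isinstance(dim, int):
        return dim

    def _eval(node: ast.AST) -> int:
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name):
            return known[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError(f"unsupported symbolic dimension: {dim!r}")

    return _eval(ast.parse(dim, mode="eval").body)


def infer_output_shapes(
    input_shapes: Mapping[str, Sequence[int]],
    symbolic_inputs: Mapping[str, Sequence[Dim]],
    symbolic_outputs: Mapping[str, Sequence[Dim]],
    known_shapes: Mapping[str, int],
) -> Dict[str, Tuple[int, ...]]:
    """Compute output_shapes = fn(input_shapes, known_shapes)."""
    known = dict(known_shapes)
    # Bind each named input axis to the size observed at runtime.
    # Values already present in known_shapes (e.g. from a config) take precedence.
    for name, symbols in symbolic_inputs.items():
        for symbol, size in zip(symbols, input_shapes[name]):
            if isinstance(symbol, str) and symbol.isidentifier():
                known.setdefault(symbol, size)
    return {
        name: tuple(resolve_dim(d, known) for d in dims)
        for name, dims in symbolic_outputs.items()
    }


# Hypothetical VAE-decoder-like model: latents in, 8x upsampled image out.
shapes = infer_output_shapes(
    input_shapes={"latent_sample": (2, 4, 64, 64)},
    symbolic_inputs={"latent_sample": ("batch_size", "latent_channels", "height", "width")},
    symbolic_outputs={"sample": ("batch_size", "out_channels", "height * 8", "width * 8")},
    known_shapes={"out_channels": 3},  # the kind of value a config would provide
)
print(shapes)  # {'sample': (2, 3, 512, 512)}
# An output buffer could then be allocated directly in torch,
# e.g. torch.empty(shapes["sample"], ...), and bound before running the session.
```

Because the resolver only walks a tiny expression tree, the per-call cost stays small, which is the property the description above is after.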

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

HuggingFaceDocBuilderDev · 2024-10-13T01:25:31Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

tianleiwu · 2024-10-24T23:36:42Z

optimum/onnxruntime/modeling_ort.py

+        if self.use_io_binding is False and provider == "CUDAExecutionProvider":
            self.use_io_binding = True


This overrides the user's use_io_binding choice. What if a user wants to run a performance test with IO binding disabled?

I suggest the following:
if use_io_binding is None: change it to True
if use_io_binding is disabled and the provider is CUDA: log a warning.
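A minimal sketch of the suggested behavior, read literally, might look like the following. The function name `resolve_use_io_binding` and the warning text are hypothetical; only the `use_io_binding` flag and the `"CUDAExecutionProvider"` string come from the snippet above.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_use_io_binding(use_io_binding: Optional[bool], provider: str) -> bool:
    """Hypothetical helper: respect an explicit user choice, default only when unset."""
    if use_io_binding is None:
        # The user expressed no preference, so pick the default.
        return True
    if not use_io_binding and provider == "CUDAExecutionProvider":
        # Keep the user's choice, but point out the likely performance cost.
        logger.warning(
            "IO binding is disabled with CUDAExecutionProvider; "
            "inference may be slower than with use_io_binding=True."
        )
    return use_io_binding
```

With this shape, `use_io_binding=False` stays `False` on CUDA, so a benchmark with IO binding disabled remains possible.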

IlyasMoutawwakil · 2024-10-25

This is already the default behavior in ORTModels. I kept it for consistency (I'm not a fan of it, to be honest) so as not to break things for existing users.

tianleiwu · 2024-10-24T23:50:28Z

optimum/onnxruntime/modeling_diffusion.py

+    def providers(self) -> Tuple[str]:
+        return self._validate_same_attribute_value_across_components("providers")
+
+    @property
+    def provider(self) -> str:
+        return self._validate_same_attribute_value_across_components("provider")
+
+    @property
+    def providers_options(self) -> Dict[str, Dict[str, Any]]:
+        return self._validate_same_attribute_value_across_components("providers_options")
+
+    @property
+    def provider_options(self) -> Dict[str, Any]:
+        return self._validate_same_attribute_value_across_components("provider_options")


It is not necessary to validate that the value is the same across components.

I think it is feasible to use different providers and different provider options for different components. For example, we can run text_encoder on CPU and unet with the CUDA provider, or enable CUDA graph in the provider options of one component but not another.

Maybe add some comments and loosen the constraint later.

IlyasMoutawwakil · 2024-10-25

There's a comment in the definition of _validate_same_attribute_value_across_components explaining the reasoning behind these checks, which is exactly what you said. Pipeline attributes can be accessed, but they only make sense when they're consistent. For now, this is my proposal for pipelines with multiple model parts. An alternative would be to return the attribute of the main component (unet/transformer), or to not support these attributes on the main pipeline at all (replacing them with a provider_map, for example, like device vs. device_map).
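The two options discussed in this thread can be compared with a small toy model. This is only a sketch: `ToyComponent`, `ToyPipeline`, and `_same_value_across_components` are hypothetical stand-ins, raising `ValueError` on a mismatch is an assumption (the actual helper in the PR may behave differently), and `provider_map` is the alternative floated in the reply, not an existing attribute.

```python
from typing import Any, Dict


class ToyComponent:
    """Hypothetical stand-in for one model part (text_encoder, unet, ...)."""

    def __init__(self, provider: str, provider_options: Dict[str, Any]):
        self.provider = provider
        self.provider_options = provider_options


class ToyPipeline:
    """Hypothetical pipeline holding several components by name."""

    def __init__(self, **components: ToyComponent):
        self.components = components

    def _same_value_across_components(self, name: str) -> Any:
        # Compare with != rather than a set, since option dicts are unhashable.
        values = [getattr(c, name) for c in self.components.values()]
        if any(v != values[0] for v in values[1:]):
            raise ValueError(f"'{name}' differs across components; use a per-component map")
        return values[0]

    @property
    def provider(self) -> str:
        # Option 1: a single pipeline-level value, valid only when consistent.
        return self._same_value_across_components("provider")

    @property
    def provider_map(self) -> Dict[str, str]:
        # Option 2: one entry per component, analogous to device vs. device_map.
        return {name: c.provider for name, c in self.components.items()}


uniform = ToyPipeline(
    text_encoder=ToyComponent("CUDAExecutionProvider", {}),
    unet=ToyComponent("CUDAExecutionProvider", {}),
)
mixed = ToyPipeline(
    text_encoder=ToyComponent("CPUExecutionProvider", {}),
    unet=ToyComponent("CUDAExecutionProvider", {"enable_cuda_graph": "1"}),
)
print(uniform.provider)    # CUDAExecutionProvider
print(mixed.provider_map)  # {'text_encoder': 'CPUExecutionProvider', 'unet': 'CUDAExecutionProvider'}
```

In this sketch the mixed pipeline stays usable through `provider_map`, while `mixed.provider` fails loudly instead of returning a value that is only true for some components.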

IlyasMoutawwakil added 2 commits October 13, 2024 02:58

support iobinding using generic shape inference

89bf341

fix for cpu iobinding

507a937

IlyasMoutawwakil added 9 commits October 13, 2024 10:28

fix providers

690da65

add io binding tests on cpu

b55b06f

fix

e956e81

fix

6810e4d

fix

dd8253c

revert

23a02bd

fix

6183a8b

fix

2d386d9

io binding

c1c8d0d

tianleiwu reviewed Oct 24, 2024

IlyasMoutawwakil replied Oct 25, 2024
