
"nodes-to-exclude" in quantization doesnt work #1418

Closed
marziye-A opened this issue Sep 27, 2023 · 4 comments · Fixed by #1451

Comments

@marziye-A

marziye-A commented Sep 27, 2023

Hi,
why doesn't nodes_to_exclude in quantization work? My code is below:

from optimum.onnxruntime.configuration import QuantFormat, QuantizationConfig, QuantizationMode, QuantType

dqconfig = QuantizationConfig(
    is_static=False, format=QuantFormat.QDQ, mode=QuantizationMode.IntegerOps, per_channel=True,
    weights_dtype=QuantType.QUInt8,
    nodes_to_exclude=["Wav2Vec2FeatureEncoder", "Wav2Vec2FeatureProjection", "Wav2Vec2EncoderStableLayerNorm"],
)

Does anyone know the reason?
Any help is really appreciated!

@baskrahmer
Contributor

Hi @marziye-A

I took a look and I think the issue is that the entries in your nodes_to_exclude list are not valid ONNX graph node names but model submodule names. You can run the code below to list all ONNX nodes of a model:

import tempfile
from pathlib import Path

from onnx import load as onnx_load
from optimum.onnxruntime import ORTModelForAudioClassification, ORTQuantizer
from optimum.onnxruntime.configuration import QuantizationConfig, QuantFormat, QuantizationMode, QuantType

qconfig = QuantizationConfig(
    is_static=False,
    format=QuantFormat.QDQ,
    mode=QuantizationMode.IntegerOps,
    per_channel=True,
    weights_dtype=QuantType.QUInt8,
    nodes_to_exclude=["/wav2vec2/feature_extractor/conv_layers.0/conv/Conv"]  # <-- Node from ONNX graph
)

with tempfile.TemporaryDirectory() as tmp_dir:
    output_dir = Path(tmp_dir)
    model = ORTModelForAudioClassification.from_pretrained("hf-internal-testing/tiny-random-wav2vec2", export=True)

    quantizer = ORTQuantizer.from_pretrained(model)
    quantizer.quantize(
        save_dir=output_dir,
        quantization_config=qconfig,
    )
    quantized_model = onnx_load(output_dir.joinpath("model_quantized.onnx"))
    node_list = [node.name for node in quantized_model.graph.node]

    print(node_list)

I don't know what your exact use case is, but if you want to exclude the Wav2Vec2FeatureEncoder part of the graph from quantization, you would have to find the ONNX nodes corresponding to those computations and add them to the nodes_to_exclude argument. You can use a regex to filter those nodes from the full list, or maybe implement something clever to get all nodes of the submodule ;)
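
For example, here is a rough sketch of the regex approach, reusing node_list and the imports from the snippet above. It assumes the feature-encoder nodes all sit under a /wav2vec2/feature_extractor/ name scope, which you should verify against the printed list:

import re

# Hypothetical scope prefix; check it against the node names printed above.
feature_encoder_nodes = [
    name for name in node_list if re.match(r"/wav2vec2/feature_extractor/", name)
]

qconfig_excluding_encoder = QuantizationConfig(
    is_static=False,
    format=QuantFormat.QDQ,
    mode=QuantizationMode.IntegerOps,
    per_channel=True,
    weights_dtype=QuantType.QUInt8,
    nodes_to_exclude=feature_encoder_nodes,  # all matched nodes of the submodule
)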

@fxmarty
Contributor

fxmarty commented Oct 16, 2023

Hi @marziye-A, apologies for the late reply, and thank you @baskrahmer for the correct answer!

I will improve the documentation in this regard.

An alternative that makes it easier to exclude part of a model from quantization may be to use export_modules_as_functions during the export, as suggested in huggingface/transformers#26307 (comment).
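
For reference, here is a rough, untested sketch of what that could look like with a plain torch.onnx.export call. The submodule class, dummy input shape, and opset choice are illustrative assumptions, and this is not the export path Optimum itself uses:

import torch
from transformers import Wav2Vec2Model
from transformers.models.wav2vec2.modeling_wav2vec2 import Wav2Vec2FeatureEncoder

model = Wav2Vec2Model.from_pretrained("hf-internal-testing/tiny-random-wav2vec2")
model.config.return_dict = False  # tuple outputs are simpler to trace
model.eval()

# export_modules_as_functions keeps the listed submodule classes as ONNX local
# functions, so the feature encoder's nodes stay grouped under one scope.
dummy_input = torch.randn(1, 16000)
torch.onnx.export(
    model,
    (dummy_input,),
    "wav2vec2.onnx",
    input_names=["input_values"],
    opset_version=15,  # export_modules_as_functions requires opset >= 15
    export_modules_as_functions={Wav2Vec2FeatureEncoder},
)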

@marziye-A
Author

Thank you very much for your answer.
Is there any limit on the number of nodes that we can exclude?

@baskrahmer
Contributor

Is there any limit on the number of nodes that we can exclude?

As far as I know there is no limit; you can exclude every node in the model.
