update docstring for true_sequential

huggingface · Jul 21, 2023 · 7ac898a
1 parent 89d18d6
Showing 1 changed file with 2 additions and 0 deletions.
diff --git a/optimum/gptq/quantizer.py b/optimum/gptq/quantizer.py
@@ -76,6 +76,8 @@ def __init__(
                 Whether to use symetric quantization.
             true_sequential (`bool`, defaults to `True`):
                 Whether to perform sequential quantization even within a single Transformer block.
+                Instead of quantizing the entire block at once, we perform layer-wise quantization.
+                As a result, each layer undergoes quantization using inputs that have passed through the previously quantized layers.
             pack_sequentially (`bool`, defaults to `True`):
                 Whether to pack the layer just after it is quantized. If False, we will pack the model at the end.
             use_cuda_fp16 (`bool`, defaults to `True`):
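
To make the `true_sequential` wording concrete, here is a minimal toy sketch of the idea the docstring describes. It is not the `optimum` implementation: scalar "layers" and round-to-grid quantization stand in for real linear layers and the GPTQ solver, and `quantize_weight` and `quantize_block` are hypothetical names used only for illustration. The sketch only tracks which calibration input each layer would see in the two modes.

```python
# Toy illustration only: scalar layers y = w * x and round-to-grid quantization.
# All function names are hypothetical; none of this is the optimum API.

def quantize_weight(w, step=0.5):
    """Round a weight to the nearest multiple of `step`."""
    return round(w / step) * step


def quantize_block(weights, calib_input, true_sequential=True):
    """Quantize a chain of scalar layers inside one "block".

    Returns (quantized weights, calibration input seen by each layer).
    """
    quantized, seen_inputs = [], []
    x_fp = calib_input  # activation propagated through the original layers
    x_q = calib_input   # activation propagated through already-quantized layers
    for w in weights:
        # Sequential mode: calibrate on the output of the quantized predecessors.
        # Otherwise: calibrate on the output of the unquantized predecessors.
        seen_inputs.append(x_q if true_sequential else x_fp)
        w_q = quantize_weight(w)
        quantized.append(w_q)
        x_fp = w * x_fp
        x_q = w_q * x_q
    return quantized, seen_inputs


if __name__ == "__main__":
    print(quantize_block([1.2, 0.8], 1.0, true_sequential=True))
    print(quantize_block([1.2, 0.8], 1.0, true_sequential=False))
```

In this toy the quantized weights are identical in both modes because the rounding ignores the inputs. In an input-aware method such as GPTQ, the calibration inputs influence the resulting weights, which is why the choice of inputs matters.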