diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index e7ed353ffa..7d62f00720 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -126,9 +126,9 @@
       title: BetterTransformer
     isExpanded: false
   - sections:
-    - local: optimization_toolbox/usage_guides/quantization
+    - local: llm_quantization/usage_guides/quantization
       title: GPTQ quantization
-    title: Optimization toolbox
+    title: LLM quantization
     isExpanded: false
   - sections:
     - local: utils/dummy_input_generators
diff --git a/docs/source/concept_guides/quantization.mdx b/docs/source/concept_guides/quantization.mdx
index f751e9d47a..b9aca25ee9 100644
--- a/docs/source/concept_guides/quantization.mdx
+++ b/docs/source/concept_guides/quantization.mdx
@@ -185,6 +185,7 @@
 models while respecting accuracy and latency constraints.
 [PyTorch quantization functions](https://pytorch.org/docs/stable/quantization-support.html#torch-quantization-quantize-fx)
 to allow graph-mode quantization of 🤗 Transformers models in PyTorch. This is a lower-level API compared to the two
 mentioned above, giving more flexibility, but requiring more work on your end.
+- The `optimum.llm_quantization` package allows you to [quantize and run LLMs](https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization)
 
 ## Going further: How do machines represent numbers?
diff --git a/docs/source/optimization_toolbox/usage_guides/quantization.mdx b/docs/source/llm_quantization/usage_guides/quantization.mdx
similarity index 100%
rename from docs/source/optimization_toolbox/usage_guides/quantization.mdx
rename to docs/source/llm_quantization/usage_guides/quantization.mdx
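The concept guide touched above describes affine quantization (mapping real-valued tensors to low-bit integers via a scale and zero-point), which is the scheme GPTQ and most LLM quantization methods build on. The following is a minimal pure-Python sketch of that mapping for illustration only; it is not part of the `optimum` API, and the function names here (`quantize`, `dequantize`) are hypothetical.

```python
# Illustrative affine (asymmetric) quantization: floats are mapped to
# unsigned integers via a scale and zero-point, then mapped back with
# a bounded round-trip error. Not optimum's API; a sketch of the concept.

def quantize(values, num_bits=8):
    """Map floats to [0, 2**num_bits - 1] integers plus (scale, zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep 0.0 exactly representable
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

vals = [-1.0, -0.5, 0.0, 0.5, 2.0]
q, scale, zp = quantize(vals)
restored = dequantize(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(vals, restored))
```

Because 0.0 is forced into the representable range, it maps exactly to the zero-point, and any in-range value round-trips with error at most half the scale.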