diff --git a/docs/how-to/debugging.rst b/docs/how-to/debugging.rst
index c90f7ec7d8..9fd6caa5d6 100644
--- a/docs/how-to/debugging.rst
+++ b/docs/how-to/debugging.rst
@@ -2,6 +2,8 @@
:description: How to debug using HIP.
:keywords: AMD, ROCm, HIP, debugging, ltrace, ROCgdb, WinGDB
+.. _debugging_with_hip:
+
*************************************************************************
Debugging with HIP
*************************************************************************
@@ -272,102 +274,7 @@ HIP environment variable summary
Here are some of the more commonly used environment variables:
-..
-
-.. # COMMENT: The following lines define a break for use in the table below.
-.. |break| raw:: html
-
-
-
-..
-
-.. list-table::
-
- * - **Environment variable**
- - **Default value**
- - **Usage**
-
- * - AMD_LOG_LEVEL
- |break| Enable HIP log on different Level
- - 0
- - 0: Disable log.
- |break| 1: Enable log on error level
- |break| 2: Enable log on warning and below levels
- |break| 0x3: Enable log on information and below levels
- |break| 0x4: Decode and display AQL packets
-
- * - AMD_LOG_MASK
- |break| Enable HIP log on different Level
- - 0x7FFFFFFF
- - 0x1: Log API calls
- |break| 0x02: Kernel and Copy Commands and Barriers
- |break| 0x4: Synchronization and waiting for commands to finish
- |break| 0x8: Enable log on information and below levels
- |break| 0x20: Queue commands and queue contents
- |break| 0x40: Signal creation, allocation, pool
- |break| 0x80: Locks and thread-safety code
- |break| 0x100: Copy debug
- |break| 0x200: Detailed copy debug
- |break| 0x400: Resource allocation, performance-impacting events
- |break| 0x800: Initialization and shutdown
- |break| 0x1000: Misc debug, not yet classified
- |break| 0x2000: Show raw bytes of AQL packet
- |break| 0x4000: Show code creation debug
- |break| 0x8000: More detailed command info, including barrier commands
- |break| 0x10000: Log message location
- |break| 0xFFFFFFFF: Log always even mask flag is zero
-
- * - HIP_LAUNCH_BLOCKING
- |break| Used for serialization on kernel execution.
- - 0
- - 0: Disable. Kernel executes normally.
- |break| 1: Enable. Serializes kernel enqueue, behaves the same as AMD_SERIALIZE_KERNEL.
-
- * - HIP_VISIBLE_DEVICES (or CUDA_VISIBLE_DEVICES)
- |break| Only devices whose index is present in the sequence are visible to HIP
- -
- - 0,1,2: Depending on the number of devices on the system
-
- * - GPU_DUMP_CODE_OBJECT
- |break| Dump code object
- - 0
- - 0: Disable
- |break| 1: Enable
-
- * - AMD_SERIALIZE_KERNEL
- |break| Serialize kernel enqueue
- - 0
- - 1: Wait for completion before enqueue
- |break| 2: Wait for completion after enqueue
- |break| 3: Both
-
- * - AMD_SERIALIZE_COPY
- |break| Serialize copies
- - 0
- - 1: Wait for completion before enqueue
- |break| 2: Wait for completion after enqueue
- |break| 3: Both
-
- * - HIP_HOST_COHERENT
- |break| Coherent memory in hipHostMalloc
- - 0
- - 0: memory is not coherent between host and GPU
- |break| 1: memory is coherent with host
-
- * - AMD_DIRECT_DISPATCH
- |break| Enable direct kernel dispatch (Currently for Linux; under development for Windows)
- - 1
- - 0: Disable
- |break| 1: Enable
-
- * - GPU_MAX_HW_QUEUES
- |break| The maximum number of hardware queues allocated per device
- - 4
- - The variable controls how many independent hardware queues HIP runtime can create per process,
- per device. If an application allocates more HIP streams than this number, then HIP runtime reuses
- the same hardware queues for the new streams in a round-robin manner. Note that this maximum
- number does not apply to hardware queues that are created for CU-masked HIP streams, or
- cooperative queues for HIP Cooperative Groups (single queue per device).
+.. include:: ../how-to/debugging_env.rst
General debugging tips
======================================================
diff --git a/docs/how-to/debugging_env.rst b/docs/how-to/debugging_env.rst
new file mode 100644
index 0000000000..deb2510a1f
--- /dev/null
+++ b/docs/how-to/debugging_env.rst
@@ -0,0 +1,88 @@
+.. list-table::
+ :header-rows: 1
+
+ * - **Environment variable**
+ - **Default value**
+ - **Usage**
+
+ * - | ``AMD_LOG_LEVEL``
+ | Enable HIP logging at different levels
+ - 0
+ - | 0: Disable log
+ | 1: Enable log on error level
+ | 2: Enable log on warning and below levels
+ | 3: Enable log on information and below levels
+ | 4: Decode and display AQL packets
+
+ * - | ``AMD_LOG_MASK``
+ | Select which HIP log categories are enabled (bit mask)
+ - 0x7FFFFFFF
+ - | 0x1: Log API calls
+ | 0x2: Kernel and copy commands and barriers
+ | 0x4: Synchronization and waiting for commands to finish
+ | 0x8: Enable log on information and below levels
+ | 0x20: Queue commands and queue contents
+ | 0x40: Signal creation, allocation, pool
+ | 0x80: Locks and thread-safety code
+ | 0x100: Copy debug
+ | 0x200: Detailed copy debug
+ | 0x400: Resource allocation, performance-impacting events
+ | 0x800: Initialization and shutdown
+ | 0x1000: Misc debug, not yet classified
+ | 0x2000: Show raw bytes of AQL packet
+ | 0x4000: Show code creation debug
+ | 0x8000: More detailed command info, including barrier commands
+ | 0x10000: Log message location
+ | 0xFFFFFFFF: Always log, even if the mask flag is zero
+
+ * - | ``HIP_LAUNCH_BLOCKING``
+ | Serializes kernel execution.
+ - 0
+ - | 0: Disable. Kernel executes normally.
+ | 1: Enable. Serializes kernel enqueue; behaves the same as ``AMD_SERIALIZE_KERNEL``.
+
+ * - | ``HIP_VISIBLE_DEVICES`` (or ``CUDA_VISIBLE_DEVICES``)
+ | Only devices whose index is present in the sequence are visible to HIP
+ -
+ - 0,1,2: Depending on the number of devices on the system
+
+ * - | ``GPU_DUMP_CODE_OBJECT``
+ | Dump code object
+ - 0
+ - | 0: Disable
+ | 1: Enable
+
+ * - | ``AMD_SERIALIZE_KERNEL``
+ | Serialize kernel enqueue
+ - 0
+ - | 1: Wait for completion before enqueue
+ | 2: Wait for completion after enqueue
+ | 3: Both
+
+ * - | ``AMD_SERIALIZE_COPY``
+ | Serialize copies
+ - 0
+ - | 1: Wait for completion before enqueue
+ | 2: Wait for completion after enqueue
+ | 3: Both
+
+ * - | ``HIP_HOST_COHERENT``
+ | Coherent memory in ``hipHostMalloc``
+ - 0
+ - | 0: Memory is not coherent between host and GPU
+ | 1: Memory is coherent with host
+
+ * - | ``AMD_DIRECT_DISPATCH``
+ | Enable direct kernel dispatch (Currently for Linux; under development for Windows)
+ - 1
+ - | 0: Disable
+ | 1: Enable
+
+ * - | ``GPU_MAX_HW_QUEUES``
+ | The maximum number of hardware queues allocated per device
+ - 4
+ - This variable controls how many independent hardware queues the HIP runtime can create per
+ process, per device. If an application allocates more HIP streams than this number, the HIP
+ runtime reuses the same hardware queues for the new streams in a round-robin manner. This
+ maximum does not apply to hardware queues created for CU-masked HIP streams, or to
+ cooperative queues for HIP Cooperative Groups (single queue per device).
\ No newline at end of file
diff --git a/docs/index.md b/docs/index.md
index 094f29758c..fc27ede88f 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -48,6 +48,7 @@ The CUDA enabled NVIDIA GPUs are supported by HIP. For more information, see [GP
* {doc}`/doxygen/html/index`
* [C++ language extensions](./reference/kernel_language)
+* [HIP environment variables](./reference/env_variables)
* [Comparing Syntax for different APIs](./reference/terms)
* [HSA Runtime API for ROCm](./reference/virtual_rocr)
* [List of deprecated APIs](./reference/deprecated_api_list)
diff --git a/docs/reference/env_variables.rst b/docs/reference/env_variables.rst
new file mode 100644
index 0000000000..7149d251bd
--- /dev/null
+++ b/docs/reference/env_variables.rst
@@ -0,0 +1,154 @@
+.. meta::
+ :description: HIP environment variables reference
+ :keywords: AMD, HIP, environment variables, environment, reference
+
+*************************************************************
+HIP environment variables
+*************************************************************
+
+This page lists the most important HIP environment variables.
+For the full collection of environment variables, see the
+:doc:`ROCm environment variables page`.
+
+GPU isolation
+=============
+
+The following table lists the GPU isolation environment variables in HIP.
+For details on how to use these variables, see the :doc:`GPU isolation page `.
+
+.. list-table::
+ :header-rows: 1
+
+ * - **Environment variable**
+ - **Example value**
+
+ * - | ``ROCR_VISIBLE_DEVICES``
+ | A list of device indices or UUIDs that will be exposed to applications.
+ - ``0,GPU-DEADBEEFDEADBEEF``
+
+ * - | ``GPU_DEVICE_ORDINAL``
+ | Device indices exposed to OpenCL and HIP applications.
+ - ``0,2``
+
+ * - | ``HIP_VISIBLE_DEVICES`` or ``CUDA_VISIBLE_DEVICES``
+ | Device indices exposed to HIP applications.
+ - ``0,2``
+
+Profiling environment variables
+===============================
+
+The following table lists the profiling environment variables in HIP. For details
+on how to use these variables, see the :doc:`Setting the number of CUs page `.
+
+.. list-table::
+ :header-rows: 1
+
+ * - **Environment variable**
+ - **Example value**
+
+ * - | ``HSA_CU_MASK``
+ | Sets the mask at a lower level of queue creation in the driver.
+ | This mask is also set for queues being profiled.
+ -
+
+ * - | ``ROC_GLOBAL_CU_MASK``
+ | Sets the mask on queues created by the HIP or the OpenCL runtimes.
+ | This mask is also set for queues being profiled.
+ -
+
+ * - | ``ROCR_VISIBLE_DEVICES``
+ | A list of device indices or UUIDs that will be exposed to applications.
+ - ``0,GPU-DEADBEEFDEADBEEF``
+
+Debug environment variables
+===========================
+
+The following table lists the debugging environment variables in HIP. For
+details on how to use these variables, see :ref:`debugging_with_hip`.
+
+.. include:: ../how-to/debugging_env.rst
+
+Memory management related environment variables
+===============================================
+
+The following table lists the memory management related environment variables
+in HIP.
+
+.. list-table::
+ :widths: 70,15,15
+ :header-rows: 1
+
+ * - Environment variable
+ - Variable type
+ - Default value
+
+ * - | ``HIP_HIDDEN_FREE_MEM``
+ | Reserve free memory reporting in MB; 0 disables it
+ - ``uint``
+ - 0
+
+ * - | ``HIP_HOST_COHERENT``
+ | Coherent memory in ``hipHostMalloc``
+ - ``uint``
+ - 0
+
+ * - | ``HIP_INITIAL_DM_SIZE``
+ | Set initial heap size for device malloc. The default value corresponds to 8 MiB
+ - ``size_t``
+ - 8388608
+
+ * - | ``HIP_MEM_POOL_SUPPORT``
+ | Enables memory pool support in HIP
+ - ``bool``
+ - ``false``
+
+ * - | ``HIP_MEM_POOL_USE_VM``
+ | Enables memory pool support in HIP
+ - ``bool``
+ - | ``true`` on Windows,
+ | ``false`` on other OS
+
+ * - | ``HIP_VMEM_MANAGE_SUPPORT``
+ | Virtual Memory Management Support
+ - ``bool``
+ - ``true``
+
+ * - | ``GPU_MAX_HEAP_SIZE``
+ | Set maximum size of the GPU heap to % of board memory
+ - ``uint``
+ - 100
+
+ * - | ``GPU_MAX_REMOTE_MEM_SIZE``
+ | Maximum size, in Ki, that allows device memory substitution with system memory
+ - ``uint``
+ - 2
+
+ * - | ``GPU_NUM_MEM_DEPENDENCY``
+ | Number of memory objects for dependency tracking
+ - ``size_t``
+ - 256
+
+ * - | ``GPU_STREAMOPS_CP_WAIT``
+ | Force the stream wait memory operation to wait on CP.
+ - ``bool``
+ - ``false``
+
+ * - | ``HSA_LOCAL_MEMORY_ENABLE``
+ | Enable HSA device local memory usage
+ - ``bool``
+ - ``true``
+
+ * - | ``PAL_ALWAYS_RESIDENT``
+ | Force memory resources to become resident at allocation time
+ - ``bool``
+ - ``false``
+
+ * - | ``PAL_PREPINNED_MEMORY_SIZE``
+ | Size in KBytes of prepinned memory
+ - ``size_t``
+ - 64
+
+ * - | ``REMOTE_ALLOC``
+ | Use remote memory for the global heap allocation
+ - ``bool``
+ - ``false``
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
index 17af3731fc..7acbe69519 100644
--- a/docs/sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -35,6 +35,7 @@ subtrees:
- file: doxygen/html/index
- file: reference/kernel_language
title: C++ language extensions
+ - file: reference/env_variables
- file: reference/terms
title: Comparing Syntax for different APIs
- file: reference/virtual_rocr