[util] remove pygloo support #46590

Draft: wants to merge 6 commits into base: master

Changes from all commits

32 changes: 2 additions & 30 deletions doc/source/ray-more-libs/ray-collective.rst
@@ -18,7 +18,7 @@ Ray collective communication library

* enables 10x more efficient out-of-band collective communication between Ray actor and task processes,
* operates on both distributed CPUs and GPUs,
* uses NCCL and GLOO as the optional high-performance communication backends,
* uses NCCL as an optional high-performance communication backend,
* is suitable for distributed ML programs on Ray.

Collective Primitives Support Matrix
@@ -30,68 +30,42 @@ See below the current support matrix for all collective calls with different backends.
:header-rows: 1

* - Backend
- `gloo <https://github.com/ray-project/pygloo>`_
-
- `nccl <https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/index.html>`_
-
* - Device
- CPU
- GPU
- CPU
- GPU
* - send
- ✔
- ✘
- ✘
- ✔
* - recv
- ✔
- ✘
- ✘
- ✔
* - broadcast
- ✔
- ✘
- ✘
- ✔
* - allreduce
- ✔
- ✘
- ✘
- ✔
* - reduce
- ✔
- ✘
- ✘
- ✔
* - allgather
- ✔
- ✘
- ✘
- ✔
* - gather
- ✘
- ✘
- ✘
- ✘
* - scatter
- ✘
- ✘
- ✘
- ✘
* - reduce_scatter
- ✔
- ✘
- ✘
- ✔
* - all-to-all
- ✘
- ✘
- ✘
- ✘
* - barrier
- ✔
- ✘
- ✘
- ✔
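
As an illustration of the NCCL columns above (a sketch, not part of this diff), these primitives are typically driven from GPU actors. The example assumes the public ``ray.util.collective`` helpers ``init_collective_group`` and ``allreduce``; the ``Worker`` actor and buffer shape are made up for the example.

.. code-block:: python

    import cupy as cp
    import ray
    import ray.util.collective as col


    @ray.remote(num_gpus=1)
    class Worker:
        def __init__(self):
            # One GPU buffer per worker; shape and values are arbitrary for the sketch.
            self.buffer = cp.ones((10,), dtype=cp.float32)

        def setup(self, world_size, rank):
            # Join the default collective group using the NCCL backend.
            col.init_collective_group(world_size, rank, backend="nccl", group_name="default")

        def do_allreduce(self):
            # In-place sum across all workers in the group.
            col.allreduce(self.buffer, group_name="default")
            return self.buffer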

@@ -110,12 +84,10 @@ Usage
Installation and Importing
^^^^^^^^^^^^^^^^^^^^^^^^^^

Ray collective library is bundled with the released Ray wheel. Besides Ray, users need to install either `pygloo <https://github.com/ray-project/pygloo>`_
or `cupy <https://docs.cupy.dev/en/stable/install.html>`_ in order to use collective communication with the GLOO and NCCL backend, respectively.
The Ray collective library is bundled with the released Ray wheel. Besides Ray, users also need to install `cupy <https://docs.cupy.dev/en/stable/install.html>`_ to use collective communication with the NCCL backend.

.. code-block:: bash

pip install pygloo
pip install cupy-cudaxxx # replace xxx with the right cuda version in your environment
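# for example, with a CUDA 12 toolkit this would typically be (illustrative; check your local CUDA version):
pip install cupy-cuda12x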

To use these APIs, import the collective package in your actor/task or driver code via:
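The import below is a sketch; the ``col`` alias is only a common convention.

.. code-block:: python

    import ray.util.collective as col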
20 changes: 3 additions & 17 deletions python/ray/util/collective/collective.py
@@ -9,7 +9,6 @@
from ray.util.collective import types

_NCCL_AVAILABLE = True
_GLOO_AVAILABLE = True

logger = logging.getLogger(__name__)

@@ -23,18 +22,14 @@
"https://docs.cupy.dev/en/stable/install.html."
)

try:
from ray.util.collective.collective_group.gloo_collective_group import GLOOGroup
except ImportError:
_GLOO_AVAILABLE = False


def nccl_available():
return _NCCL_AVAILABLE


def gloo_available():
return _GLOO_AVAILABLE
# pygloo no longer publishes releases for Python 3.9+.
return False


class GroupManager(object):
@@ -59,16 +54,7 @@ def create_collective_group(self, backend, world_size, rank, group_name):
if backend == types.Backend.MPI:
raise RuntimeError("Ray does not support MPI.")
elif backend == types.Backend.GLOO:
logger.debug("Creating GLOO group: '{}'...".format(group_name))
g = GLOOGroup(
world_size,
rank,
group_name,
store_type="ray_internal_kv",
device_type="tcp",
)
self._name_group_map[group_name] = g
self._group_name_map[g] = group_name
raise RuntimeError("Ray does not support pygloo any more.")
elif backend == types.Backend.NCCL:
logger.debug("Creating NCCL group: '{}'...".format(group_name))
g = NCCLGroup(world_size, rank, group_name)
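
# Illustrative sketch (not part of this diff): after this change, requesting the GLOO
# backend fails fast while NCCL keeps working. The calls below assume the public helper
# ray.util.collective.init_collective_group.
#
#   import ray.util.collective as col
#
#   col.init_collective_group(world_size=2, rank=0, backend="nccl")  # OK (requires cupy)
#   col.init_collective_group(world_size=2, rank=0, backend="gloo")  # raises RuntimeError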