Add DeepSpeed easyblock #3450

Open
wants to merge 2 commits into
base: develop
61 changes: 61 additions & 0 deletions easybuild/easyblocks/d/deepspeed.py
@@ -0,0 +1,61 @@
from easybuild.easyblocks.generic.pythonpackage import PythonPackage
Contributor

Please include the comment block at the top, with you listed as the author.

from easybuild.tools.build_log import EasyBuildError
from easybuild.tools.config import build_option
import easybuild.tools.environment as env


class EB_DeepSpeed(PythonPackage):
    """Custom easyblock for DeepSpeed"""

    @staticmethod
    def extra_options():
        """Change some defaults for easyconfig parameters."""
        extra_vars = PythonPackage.extra_options()
        extra_vars['use_pip'][0] = True
        extra_vars['download_dep_fail'][0] = True
        extra_vars['sanity_pip_check'][0] = True
        return extra_vars

    def __init__(self, *args, **kwargs):
        """Initialize DeepSpeed easyblock."""
        super().__init__(*args, **kwargs)

        dep_names = set(dep['name'] for dep in self.cfg.dependencies())

        # require that PyTorch is listed as dependency
        if 'PyTorch' not in dep_names:
            raise EasyBuildError('PyTorch not found as a dependency')

Check failure on line 27 in easybuild/easyblocks/d/deepspeed.py

GitHub Actions / build (3.10, Lmod-7.8.22, Lua)

PyTorch not found as a dependency

The same check failure is reported for each of the following build jobs (Python version: modules tool, module syntax):

3.6: Lmod-6.6.3, Lmod-7.8.22, Lmod-8.1.14 (Lua and Tcl); modules-3.2.10, modules-4.1.4, modules-tcl-1.147 (Tcl)
3.7, 3.8, 3.9, 3.10, 3.11: Lmod-6.6.3, Lmod-7.8.22, Lmod-8.1.14 (Lua and Tcl)
Contributor

I think we shouldn't error inside the init method, maybe move this to the build steps instead? The test suite doesn't like this at all.
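
One possible shape for that change, as a minimal sketch rather than the PR's actual fix: the constructor only records which dependencies are present, and the PyTorch check runs when the configure step is reached. `FakePythonPackage` and `DeepSpeedSketch` are hypothetical stand-ins, and the local `EasyBuildError` only mimics the real exception class, so the example runs without EasyBuild installed. Choosing `configure_step` as the place for the check is also an assumption; any step that only runs during an actual build would serve the same purpose.

```python
class EasyBuildError(Exception):
    """Stand-in for easybuild.tools.build_log.EasyBuildError (assumption)."""


class FakePythonPackage:
    """Hypothetical stand-in for PythonPackage, reduced to what the sketch needs."""

    def __init__(self, dependencies):
        # the real easyblock obtains this list via self.cfg.dependencies()
        self.dependencies = dependencies

    def configure_step(self):
        return 'configured'


class DeepSpeedSketch(FakePythonPackage):
    """Sketch of an easyblock whose constructor never raises."""

    def __init__(self, dependencies):
        super().__init__(dependencies)
        # only record facts here; no validation in the constructor
        self.dep_names = {dep['name'] for dep in self.dependencies}
        self.with_cuda = 'CUDA' in self.dep_names

    def configure_step(self):
        # the dependency check is deferred until a build is attempted
        if 'PyTorch' not in self.dep_names:
            raise EasyBuildError('PyTorch not found as a dependency')
        return super().configure_step()


# instantiating without PyTorch works; only configuring fails
block = DeepSpeedSketch([{'name': 'Python'}])
try:
    block.configure_step()
except EasyBuildError as err:
    print('configure failed: %s' % err)
```

With this layout, tooling that merely instantiates every easyblock can still construct the class when no easyconfig (and therefore no dependency list containing PyTorch) is involved.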


        # enable building with GPU support if CUDA is included as dependency
        if 'CUDA' in dep_names:
            self.with_cuda = True
        else:
            self.with_cuda = False

    @property
    def cuda_compute_capabilities(self):
        return self.cfg['cuda_compute_capabilities'] or build_option('cuda_compute_capabilities')

    def configure_step(self):
        """Set up DeepSpeed config"""

        if self.with_cuda:
            # https://github.com/microsoft/DeepSpeed/issues/3358
            env.setvar('NVCC_PREPEND_FLAGS', '--forward-unknown-opts')

        if self.cuda_compute_capabilities:
            # specify CUDA compute capabilities via $TORCH_CUDA_ARCH_LIST
            env.setvar('TORCH_CUDA_ARCH_LIST', ';'.join(self.cuda_compute_capabilities))

        # By default prebuild all ops with a few exceptions
        # http://www.deepspeed.ai/tutorials/advanced-install/#pre-install-deepspeed-ops
        # > DeepSpeed will only install any ops that are compatible with your machine
        env.setvar('DS_BUILD_OPS', '1')

        # These have bothersome dependencies
        env.setvar('DS_BUILD_SPARSE_ATTN', '0')  # requires PyTorch<2.0, triton==1.0.0
        env.setvar('DS_BUILD_EVOFORMER_ATTN', '0')  # requires PyTorch<2.0, triton==1.0.0
        env.setvar('DS_BUILD_CUTLASS_OPS', '0')  # requires dskernels
        env.setvar('DS_BUILD_RAGGED_DEVICE_OPS', '0')  # requires dskernels

        super().configure_step()
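
For reference, the environment this configure step prepares can be summarized as a plain function. This is a sketch under stated assumptions, not part of the PR: `resolve_cuda_cc` and `deepspeed_build_env` are hypothetical helper names, the result is returned as a dict instead of being exported with `env.setvar`, and the global switch is spelled `DS_BUILD_OPS`, the name used in the DeepSpeed advanced-install documentation linked in the code comments.

```python
def resolve_cuda_cc(easyconfig_value, build_option_value):
    """Prefer the easyconfig setting; fall back to the global build option."""
    return easyconfig_value or build_option_value


def deepspeed_build_env(with_cuda, cuda_cc=None):
    """Return the environment variables the configure step would export."""
    build_env = {}

    if with_cuda:
        # workaround referenced in microsoft/DeepSpeed issue 3358
        build_env['NVCC_PREPEND_FLAGS'] = '--forward-unknown-opts'

    if cuda_cc:
        # TORCH_CUDA_ARCH_LIST takes a semicolon-separated list
        build_env['TORCH_CUDA_ARCH_LIST'] = ';'.join(cuda_cc)

    # prebuild all ops, then switch off the ones with awkward dependencies
    build_env['DS_BUILD_OPS'] = '1'
    for op in ('SPARSE_ATTN', 'EVOFORMER_ATTN', 'CUTLASS_OPS', 'RAGGED_DEVICE_OPS'):
        build_env['DS_BUILD_' + op] = '0'

    return build_env


# easyconfig leaves the capabilities unset, so the build option is used
cuda_cc = resolve_cuda_cc(None, ['7.0', '8.0'])
build_env = deepspeed_build_env(with_cuda=True, cuda_cc=cuda_cc)
print(build_env['TORCH_CUDA_ARCH_LIST'])  # 7.0;8.0
```

Returning a dict keeps the logic testable without touching the process environment; the easyblock itself would still call `env.setvar` for each entry.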