Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix logging issue for unsupported torch.compile devices #3077

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rghosh08
Copy link

@rghosh08 rghosh08 commented Oct 5, 2024

Fixes #137285

Description

This PR addresses the issue of insufficient logging when torch.compile is not supported on devices with CUDA capability below 7.0. Instead of exiting the program with a simple message, the code now logs detailed information about the device capabilities and reasons for the lack of support for torch.compile. It also ensures that the program continues to execute, showcasing the TORCH_LOGS logging system even when the function is not compiled.

Changes made:

  • Added detailed logging to explain why torch.compile is not supported on devices with lower CUDA capabilities.
  • Introduced a fallback mechanism where the function runs uncompiled if the device doesn't support compilation.
  • Consolidated the device selection logic for cleaner code.
  • Ensured the program runs and demonstrates TORCH_LOGS regardless of device capabilities.
  • Refactored code to adhere to PyTorch standards and PEP 8 linting guidelines.

Checklist

  • The issue that is being fixed is referred to in the description (see above "Fixes #137285").
  • Only one issue is addressed in this pull request.
  • Labels from the issue that this PR is fixing are added to this pull request.
  • No unnecessary issues are included in this pull request.

Copy link

pytorch-bot bot commented Oct 5, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3077

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Copy link
Contributor

Hi @rghosh08!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@Tharusha-Lekamge
Copy link

Small suggestion - Link the PR to the issue. Currently the issue has no linked PRs

Copy link
Contributor

@svekars svekars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why everything is being deleted here.

@svekars svekars requested a review from mlazos October 7, 2024 15:13
@mlazos
Copy link
Contributor

mlazos commented Oct 8, 2024

@rghosh08 I don't think this PR makes sense, as logging is explicitly meant for torch.compile at the moment. If you run without torch.compile nothing of note is printed so it is kind of a useless tutorial. When I ran this on my machine (not compiling the function) and get:

INFO:__main__:CUDA Device Capability: (8, 0)
INFO:__main__:Device supports torch.compile.
INFO:__main__:Running the function without compilation...
===================Dynamo Tracing=========================
===================Traced Graph=========================
===================Fusion Decisions=========================
===================Output Code=========================
============================================

I also don't understand why so much of the tutorial aspect of the script is getting removed. Can you explain further?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants