Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Optional use of joblib #1989

Open
trixirt opened this issue Jul 26, 2024 · 3 comments
Open

[Feature]: Optional use of joblib #1989

trixirt opened this issue Jul 26, 2024 · 3 comments
Labels
change request A code change requested by external or internal stakeholders triaged The issue has been reviewed by a team member and prioritized

Comments

@trixirt
Copy link

trixirt commented Jul 26, 2024

Suggestion Description

For building Tensile as a RHEL package, the joblib package requires many packages that are not part of RHEL.

Instead of requiring joblib, try to import joblib and when it fails to fall back to the pre joblib code or even a single threaded option.

Operating System

RHEL

GPU

ALL

ROCm Component

ALL

@cgmb
Copy link
Contributor

cgmb commented Jul 26, 2024

Ideally, the parallelism would be managed through the build system. It is inefficient to have two resource pools (make/ninja and joblib) manage the same set of resources (CPU cores, memory).

For example, if you have a 16-core CPU, then you could give Tensile 16 threads and also give Make 16 threads. However, that means the two job queues may use 32 threads total and this could be inefficient (or use too much memory). However, if you only have 16 cores and you give 8 to Tensile and 8 to Make, then you won't get to use all your cores on the parts of the build after Tensile completes.

I expect that it's much easier just to make joblib optional using a fallback like in #1970. That's would be a good start, but I hope we can eventually do better.

@bstefanuk
Copy link
Contributor

bstefanuk commented Aug 2, 2024

Ideally, the parallelism would be managed through the build system. It is inefficient to have two resource pools (make/ninja and joblib) manage the same set of resources (CPU cores, memory).

I completely agree, however, I don't see this sort of change being implemented soon so it is may still be in our interest to support a single-threaded option. In fact, Tensile's main parallelization entry point---ParallelMap here---claims to support a single-threaded option via the enable parameter. Unfortunately, it's experimental and in need of testing.

Profilability, is another consideration. In the past I've faced issues with certain profilers that can't cope with the process churn that joblib induces. It also obscures internal function calls making it hard to profile. Having a single threaded option could help us highlight bottlenecks

Another longer-term solution that I've considered is to establish a single fan-out fan-in design with joblib so that we don't continually spawn and terminate processes.

At the very least it's worth stabilizing the enable option, falling back to that behaviour if joblib isn't found.

@bstefanuk bstefanuk added change request A code change requested by external or internal stakeholders triaged The issue has been reviewed by a team member and prioritized labels Aug 2, 2024
@ellosel
Copy link
Contributor

ellosel commented Aug 6, 2024

We keep removing "optional" dependencies from the requirements files. This hasn't been a problem up to now but if requirements.txt does not contain joblib and we use the fallback then TensileCreateLIbrary will slow down considerably.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change request A code change requested by external or internal stakeholders triaged The issue has been reviewed by a team member and prioritized
Projects
None yet
Development

No branches or pull requests

4 participants