Filter dependencies to only include those from a given package #2130

Draft
wants to merge 1 commit into
base: main

@dakl dakl commented Oct 26, 2024

I want to be able to output only direct dependencies, i.e. no transitive dependencies. This PR adds the flag --only-from, used like so:

pip-compile pyproject.toml --only-from="my-project (pyproject.toml)"

which will yield a requirements.txt file with only dependencies that were directly listed in the toml.

If this is interesting, I can add tests and/or refactor as desired. If not, I can just close this, that's fine too.

Contributor checklist
  • Included tests for the changes. (not yet)
  • PR title is short, clear, and ready to be included in the user-facing changelog.
Maintainer checklist
  • Verified one of these labels is present: backwards incompatible, feature, enhancement, deprecation, bug, dependency, docs or skip-changelog as they determine changelog listing.
  • Assign the PR to an existing or new milestone for the target version (following Semantic Versioning).

@dakl
Author

dakl commented Oct 26, 2024

@nvie thoughts? Interesting, not interesting?

@WhyNotHugo
Member

WhyNotHugo commented Oct 27, 2024 via email

@webknjaz
Member

@dakl Vincent isn't really active in the project these days so I wouldn't bug him with notifications.

On the interest point, I would like to echo Hugo's questions about the use-case. So far, the motivation here is unclear.

P.S. It often helps to have either tests with the intended use or any other kind of demonstration of the problem you're trying to solve. Currently, it feels like you might be jumping to what you decided the solution is. And maybe you've done some thinking in your head but it's important to communicate this to the rest of us. Transparency is helpful. Please, include us in that process. When you explain such things, it's easier to "sell" the idea to whoever is going to review the PR or use the feature in the future.

@dakl
Author

dakl commented Oct 27, 2024

@WhyNotHugo @webknjaz Thanks for your replies! Sorry for not writing about the use case.

I work in machine learning, and many projects start out with an exploratory phase, where we explore various ways of training a model and run some training jobs to see whether an approach seems promising for the task at hand. In this case, we have a set of requirements, let's say pytorch, tensorflow and lightning (usually more, let's say ≈10-20 packages that we depend on directly).

ML-heavy packages are a bit odd with their dependencies: the list of packages depends heavily on the system where they are installed. For example, on macOS, tensorflow installs tensorflow-macos, so a tool like pip-compile (and poetry, pipenv, etc.) will add tensorflow-macos since it's a transitive dependency. On a Linux machine with a GPU, several CUDA-related Python packages are installed and listed as transitive dependencies. Therefore, the resulting requirements.txt depends on the hardware where it is created, and won't be compatible with other hardware.
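To make the problem concrete, here is a toy sketch of why a full lock differs per machine while the direct pins do not. The `TOY_TRANSITIVE` table and the `toy_lock` helper are hypothetical and hand-written for illustration; they are not real package metadata and not how pip-compile resolves anything. `example-cuda-runtime` is a made-up placeholder name.

```python
# Hypothetical, hand-written table: NOT real package metadata.
# Maps a direct dependency to the extra packages a given platform would pull in.
TOY_TRANSITIVE = {
    "tensorflow": {
        "macos": ["tensorflow-macos"],
        "linux-gpu": ["example-cuda-runtime"],  # made-up placeholder name
    },
    "lightning": {},
}


def toy_lock(direct, platform):
    """Return the direct deps plus the platform-specific transitive deps."""
    locked = list(direct)
    for name in direct:
        locked.extend(TOY_TRANSITIVE.get(name, {}).get(platform, []))
    return sorted(locked)


direct = ["tensorflow", "lightning"]
for platform in ("macos", "linux-gpu"):
    # The direct pins are identical; only the transitive part changes.
    print(platform, toy_lock(direct, platform))
```

The two printed lists share the same direct entries and differ only in the platform-specific extras, which is the part I'd like to leave out of the file.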

For us, in this exploratory phase, this causes unnecessary annoyance. We'd like to use pip-compile to compile a list of pinned dependencies (that are mutually compatible according to their requested versions) but leave the transitive dependencies out of the final list, so in the case above we'd only have

torch==2.1.0
lightning==1.3.4
tensorflow==2.15.1

and no transitive dependencies. This makes the requirements file work on both macOS and Linux (with and without GPU), since transitive dependencies are determined at install time.
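For reference, the intended filtering can be approximated as a post-processing step on an already compiled file. This is only a minimal sketch, not the code in this PR: `keep_only_from` is a hypothetical helper, it assumes the usual annotated layout where each pin is followed by `# via ...` comment lines, it does not handle `--hash` continuation lines, and the `SAMPLE` text is made up for illustration.

```python
def keep_only_from(compiled: str, source: str) -> list[str]:
    """Keep pinned lines whose '# via' annotation lists `source` exactly."""
    kept = []
    current_pin = None   # the most recent 'name==version' line
    via_sources = []     # sources collected from its '# via' comments

    def flush():
        # Emit the pending pin only if the requested source required it.
        if current_pin is not None and source in via_sources:
            kept.append(current_pin)

    for raw in compiled.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("#"):
            flush()
            current_pin, via_sources = line, []
        elif current_pin is not None:
            text = line.lstrip("#").strip()
            # Drop the leading 'via' keyword (single-line and multi-line forms).
            if text == "via":
                text = ""
            elif text.startswith("via "):
                text = text[len("via "):].strip()
            if text:
                via_sources.append(text)
    flush()
    return kept


# Made-up sample in the assumed annotation layout.
SAMPLE = """\
lightning==1.3.4
    # via my-project (pyproject.toml)
numpy==1.26.4
    # via
    #   lightning
    #   tensorflow
tensorflow==2.15.1
    # via my-project (pyproject.toml)
tensorflow-macos==2.15.1
    # via tensorflow
"""

print("\n".join(keep_only_from(SAMPLE, "my-project (pyproject.toml)")))
```

Doing this inside pip-compile rather than as a separate script is the point of the flag: the versions are still resolved together, and only the written output is filtered.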

Later, when a project is more mature, we typically create more extensive locks with SHAs (e.g. with pipenv/poetry), but in the exploratory phase I'd like more structure than manual pinning and less structure than a complete pip freeze. This addition (which I call --only-from here) would allow me to do that, with pip-compile pyproject.toml --only-from="my-project (pyproject.toml)".

It's a middle ground that I haven't found any tool covering: pinning versions but excluding transitive dependencies. Could be a USP for pip-tools :D

@dakl
Author

dakl commented Oct 27, 2024

And of course I can/will add tests. I just want to know if you think this is interesting at all first, so I don't "waste" time writing tests for a feature that you consider out of scope for pip-tools anyway 😝

@dakl
Author

dakl commented Nov 6, 2024

@WhyNotHugo @webknjaz is this interesting at all? I'm 100% fine with finishing up this work if you think it's an interesting use case to support. If not, I'm also fine with closing this.


3 participants