Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce an immutable version of packaging.requirements.Requirement #12806

Open
1 task done
ichard26 opened this issue Jun 28, 2024 · 4 comments
Open
1 task done

Introduce an immutable version of packaging.requirements.Requirement #12806

ichard26 opened this issue Jun 28, 2024 · 4 comments
Assignees
Labels
resolution: deferred till PR Further discussion will happen when a PR is made type: refactor Refactoring code

Comments

@ichard26
Copy link
Member

ichard26 commented Jun 28, 2024

What's the problem this feature will solve?

Unfortunately packaging.Requirements.Requirement is mutable.

>>> from packaging.requirements import Requirement
>>> r = Requirement("black >= 24.1.0")
>>> r.name = "six"
>>> r
<Requirement('six>=24.1.0')>
>>> 

While this is "neat", it is somewhat surprising behaviour and can easily lead to hard to understand/debug logic.1 In addition, we are introducing more and more caches across the codebase which contain requirement objects. Currently, these caches are liable to break if a cached requirement object is mutated after the fact (get_requirement("black") could end returning a Requirement("six") object!).

Describe the solution you'd like

To avoid issues with caches and generally pay down our technical debt, introduce an immutable version of packaging.requirements.Requirement.

In general, there are two approaches possible:

  • Introduce an internal Requirement class (dataclass?) that's fully immutable and switch the codebase wholesale to it
  • Introduce an immutable requirement protocol, in addition to the internal immutable Requirement class

The latter has the benefit that is that plain ol' packaging.requirements.Requirement objects are still compatible with code that uses the immutable requirement class internally. Although, this could be considered a downside as then mutable packaging.requirements.Requirement objects could continue to linger in pip's logic.

Essentially, from a type-checking perspective requirements would be always immutable, while code would have to opt into using ImmutableRequirement to be protected at runtime (e.g. the caching get_requirement() utility function).

I included a rough draft of the second approach below.

@runtime_checkable
class Requirement(Protocol):
    """Immutable packaging Requirement interface."""

    @property
    def name(self) -> str: ...
    @property
    def url(self) -> Union[str, None]: ...
    @property
    def extras(self) -> FrozenSet[str]: ...
    @property
    def specifier(self) -> SpecifierSet: ...
    @property
    def marker(self) -> Union[Marker, None]: ...

    def __str__(self) -> str: ...
    def __repr__(self) -> str: ...
    def __hash__(self) -> int: ...
    def __eq__(self, other: Any) -> bool: ...


@dataclass(frozen=True)
class ImmutableRequirement(Requirement):
    """Immutable version of packaging.requirements.Requirement."""

    __slots__ = ("name", "url", "extras", "specifier", "marker", "_parsed_req")

    name: str
    url: Union[str, None]
    extras: FrozenSet[str]
    specifier: SpecifierSet
    marker: Union[Marker, None]

    def __init__(self, requirement: str) -> None:
        self._parsed_req = requirements.Requirement(requirement)
        object.__setattr__(self, "name", self._parsed_req.name)
        object.__setattr__(self, "url", self._parsed_req.url)
        object.__setattr__(self, "extras", frozenset(self._parsed_req.extras))
        object.__setattr__(self, "specifier", self._parsed_req.specifier)
        object.__setattr__(self, "marker", self._parsed_req.marker)

    def __str__(self) -> str:
        return str(self._parsed_req)

    def __repr__(self) -> str:
        return repr(self._parsed_req)

    def __hash__(self) -> int:
        return hash(self._parsed_req)

    def __eq__(self, other: Any) -> bool:
        if isinstance(other, ImmutableRequirement):
            other = other._parsed_req

        return self._parsed_req.__eq__(other)

The amount of work would be likely similar for either approaches, so the deciding factor would be whether packaging.requirements.Requirement objects are created/used outside of pip's direct control or not.

I prefer the first approach as it's simpler and stricter, but I'm not fully aware of how packaging's Requirement object is used across the codebase. Let me know what you think!

Alternative Solutions

Alternatively, a smaller patch would be only enforcing immutability only at runtime. This has the benefit of that if we're fine with lying to mypy and (ab)using __instancecheck__, technically we could just add the immutable class and update get_requirement() to return it, while leaving all other references to packaging.requirements.Requirement alone.

This would be likely a major hack (all while introducing more technical debt) so I don't like this idea, but I thought to propose it just in case.

Additional context

See also #12663 where some early discussion on this problem occured.

Code of Conduct

Footnotes

  1. I really hope the codebase doesn't depend on mutating requirements to function...

@ichard26 ichard26 added type: refactor Refactoring code state: needs discussion This needs some more discussion labels Jun 28, 2024
@pradyunsg
Copy link
Member

I would like to make packaging.requirements.Requirement immutable honestly, but this seems like a worthwhile alternative to doing that!

@ichard26
Copy link
Member Author

ichard26 commented Jul 1, 2024

I mean, I'm totally in favour of making packaging.requirements.Requirement immutable, but that seems a major undertaking given a deprecation cycle would be needed on packaging's end. Improving the situation on pip's end felt a like a reasonable compromise.

@pradyunsg
Copy link
Member

I agree!

@ichard26
Copy link
Member Author

ichard26 commented Jul 4, 2024

Sounds good, in the absence of any preferred approach, I'll take a crack at the first approach and submit a PR for feedback once I finish up #12818.

@ichard26 ichard26 self-assigned this Jul 10, 2024
@ichard26 ichard26 added resolution: deferred till PR Further discussion will happen when a PR is made and removed state: needs discussion This needs some more discussion labels Jul 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
resolution: deferred till PR Further discussion will happen when a PR is made type: refactor Refactoring code
Projects
None yet
Development

No branches or pull requests

2 participants