Fast paths of common methods assume no builtin subclassing #4490

ChayimFriedman2 · 2024-08-25T08:43:33Z

Bug Description

Many implementations of common protocols for builtin Python types, perform a direct access to the relevant fields in the Python object descriptor, without calling into Python C APIs.

This is good for performance, but it actually has a problem for correctness: if those builtins are subclassed, the subclass can be passed as Bound<Builtin> (or other smart pointers), but the fields won't be updated - a call to the method implementing the protocol in Python will be needed (if it is overridden), meaning that calls to PyO3 functions will return an incorrect result.

Steps to Reproduce

Here's an example to demonstrate the problem with PyTuple and len(), but the problem exists with more methods and in more types:

use pyo3::prelude::*;
use pyo3::types::{IntoPyDict, PyTuple};

fn main() {
    Python::with_gil(|py| {
        let from_code = PyModule::from_code(
            py,
            cr#"
class my_tuple(tuple):
    def __len__(self): return 42
"#,
            c"my_tuple.py",
            c"my_tuple",
        );
        let m = from_code.unwrap();
        let my_tuple = m.getattr("my_tuple").unwrap();
        let instance = my_tuple.call1(((1, 2, 3),)).unwrap();
        let python_result = py
            .eval(
                c"len(instance)",
                None,
                Some(&[("instance", &instance)].into_py_dict(py)),
            )
            .unwrap()
            .extract::<usize>()
            .unwrap();
        assert_eq!(python_result, 42);
        let py_any_result = instance.len().unwrap();
        assert_eq!(py_any_result, 42);
        let tuple = instance.downcast::<PyTuple>().unwrap();
        let py_tuple_result = tuple.len();
        assert_eq!(py_tuple_result, 3); // Oops!
    });
}

Not only the result is incorrect, it is also inconsistent with the result we get from PyO3 itself for PyAny.

Backtrace

No response

Your operating system and version

Windows 11

Your Python version (`python --version`)

3.12.0

Your Rust version (`rustc --version`)

rustc 1.80.1 (3f5fd8dd4 2024-08-06)

Your PyO3 version

Tip of git (but also reproduces with latest published version)

How did you install python? Did you use a virtualenv?

The official website.

Additional Info

Fixing this bug without harming performance is difficult and may prove impossible; I am filling this bug so we will be aware at least.

If we do decide it's worth fixing, a potential path will be to differentiate in the type system between Bound<PyClass> and Bound<Exact<PyClass>>. Making the latter deref to the former, downcast() return the former but downcast_exact() return the latter, only specialize the latter, and tell people that care about micro-performance to use the latter (if they don't care about subclasses).

The text was updated successfully, but these errors were encountered:

birkenfeld · 2024-08-25T12:24:10Z

As a first reaction, this is the same as working with C extensions that use the APIs that PyO3 uses. So at the very least it's not a PyO3 specific bug/incompatibility. And most advanced Python programmers subclassing builtin types are already sensitized to issues arising there (another is that using e.g. a builtin method on a str subclass yields a regular str).

davidhewitt · 2024-08-25T21:19:04Z

I am inclined to agree with @birkenfeld though at the same time I would love for us to not have this surprise factor. The idea of Bound<Exact<PyDict>> could work, maybe another option is to have a Bound<PyDict, Exact> second generic argument which can default to Lax. We could probably even do this without too much backwards-compatibility worry, because Bound<Exact<PyDict>> and Bound<PyDict> could both implement PyDictMethods (with different API calls under the hood.)

There seems to me to be a little bit of overlap with #3226 and the idea of splitting strict / lax function argument inputs too, perhaps.

I'm not entirely keen to complicate the API in this manner though perhaps this will be the right step in the long run. I think the GIL Refs migration and upcoming introduction of IntoPyObject is already creating a lot of churn, so I'd be glad to let things settle down a bit before we make any moves too quickly here.

ChayimFriedman2 added the bug label Aug 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fast paths of common methods assume no builtin subclassing #4490

Fast paths of common methods assume no builtin subclassing #4490

ChayimFriedman2 commented Aug 25, 2024

birkenfeld commented Aug 25, 2024

davidhewitt commented Aug 25, 2024

Fast paths of common methods assume no builtin subclassing #4490

Fast paths of common methods assume no builtin subclassing #4490

Comments

ChayimFriedman2 commented Aug 25, 2024

Bug Description

Steps to Reproduce

Backtrace

Your operating system and version

Your Python version (python --version)

Your Rust version (rustc --version)

Your PyO3 version

How did you install python? Did you use a virtualenv?

Additional Info

birkenfeld commented Aug 25, 2024

davidhewitt commented Aug 25, 2024

Your Python version (`python --version`)

Your Rust version (`rustc --version`)