Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HistoricForeignKey when used together with prefetch_related() #1159

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mr-mare
Copy link

@mr-mare mr-mare commented Apr 14, 2023

Description

This fixes behaviour of HistoricForeignKey when used together with prefetch_related().

The following is sufficient to reproduce the bug:

  • models:
from django.db import models
from simple_history.models import HistoricalRecords, HistoricForeignKey


class Poll(models.Model):
    question = models.CharField(max_length=200)

    history = HistoricalRecords()


class Choice(models.Model):
    poll = HistoricForeignKey(Poll, on_delete=models.CASCADE)
    choice = models.CharField(max_length=200)
    votes = models.IntegerField()

    history = HistoricalRecords()
  • code:
from poll.models import Poll, Choice

why_poll = Poll.objects.create(question="why?")
how_poll = Poll.objects.create(question="how?")

Choice.objects.create(poll=why_poll, votes=0)
Choice.objects.create(poll=how_poll, votes=0)

for poll in Poll.objects.prefetch_related("choice_set").all():
    print(poll.choice_set.all())

# expected result:
# <QuerySet [<Choice: Choice object (1)>]>
# <QuerySet [<Choice: Choice object (2)>]>

# actual result:
# <QuerySet [<Choice: Choice object (1)>]>
# <QuerySet []>

Related Issue

#1152

Motivation and Context

Property related_manager_cls is defined in HistoricReverseManyToOneDescriptor. This property returns a function call of create_reverse_many_to_one_manager function and passes it HistoricRelationModelManager as an argument. However, create_reverse_many_to_one_manager defines RelatedManager, which subclasses the manager passed as an argument and returns this RelatedManager class.

When prefetch_related is used, get_prefetch_queryset method of RelatedManager is called.
The problem lies in the fact that it bypasses RelatedManager.get_queryset method which does filtering to instance (_apply_rel_filters) by calling super().get_queryset(). But due to inheritance, it gets evaluated to HistoricRelationModelManager.get_queryset which does the same filtering as well.

This SQL is generated then (notice the "poll_choice"."poll_id" = 1 condition):

SELECT "poll_poll"."id", "poll_poll"."question" FROM "poll_poll"
SELECT "poll_choice"."id", "poll_choice"."poll_id", "poll_choice"."choice", "poll_choice"."votes" FROM "poll_choice" WHERE ("poll_choice"."poll_id" = 1 AND "poll_choice"."poll_id" IN (1, 2))

Removing _apply_rel_filters call from HistoricRelationModelManager.get_queryset solves this problem.

On the other hand, since create_reverse_many_to_one_manager returns RelatedManager, rather than HistoricRelationModelManager, the only way to access HistoricRelationModelManager.get_queryset is by calling super().get_queryset() from RelatedManager. This call can be found:

  • in RelatedManager.get_prefetch_queryset where it is undesired,
  • in RelatedManager.get_queryset where _apply_rel_filters is called right after super().get_queryset().

Therefore, I believe it is safe to remove it.

How Has This Been Tested?

Added new tests.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • I have run the pre-commit run command to format and lint.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have added my name and/or github handle to AUTHORS.rst
  • I have added my change to CHANGES.rst
  • All new and existing tests passed.

@codecov
Copy link

codecov bot commented Apr 14, 2023

Codecov Report

Merging #1159 (bcf2086) into master (27b3dbf) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1159   +/-   ##
=======================================
  Coverage   97.24%   97.24%           
=======================================
  Files          23       23           
  Lines        1234     1234           
  Branches      200      200           
=======================================
  Hits         1200     1200           
  Misses         16       16           
  Partials       18       18           
Impacted Files Coverage Δ
simple_history/models.py 96.71% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@evanhaldane
Copy link

Plans to review/merge this?

(I'm in the process of integrating this package for the first time and I spent a few hours chasing around a bug until I realized it was caused by this issue.)

cc: @ddabble ?

@anyidea
Copy link

anyidea commented Jun 21, 2024

any progress?

@robertofd-nextmatter
Copy link

Any reason why this is not getting merged? We are also having the same issue

@mr-mare
Copy link
Author

mr-mare commented Sep 5, 2024

I'm happy to help. If there is anything I can do, please let me know.

Copy link
Member

@ddabble ddabble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, good job investigating this solution! 😄 I agree with your reasoning for why it should be safe removing the call to _apply_rel_filters(). Furthermore, it looks like create_reverse_many_to_one_manager() usually gets a standard Manager as its superclass arg, which doesn't contain any _apply_rel_filters() method. And so it seems likely that RelatedManager was designed to be expecting that its superclass' get_queryset() method doesn't do any of the things that _apply_rel_filters() does, which again supports the decision to remove the call.

Do you think we should remove the try block with self.instance._prefetched_objects_cache[...] as well? The exact same call is in RelatedManager.get_queryset(), but it doesn't seem to hurt performance much... Though I'm not sure about the consequences of removing it (or any unknown, existing bugs that might live on if we leave it there, for that matter), considering that RelatedManager.get_prefetch_querysets() seems to skip that cache lookup by calling super().get_queryset(), combined with the fact that the mentioned standard Manager also doesn't make any cache lookups in get_queryset() 🤔

Lastly, could you add a test case for a nested prefetch_related() call, like the SubSub examples in the description of #1152? 🙂 I guess it should be enough e.g. adding a new history-tracked model with a HistoricForeignKey referencing TestHistoricParticipanToHistoricOrganization, and test with the different arguments to prefetch_related() from the mentioned SubSub examples, which would include "historic_participants__<subsub related_name>" and various nested variants using Prefetch(...).

Other than that, very clear and thorough test cases! And sorry that it took so long getting to reviewing this!

@@ -4,6 +4,7 @@ Changes
Unreleased
----------

- Fixed ``HistoricForeignKey`` behaviour when used together with ``prefetch_related()``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful being a little more explicit on what the bug used to be; also, adding the issue number 🙂

Suggested change
- Fixed ``HistoricForeignKey`` behaviour when used together with ``prefetch_related()``
- Fixed ``HistoricForeignKey``'s behaviour when used together with
``prefetch_related()``, which caused some related objects to be missing (gh-1152)

with self.assertNumQueries(2):
record1, record2 = TestOrganizationWithHistory.objects.prefetch_related(
"participants"
).all()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all() after prefetch_related() (or most other queryset-returning methods) is unnecessary.

Suggested change
).all()
)

The same goes for the similar calls to all() in the two test cases below.

@@ -2545,3 +2545,96 @@ def test_historic_to_historic(self):
)[0]
pt1i = pt1h.instance
self.assertEqual(pt1i.organization.name, "original")

def test_non_historic_to_historic_prefetch(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a docstring detailing a little more explicitly what the test case is testing 🙂 E.g. something like this: (feel free to adjust the text)

Suggested change
def test_non_historic_to_historic_prefetch(self):
def test_non_historic_to_historic_prefetch(self):
"""Test that the field works properly with ``prefetch_related()`` when defined
on a model *without* history tracking, and with the ``to`` arg set to a model
*with* history tracking."""

The same goes for the two test cases below.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants