Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process record summaries in DB layer #3561

Closed
wants to merge 22 commits into from
Closed

Conversation

seancolsen
Copy link
Contributor

@seancolsen seancolsen commented May 1, 2024

Before reviewing this PR, make sure you're familiar with the Propose changes to record summaries for beta.

Examples

These examples use the Library Management data, with the following id values:

  • "Books" (oid: 36147)

    • "Title"(attnum: 2)
    • "Author" (attnum:10)
  • "Authors"

    • "First Name" (attnum: 2)
    • "Last Name" (attnum: 3)

Auto-generated Books summaries

select * from msar.get_record_summaries(36147, array[4,9,15,21,35]);
id record_summary
4 Abhorsen
15 A Civil Contract
21 A Darkness at Sethanon
9 Academ's Fury
35 Alexander Hamilton

Manually-templated Books summaries

select * from msar.get_record_summaries(
  36147,
  array[4,9,15,21,35],
  '[[2], ", by ", [10, 2], " ", [10, 3]]'::jsonb
);
id record_summary
4 Abhorsen, by Garth Nix
15 A Civil Contract, by Georgette Heyer
21 A Darkness at Sethanon, by Raymond E Feist
9 Academ's Fury, by Jim Butcher
35 Alexander Hamilton, by Ron Chernow

Thoughts

  • I feel the least confident about the implementation within get_record_summaries_via_query. Very much open to different approaches there, including changing the approach of passing id values in via array.

  • On a similar note, I wrote a test __SKIP__test_record_summary_performance. This is skipped because it's failing. The intent of this test is to verify that we're only generating record summaries for the records passed in via the array. But... (sad trombone)... it doesn't actually work like I thought it would. My next step here is to dig into EXPLAIN to get a better sense of the query plan. I thought Postgres would be smarter here, but oh well.

    This empirical evidence does make me question my original premise with this design. I was hoping that we'd be able to define a bespoke record summary query for a specific table — devoid of any filtering — and then use that at the inner layer to compose the filtered query using application-defined logic. Alas, it may be that the bespoke queries actually need to be aware of the filtering. I'd like to experiment with this more and bounce some ideas around with you.

    For the time being, this works. But I'd like to eventually make it faster. In theory we could potentially kick that can down the road.

  • Are there some types which wouldn't be able to cast to TEXT via cast( v AS text )? I don't understand enough about how this works in Postgres. It seems like there is a difference between implicit casting and explict casting, or something like that. I did a little research but didn't find the answers I was looking for. I did write a test test_record_summary_with_no_text_columns, which asserts that we can auto generate record summaries for a table having only uuid and point column types. This actually led me to make the fix in 9cca125. I'm hoping this will cover all possible types, but I think you would know better.

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the develop branch of the repository
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@seancolsen seancolsen self-assigned this May 6, 2024
@seancolsen seancolsen changed the base branch from develop to permissions_stub May 9, 2024 01:58
@seancolsen seancolsen marked this pull request as ready for review May 9, 2024 02:15
@seancolsen seancolsen assigned mathemancer and unassigned seancolsen May 9, 2024
@seancolsen seancolsen added the pr-status: review A PR awaiting review label May 9, 2024
@seancolsen
Copy link
Contributor Author

@mathemancer this is ready for a proper review now.

@seancolsen
Copy link
Contributor Author

Oh I also meant to mention that I rebased this on top of #3556 — but only for the sake of convenience during development (so that I could see/run/copy your tests in that PR). I can rebase this back on another branch if you prefer.

Base automatically changed from permissions_stub to architectural_overhaul May 9, 2024 13:02
Base automatically changed from architectural_overhaul to develop June 4, 2024 15:36
@seancolsen
Copy link
Contributor Author

Example of transitive record summaries:

SELECT *
FROM msar.get_record_summaries(
  '"Checkouts"'::regclass::oid,
  array[1896],
  '[
    "[", [1], "]: ",
    [3,2], " ", [3,3],
    " checked out ",
    [2,2],
    " (", [2,5,2], " by ", [2,5,10,2], " ", [2,5,10,3], ")"
  ]'::jsonb
);
  id  |                                  record_summary                                  
------+----------------------------------------------------------------------------------
 1896 | [1896]: Michael Marshall checked out F823-986B-B978-1634 (Abhorsen by Garth Nix)

@kgodey kgodey added this to the Beta release milestone Sep 18, 2024
@seancolsen seancolsen mentioned this pull request Oct 17, 2024
7 tasks
@seancolsen
Copy link
Contributor Author

Closing because I'm re-doing this in a newer PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-status: review A PR awaiting review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants