Change "exact match boost" implementation to use a match_phrase query in should #4960

Closed
sarayourfriend opened this issue Sep 19, 2024 · 2 comments · Fixed by #4978
Labels
💻 aspect: code Concerns the software code in the repository ✨ goal: improvement Improvement to an existing user-facing feature 🟩 priority: low Low priority and doesn't need to be rushed 🧱 stack: api Related to the Django API 🔧 tech: elasticsearch Involves Elasticsearch 🐍 tech: python Involves Python

Comments

@sarayourfriend
Collaborator

Problem

We currently add an extra simple query string clause to the boolean query for search in order to boost exact phrasal matches on the title of a work:

quotes_stripped = query.replace('"', "")
exact_match_boost = Q(
    "simple_query_string",
    flags=DEFAULT_SQS_FLAGS,
    fields=["title"],
    query=f"{quotes_stripped}",
    boost=10000,
)

Description

We can bypass simple query string altogether for this, saving some parsing cycles in Elasticsearch, and making our intention much clearer in code (and in the generated query), by using match_phrase directly. This also obviates the need to mock out the simple-query-string syntax for phrasal matches.
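For illustration only, here is a rough sketch of what the generated query might look like with a match_phrase clause in should. It uses plain dictionaries rather than our actual query-building code; the function names, the `must` clause, and its field list are hypothetical placeholders, not what the API really sends.

```python
import json


def build_exact_match_boost(query: str, boost: int = 10000) -> dict:
    """Build a match_phrase clause that boosts exact phrasal title matches.

    No quote stripping is done here: the query text is passed straight
    to match_phrase instead of going through simple-query-string syntax.
    """
    return {"match_phrase": {"title": {"query": query, "boost": boost}}}


def build_search_body(query: str) -> dict:
    """Assemble a bool query with the boost clause placed in `should`.

    The `must` clause below is a hypothetical stand-in for the main
    search clause; the real one has different fields and options.
    """
    main_clause = {
        "simple_query_string": {
            "query": query,
            "fields": ["title", "description"],  # assumed field list
        }
    }
    return {
        "query": {
            "bool": {
                "must": [main_clause],
                # Optional clause: matching documents score higher,
                # non-matching documents are still returned.
                "should": [build_exact_match_boost(query)],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("bird perched"), indent=2))
```

Because the clause sits in should, it only affects scoring and never excludes results, which is the behaviour we want from a boost.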

@sarayourfriend sarayourfriend added 🟩 priority: low Low priority and doesn't need to be rushed ✨ goal: improvement Improvement to an existing user-facing feature 💻 aspect: code Concerns the software code in the repository 🐍 tech: python Involves Python 🧱 stack: api Related to the Django API 🔧 tech: elasticsearch Involves Elasticsearch labels Sep 19, 2024
@dryruffian
Contributor

Hi @sarayourfriend,
Hope you are doing well. Is this the solution you are looking for?

exact_match_boost = Q(
    "match_phrase",
    title={
        "query": query,
        "boost": 10000
    }
)

If so, could you assign this issue to me?

@sarayourfriend
Collaborator Author

That looks right, yep! I've assigned the issue to you, thanks!

sarayourfriend added a commit that referenced this issue Sep 23, 2024
…y in should (#4978)

* Add Sutori to the "Made with Openverse" page.

* Add Sutori to Made with Openverse page (#4972)

Fixed commit message to be more precise and include the issue number

* Add Sutori to Made with Openverse page (#4972)

I added the suggestion from @sarayourfriend.

Co-authored-by: sarayourfriend <[email protected]>

* Add Sutori to Made with Openverse page (#4971)

* Change "exact match boost" implementation to use a match_phrase query in should #4960

I have added support for "match_phrase"
and also removed the variable quotes_stripped, as it is no longer needed.

* Added test to accommodate match_phrase

---------

Co-authored-by: sarayourfriend <[email protected]>