FVH BaseFragmentsBuilder does not properly support colored pre/post tags #13933

Open
MateuxLucax opened this issue Oct 19, 2024 · 0 comments

Description

Given the BaseFragmentsBuilder description:

...
/**
 * Base FragmentsBuilder implementation that supports colored pre/post tags and multivalued fields.
 *
 * <p>Uses {@link BoundaryScanner} to determine fragments.
 */
public abstract class BaseFragmentsBuilder implements FragmentsBuilder {
...

One would assume that, given a query and arrays of pre and post tags, each query clause is highlighted with the pre/post tag at the same position, like:

| Query | Pre tag | Post tag |
| --- | --- | --- |
| A B | `<ab>` | `</ab>` |
| C B | `<cb>` | `</cb>` |
| C A | `<ca>` | `</ca>` |

However, the tags are not applied in that order, because the current BaseFragmentsBuilder implementation picks tags in an almost random order:

protected String getPreTag(String[] preTags, int num) {
  int n = num % preTags.length;
  return preTags[n];
}

This links back to this issue.

I have already done some initial work to solve this problem where I work, but I would like to have a proper solution in Lucene.

The root cause is in the FieldQuery flatten, saveTerms and expand methods. They need to exist, but they also scramble the order of the pre/post tags. The termOrPhraseNumber is used to select the preTag, so it should follow the order of the queries.
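
To make the effect concrete, here is a minimal standalone sketch (not Lucene code). It copies the getPreTag logic quoted above and feeds it a made-up set of numbers. The class name TagSelectionDemo, the helper tagsFor, and the numbers 2, 0, 1 are all assumptions for illustration; the real values of termOrPhraseNumber depend on what FieldQuery produces.

```java
import java.util.ArrayList;
import java.util.List;

public class TagSelectionDemo {

  // Same selection logic as the getPreTag snippet quoted above:
  // the tag depends only on the number, wrapped around the array length.
  static String getPreTag(String[] preTags, int num) {
    int n = num % preTags.length;
    return preTags[n];
  }

  // Hypothetical helper: returns the tag picked for each number, in the order given.
  static List<String> tagsFor(String[] preTags, int[] numbers) {
    List<String> picked = new ArrayList<>();
    for (int num : numbers) {
      picked.add(getPreTag(preTags, num));
    }
    return picked;
  }

  public static void main(String[] args) {
    String[] preTags = {"<ab>", "<cb>", "<ca>"};

    // Assumed numbers for the queries "A B", "C B", "C A" (in that order).
    // These are invented for the example, not taken from FieldQuery.
    int[] hypotheticalNumbers = {2, 0, 1};

    // Prints [<ca>, <ab>, <cb>] instead of the expected [<ab>, <cb>, <ca>].
    System.out.println(tagsFor(preTags, hypotheticalNumbers));
  }
}
```

If the numbers were 0, 1, 2 in query order, the same code would return the tags in the expected order, which is why the numbering needs to follow the order of the queries.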

I will try to add a unit test that properly illustrates this problem, as it is kinda complex.

Version and environment details

Lucene 3.0+

Any environment
