[BUG] Deserializing MatchQuery ZeroTermsQuery field fails if the source JSON comes from OpenSearch MatchQuery #1150

Open
dbwiddis opened this issue Aug 21, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@dbwiddis
Member

dbwiddis commented Aug 21, 2024

What is the bug?

The client's ZeroTermsQuery enum uses lower-case JSON values:

@JsonpDeserializable
public enum ZeroTermsQuery implements JsonEnum {
    All("all"),
    None("none"),

However, this enum is defined on OpenSearch with traditional all-caps enum names:

public enum ZeroTermsQuery implements Writeable {
    NONE(0),
    ALL(1),

As a result, the JSON of a search query generated on OpenSearch cannot simply be deserialized into a client search request.

How can one reproduce the bug?

  1. Generate a search query on OpenSearch, for example:
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchQuery(CONNECTOR_ID_FIELD, connectorId));
  2. Add that query to a SearchSourceBuilder:
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(boolQueryBuilder);
  3. Transform that SearchSourceBuilder to JSON:
String json = sourceBuilder.toString();

Note that the value of zero_terms_query is all upper case:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "connector_id": {
              "query": "Jm_4dpEBnn49655wiz2Y",
              "operator": "OR",
              "prefix_length": 0,
              "max_expansions": 50,
              "fuzzy_transpositions": true,
              "lenient": false,
              "zero_terms_query": "NONE",
              "auto_generate_synonyms_phrase_query": true,
              "boost": 1
            }
          }
        },
        {
          "ids": {
            "values": [],
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
  4. Create a parser with that JSON:
JsonpMapper mapper = openSearchClient._transport().jsonpMapper();
JsonParser parser = mapper.jsonProvider().createParser(new StringReader(json));
  5. Attempt to deserialize that JSON into an OpenSearch Java Client SearchRequest object:
SearchRequest searchRequest = SearchRequest._DESERIALIZER.deserialize(parser, mapper);
  6. Observe the exception:
2024-08-21 15:04:46 jakarta.json.stream.JsonParsingException: Invalid enum 'NONE'
2024-08-21 15:04:46     at org.opensearch.client.json.JsonEnum$Deserializer.deserialize(JsonEnum.java:116) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.JsonEnum$Deserializer.deserialize(JsonEnum.java:102) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.JsonEnum$Deserializer.deserialize(JsonEnum.java:61) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.JsonpDeserializer.deserialize(JsonpDeserializer.java:87) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.ObjectDeserializer$FieldObjectDeserializer.deserialize(ObjectDeserializer.java:81) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.ObjectDeserializer.deserialize(ObjectDeserializer.java:185) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.ObjectDeserializer.deserialize(ObjectDeserializer.java:146) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.JsonpDeserializer.deserialize(JsonpDeserializer.java:87) ~[?:?]
2024-08-21 15:04:46     at org.opensearch.client.json.ObjectBuilderDeserializer.deserialize(ObjectBuilderDeserializer.java:91) ~[?:?]
<snip>

What is the expected behavior?

In general, an OpenSearch SearchSourceBuilder can be serialized into JSON and then deserialized into an OpenSearch Java Client SearchRequest.

In this particular case, the all-caps enum name should match case-insensitively rather than throwing an exception.

See, for example, how logical operators (such as the OR in this query) accept either all-upper or all-lower case:

public enum Operator implements JsonEnum {
    And("and", "AND"),
    Or("or", "OR"),
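
To make the requested behavior concrete, here is a minimal, self-contained sketch of case-insensitive matching on an enum's JSON value. It does not use the client's classes: HasJsonValue, ZeroTermsQuerySketch, and CaseInsensitiveEnums are hypothetical names invented for this illustration, and the real JsonEnum deserializer may be structured quite differently.

```java
// Hypothetical stand-in for the client's JsonEnum; only the JSON value accessor is modelled.
interface HasJsonValue {
    String jsonValue();
}

// Mirrors the lower-case values of the client enum quoted in this issue.
enum ZeroTermsQuerySketch implements HasJsonValue {
    All("all"),
    None("none");

    private final String jsonValue;

    ZeroTermsQuerySketch(String jsonValue) {
        this.jsonValue = jsonValue;
    }

    @Override
    public String jsonValue() {
        return jsonValue;
    }
}

// Hypothetical helper, not part of the OpenSearch Java Client.
final class CaseInsensitiveEnums {
    private CaseInsensitiveEnums() {}

    // Returns the constant whose JSON value equals raw, ignoring case;
    // throws if no constant matches.
    static <E extends Enum<E> & HasJsonValue> E parse(Class<E> type, String raw) {
        for (E constant : type.getEnumConstants()) {
            if (constant.jsonValue().equalsIgnoreCase(raw)) {
                return constant;
            }
        }
        throw new IllegalArgumentException("Invalid enum '" + raw + "'");
    }
}
```

With this sketch, CaseInsensitiveEnums.parse(ZeroTermsQuerySketch.class, "NONE") and the same call with "none" both return ZeroTermsQuerySketch.None, while an unknown value still raises an exception.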

What is your host/environment?

I'm running this against OpenSearch 2.15 code, but the enum has not changed since before the fork.

Do you have any additional context?

Relevant code block causing the issue on a feature branch:
https://github.com/opensearch-project/ml-commons/blob/feature/multi_tenancy/plugin/src/main/java/org/opensearch/ml/sdkclient/RemoteClusterIndicesClient.java#L230-L232

@dbwiddis dbwiddis added bug Something isn't working untriaged labels Aug 21, 2024
@Xtansia Xtansia removed the untriaged label Aug 28, 2024
@Xtansia
Collaborator

Xtansia commented Aug 28, 2024

As you've noted for cases like Operator, the JsonEnum type does support aliases for enum values, so this should be an easy fix: amend ZeroTermsQuery to add the all-caps variants as aliases.

This is something we should consider how to represent in the spec: https://github.com/opensearch-project/opensearch-api-specification/blob/19421f502740967e4d6df102f1c5765c53eaa010/spec/schemas/_common.query_dsl.yaml#L943
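
As a rough illustration of the alias-based fix suggested above, the following self-contained sketch models an enum with a primary JSON value plus aliases and an exact-match lookup over both. AliasedJsonValue, ZeroTermsQueryWithAliases, and AliasLookup are hypothetical names for this example only; the shape of the amended enum is an assumption modelled on the Operator snippet quoted in the issue, not the actual change to the client.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the client's JsonEnum: a primary JSON value plus optional aliases.
interface AliasedJsonValue {
    String jsonValue();
    String[] aliases();
}

// ZeroTermsQuery as it might look with all-caps aliases added (assumed shape).
enum ZeroTermsQueryWithAliases implements AliasedJsonValue {
    All("all", "ALL"),
    None("none", "NONE");

    private final String jsonValue;
    private final String[] aliases;

    ZeroTermsQueryWithAliases(String jsonValue, String... aliases) {
        this.jsonValue = jsonValue;
        this.aliases = aliases;
    }

    @Override
    public String jsonValue() {
        return jsonValue;
    }

    @Override
    public String[] aliases() {
        return aliases.clone();
    }
}

// Hypothetical lookup table keyed by the primary value and every alias.
final class AliasLookup<E extends Enum<E> & AliasedJsonValue> {
    private final Map<String, E> byName = new HashMap<>();

    AliasLookup(Class<E> type) {
        for (E constant : type.getEnumConstants()) {
            byName.put(constant.jsonValue(), constant);
            for (String alias : constant.aliases()) {
                byName.put(alias, constant);
            }
        }
    }

    // Exact-match lookup: only the primary value and registered aliases are accepted.
    E parse(String raw) {
        E result = byName.get(raw);
        if (result == null) {
            throw new IllegalArgumentException("Invalid enum '" + raw + "'");
        }
        return result;
    }
}
```

Unlike a fully case-insensitive comparison, this approach accepts only the spellings listed explicitly, so "NONE" and "none" resolve to the same constant while a mixed-case value such as "None" is still rejected.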

@dbwiddis
Member Author

I agree with you, though: this should likely be handled in the spec somehow, but I'm not sure there's a standard for that, and I'm not sure we want to hand-edit exceptions. It's easy enough to work around if documented, but let's at least discuss options.
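
The thread does not spell out the workaround. One possible approach, sketched here purely as an assumption, is to lower-case the zero_terms_query value in the JSON text before handing it to the client's parser. ZeroTermsQueryNormalizer is a hypothetical helper, not part of any OpenSearch library, and because it operates on raw text it would also rewrite a matching fragment that happened to appear inside a string value.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites "zero_terms_query": "NONE" to "zero_terms_query": "none".
final class ZeroTermsQueryNormalizer {
    // Group 1: the key, colon, and opening quote; group 2: the value; group 3: the closing quote.
    private static final Pattern ZERO_TERMS =
        Pattern.compile("(\"zero_terms_query\"\\s*:\\s*\")([A-Za-z]+)(\")");

    private ZeroTermsQueryNormalizer() {}

    // Lower-cases every zero_terms_query value found in the JSON text; other fields are untouched.
    static String normalize(String json) {
        Matcher m = ZERO_TERMS.matcher(json);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) + m.group(2).toLowerCase(Locale.ROOT) + m.group(3);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

In the reproduction steps, this would be applied to the string returned by sourceBuilder.toString() before creating the parser, for example new StringReader(ZeroTermsQueryNormalizer.normalize(json)).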

@dbwiddis
Member Author

It looks like Jackson JsonP does support this with appropriate annotations.
https://stackoverflow.com/questions/26058854/case-insensitive-json-to-pojo-mapping-without-changing-the-pojo
