[BUG] BWC issue with async http client change #2417

Closed · ylwu-amzn opened this issue May 8, 2024 · 2 comments
Labels: bug

@ylwu-amzn (Collaborator) commented May 8, 2024

What is the bug?
We changed to an async HTTP client in 2.14 (#1839). Testing with 2.14 RC4, we found that the predict result for a text embedding model is not the same as in 2.13.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create two test clusters: 2.13 and 2.14 RC4
  2. Follow the steps in issue [BUG] Neural search: 4xx error ingesting data with Sagemaker external model #2249 to create the model
  3. Run the following request on both the 2.13 and 2.14 test clusters:
POST /_plugins/_ml/_predict/text_embedding/<your_model_id>
{
  "text_docs":[ "today is sunny", "hello world"],
  "return_number": true,
  "target_response": ["sentence_embedding"]
}

2.13 result

{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            768
          ],
          "data": [
            0.009424215,
            -0.008311393,
            0.06740056,
            ...
           ]
        }
      ],
      "status_code": 200
    },
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            768
          ],
          "data": [
            0.010724226,
            0.055782758,
            0.027084162,
            ...
          ]
        }
      ],
      "status_code": 200
    }
  ]
}

2.14 result

{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding",
          "data_type": "FLOAT32",
          "shape": [
            768
          ],
          "data": [
            0.009424215,
            -0.008311393,
            0.06740056,
            ...
          ]
        }
      ],
      "status_code": 200
    }
  ]
}

What is the expected behavior?
We should not break backwards compatibility (BWC): 2.14 should return the same result as 2.13, i.e. one embedding result per input document, rather than a single result for two documents.

What is your host/environment?

  • OS: [e.g. iOS]
  • Version [e.g. 22]
  • Plugins

Do you have any screenshots?
If applicable, add screenshots to help explain your problem.

Do you have any additional context?
Add any other context about the problem.

@zane-neo (Collaborator) commented May 9, 2024

The issue is caused by the refactor of the HTTP client from sync to async. With the sync HTTP client and a user-defined preprocess function, a list of string inputs is processed sequentially: the user script always picks up the first element of the list, and the list shrinks each time a prediction completes.
With the async HTTP client, we need to calculate the total number of chunks before sending any request to the remote model endpoint, but the code doesn't handle the user-defined-script case well: the request is treated as a single batch request to the remote model. So when a user-defined script is present, we need to calculate the chunks in a different way (see the sketch below). This PR fixes the issue: #2418
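
To make the chunk calculation concrete, here is a minimal sketch of the decision the fix introduces. This is not the actual ml-commons code; the names ChunkCountSketch, calculateChunkCount, and hasUserDefinedScript are hypothetical, chosen only to illustrate the logic described above.

import java.util.List;

// Hypothetical sketch, not the real ml-commons implementation.
final class ChunkCountSketch {

    // With the async client, every outgoing request must be counted before any
    // is dispatched, so the caller knows how many responses to collect before
    // assembling the final inference_results array.
    static int calculateChunkCount(List<String> textDocs, boolean hasUserDefinedScript) {
        if (hasUserDefinedScript) {
            // A user-defined preprocess script consumes one document per call,
            // so each document must become its own request and result entry.
            return textDocs.size();
        }
        // Without a user-defined script, all documents go out as one batch.
        return 1;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("today is sunny", "hello world");
        System.out.println(calculateChunkCount(docs, false)); // batch path: 1 chunk
        System.out.println(calculateChunkCount(docs, true));  // per-document path: 2 chunks
    }
}

Under this assumption, the two-document request from the reproduction steps produces two chunks and therefore two inference_results entries, matching the 2.13 output above.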

@dhrubo-os (Collaborator) commented

PR is merged. Closing this issue.
