Avoid infinite loop in flat_object parsing #15985

Merged
merged 4 commits into from
Sep 20, 2024

Conversation

msfroh
Collaborator

@msfroh msfroh commented Sep 18, 2024

Description

The flat_object parsing logic would:

  1. Try to parse a flat_object field whose value is not an object or null.
  2. See an END_ARRAY token, ignore it, and not advance the parser.

Combined, these meant that passing an array of strings for a flat_object field would parse the string values, then loop infinitely on the END_ARRAY token.
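
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the OpenSearch code: `TokenCursor`, `countStrings`, the `advanceOnEndArray` flag, and the `maxIterations` guard are all hypothetical names invented for illustration. It only assumes a streaming parser with a current token that must be advanced explicitly, and it shows how a branch that ignores END_ARRAY without advancing keeps the loop on the same token.

```java
import java.util.Arrays;
import java.util.List;

final class FlatObjectLoopSketch {
    enum Token { START_ARRAY, VALUE_STRING, END_ARRAY }

    /** Hypothetical stand-in for a streaming parser: a current token plus explicit advancing. */
    static final class TokenCursor {
        private final List<Token> tokens;
        private int position = 0;

        TokenCursor(Token... tokens) {
            this.tokens = Arrays.asList(tokens);
        }

        Token currentToken() {
            return position < tokens.size() ? tokens.get(position) : null;
        }

        Token nextToken() {
            position++;
            return currentToken();
        }
    }

    /**
     * Counts string values until the input is exhausted. With advanceOnEndArray == false,
     * the END_ARRAY branch leaves the cursor where it is, so the loop never makes progress;
     * the maxIterations guard turns that hang into an exception for demonstration purposes.
     */
    static int countStrings(TokenCursor cursor, boolean advanceOnEndArray, int maxIterations) {
        int strings = 0;
        int iterations = 0;
        while (cursor.currentToken() != null) {
            if (++iterations > maxIterations) {
                throw new IllegalStateException("parser stopped advancing at " + cursor.currentToken());
            }
            switch (cursor.currentToken()) {
                case VALUE_STRING:
                    strings++;
                    cursor.nextToken();
                    break;
                case END_ARRAY:
                    if (advanceOnEndArray) {
                        cursor.nextToken();
                    }
                    // Otherwise the token is ignored and the next iteration sees it again.
                    break;
                default:
                    cursor.nextToken();
            }
        }
        return strings;
    }

    public static void main(String[] args) {
        Token[] arrayOfStrings = { Token.START_ARRAY, Token.VALUE_STRING, Token.VALUE_STRING, Token.END_ARRAY };
        // Advancing on END_ARRAY terminates normally.
        System.out.println(countStrings(new TokenCursor(arrayOfStrings), true, 100));
        try {
            // Ignoring END_ARRAY without advancing trips the iteration guard.
            countStrings(new TokenCursor(arrayOfStrings), false, 100);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The iteration guard exists only so the sketch terminates. The general point is that every branch of a token-handling loop must either advance the parser or exit the loop.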

Related Issues

Resolves #15982

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@msfroh msfroh added the backport 2.x Backport to 2.x branch label Sep 18, 2024
@msfroh msfroh marked this pull request as ready for review September 18, 2024 21:52
@github-actions github-actions bot added bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing labels Sep 18, 2024
@msfroh
Collaborator Author

msfroh commented Sep 18, 2024

@kkewwei -- would you mind taking a look at this change?

Contributor

❌ Gradle check result for 73e7e84: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for 18cfd45: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kkewwei
Contributor

kkewwei commented Sep 19, 2024

In addition, should we also remove currentFieldName from parseToken in this PR? (#14069 (comment))

Avoid infinite loop in flat_object parsing

We had logic in flat_object parsing that would:

1. Try parsing a flat object field that is not an object or null.
2. Would see an END_ARRAY token, ignore it, and not advance the parser.

Combined, this would create a scenario where passing an array of
strings for a flat_object would parse the string values, then loop
infinitely on the END_ARRAY token.

Signed-off-by: Michael Froh <[email protected]>

Remove some unused code and add more tests

The removed code does not actually seem to affect the logic. Also, I
want to be 100% sure that every call to parseToken is guaranteed to
call parser.nextToken() at some point.

Signed-off-by: Michael Froh <[email protected]>

Remove unused parameter from parseToken

Thanks for the reminder, @kkewwei!

Signed-off-by: Michael Froh <[email protected]>

Contributor

❌ Gradle check result for 4ee50a1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for dff498d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Add skip for newly-added test

The test fails on MixedClusterClientYamlTestSuiteIT because 2.x still
has the infinite loop until backport.

Signed-off-by: Michael Froh <[email protected]>

Contributor

✅ Gradle check result for 6ddd18b: SUCCESS


codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 73.91304% with 6 lines in your changes missing coverage. Please review.

Project coverage is 71.90%. Comparing base (3937ccb) to head (6ddd18b).
Report is 15 commits behind head on main.

Files with missing lines                               | Patch % | Lines
...opensearch/index/mapper/FlatObjectFieldMapper.java  | 70.58%  | 4 Missing and 1 partial ⚠️
...ch/common/xcontent/JsonToStringXContentParser.java  | 83.33%  | 1 Missing ⚠️

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #15985      +/-   ##
============================================
- Coverage     71.90%   71.90%   -0.01%     
+ Complexity    64392    64290     -102     
============================================
  Files          5278     5280       +2     
  Lines        300877   300864      -13     
  Branches      43478    43473       -5     
============================================
- Hits         216351   216337      -14     
+ Misses        66747    66736      -11     
- Partials      17779    17791      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jainankitk jainankitk merged commit 05dab3b into opensearch-project:main Sep 20, 2024
34 of 38 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-15985-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 05dab3b7eb54a361af3583a322f0a748d6412836
# Push it to GitHub
git push --set-upstream origin backport/backport-15985-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-15985-to-2.x.

@msfroh msfroh deleted the flat_object_invalid_token branch September 20, 2024 22:24
msfroh added a commit to msfroh/OpenSearch that referenced this pull request Sep 20, 2024
* Avoid infinite loop in flat_object parsing

We had logic in flat_object parsing that would:

1. Try parsing a flat object field that is not an object or null.
2. Would see an END_ARRAY token, ignore it, and not advance the parser.

Combined, this would create a scenario where passing an array of
strings for a flat_object would parse the string values, then loop
infinitely on the END_ARRAY token.

Signed-off-by: Michael Froh <[email protected]>

* Remove some unused code and add more tests

The removed code does not actually seem to affect the logic. Also, I
want to be 100% sure that every call to parseToken is guaranteed to
call parser.nextToken() at some point.

Signed-off-by: Michael Froh <[email protected]>

* Remove unused parameter from parseToken

Thanks for the reminder, @kkewwei!

Signed-off-by: Michael Froh <[email protected]>

* Add skip for newly-added test

The test fails on MixedClusterClientYamlTestSuiteIT because 2.x still
has the infinite loop until backport.

Signed-off-by: Michael Froh <[email protected]>

---------

Signed-off-by: Michael Froh <[email protected]>
(cherry picked from commit 05dab3b)
@msfroh
Collaborator Author

msfroh commented Sep 20, 2024

Manual backport PR: #16026

Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing
Development

Successfully merging this pull request may close these issues.

[BUG] Unbounded execution for write threads when parsing a flat_object field provided as array
3 participants