Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes Ion 1.1 symbols for opcodes E1-E3 #844

Merged
merged 3 commits into from
May 10, 2024

Conversation

popematt
Copy link
Contributor

@popematt popematt commented May 9, 2024

Issue #, if available:

None

Description of changes:

Prior to this change, Op codes E1 - E3 were being handled incorrectly because of several problems:

  • Ion 1.0 and Ion 1.1 symbol values with SIDs were both being treated as if they were encoded as a (big endian, Ion 1.0) UInts.
  • IonReaderContinuableCoreBinary.readFixedUInt_1_1() would start reading at peekIndex and stop at valueMarker.endIndex, but peekindex was not necessarily being set.
  • The IonTypeID for Opcode E3 has a length of -1, and when reading the next value, IonCursorBinary was blindly setting valueMarker.endIndex to one less than the start index.

Changes:

  • symbolValueId() now branches based on the Ion version. If it is Ion 1.1, it reads FixedUInt rather than UInt.
  • IonReaderContinuableCoreBinary.readFixedUInt_1_1() now accepts a start and end index.
  • If an Ion 1.1 IonTypeID doesn't have a length prefix (i.e. variableLength == false), and it has length == -1, (and it's not delimited), then slowReadValueHeader() and calculateEndIndex_1_1() now assume it to be a FlexInt or FlexUInt, and the length is calculated by looking at the continuation bits of the value.
  • Adds Ion 1.1 SID symbol value test cases to IonReaderContinuableCoreBinaryTest.

Why didn't we notice this sooner? Lack of proper unit testing, and maybe a bit of (bad) luck. Most of our test suite has symbol tables with less than 256 symbols, which means that (a) for the SIDs we were testing, there was no practical difference between big and little endian since they were only one byte long, and (b) we were not exercising opcodes E2 and E3 in any meaningful way.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tgregg
Copy link
Contributor

tgregg commented May 9, 2024

Why didn't we notice this sooner?

We just hadn't implemented the feature yet, then we forgot about it. That explains the lack of tests. Everything was just following the existing 1.0 code path.

Copy link
Contributor

@tgregg tgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like spotless is complaining about SeekableReaderTest in this build. I changed that in #842 on master, and applied our spotless rules to it. Why is it causing a failure on this branch? Spotless always compares against changes from master?

Copy link
Contributor

@tgregg tgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, noting the ktlint failure.

@popematt
Copy link
Contributor Author

popematt commented May 9, 2024

Spotless always compares against changes from master?

Yes. It does. If I can figure it out, I will change it to use the commit from master that is the closest ancestor of the current commit—that way divergence between master and a feature branch won't cause sudden failures like this.

@popematt popematt merged commit a4ff2e7 into amazon-ion:ion-11-encoding May 10, 2024
14 of 34 checks passed
tgregg pushed a commit that referenced this pull request Jun 28, 2024
tgregg pushed a commit that referenced this pull request Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants