Ensure CopyElement does not break UTF-8 characters #192

Merged 4 commits on Oct 11, 2023

Commits on Sep 15, 2023

  1. b078b46

Commits on Sep 29, 2023

  1. f913251
  2. test(common): add reproducer test for issue #188

    This reproducer test ensures the output is not mangled by broken
    UTF-8 multibyte chars again.
    poikilotherm committed Sep 29, 2023
    5057b23

Commits on Sep 30, 2023

  1. fix(common): apply more low-level fix for broken UTF-8 chars #188

    With this commit, we introduce an analysis routine that inspects
    the last few bytes of each read in CopyElement. This is faster
    than converting to a String and checking for the Unicode replacement
    character (U+FFFD) over and over again.
    
    By using a buffered input stream, we can rewind the stream if necessary
    and read again only up to the last complete character.
    
    Extensive tests were added to make sure the analysis function works
    with multibyte UTF-8 chars of any length.
    poikilotherm committed Sep 30, 2023
    53f8222
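
The approach described in the last commit message (inspect the trailing bytes of a chunk, and rewind a buffered stream when the chunk ends inside a multibyte character) could look roughly like the sketch below. This is not the code from the PR: the class `Utf8ChunkSketch` and the methods `incompleteTailLength` and `readChunks` are hypothetical names chosen for illustration, and the sketch assumes a chunk size of at least 4 bytes and otherwise well-formed UTF-8 input.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the implementation used in the PR.
final class Utf8ChunkSketch {

    /**
     * Returns how many bytes at the end of buf[0..len) belong to a UTF-8
     * sequence that is cut off (1 to 3), or 0 if the buffer ends on a
     * character boundary.
     */
    static int incompleteTailLength(byte[] buf, int len) {
        // A UTF-8 character is at most 4 bytes long, so looking at the
        // last 4 bytes is enough to find the lead byte of the final character.
        for (int back = 1; back <= Math.min(4, len); back++) {
            int b = buf[len - back] & 0xFF;
            if ((b & 0xC0) == 0x80) {
                continue; // continuation byte (10xxxxxx): keep walking back
            }
            int expected;
            if (b < 0x80) expected = 1;                 // ASCII
            else if ((b & 0xE0) == 0xC0) expected = 2;  // 110xxxxx
            else if ((b & 0xF0) == 0xE0) expected = 3;  // 1110xxxx
            else if ((b & 0xF8) == 0xF0) expected = 4;  // 11110xxx
            else return 0;                              // invalid lead byte: left to the decoder
            return back < expected ? back : 0;
        }
        return 0; // only continuation bytes found: nothing to repair here
    }

    /** Reads the stream in chunks of at most chunkSize bytes (assumed >= 4) without splitting characters. */
    static List<String> readChunks(InputStream raw, int chunkSize) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        List<String> chunks = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        while (true) {
            in.mark(chunkSize);                       // remember the chunk start
            int n = in.readNBytes(buf, 0, chunkSize); // returns 0 at end of stream
            if (n <= 0) {
                break;
            }
            int tail = incompleteTailLength(buf, n);
            if (tail > 0 && tail < n) {
                in.reset();                           // rewind to the chunk start
                n = in.readNBytes(buf, 0, n - tail);  // re-read without the cut-off character
            }
            chunks.add(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a\u00e9\u20ac\uD83D\uDE00b".getBytes(StandardCharsets.UTF_8);
        for (String chunk : readChunks(new ByteArrayInputStream(data), 4)) {
            System.out.println(chunk);
        }
    }
}
```

Working on the raw bytes avoids decoding each chunk just to look for U+FFFD, which is the trade-off the commit message points to. The `mark`/`reset` pair only works here because the read limit passed to `mark` covers the whole chunk.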