-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/utf 8 #3935
Fix/utf 8 #3935
Conversation
Can you please base the PR on the current master so you aren't comparing across branches and seeing hundreds of files in the diff? Thanks. |
Also, would you mind please making sure to sign your commits to invoke the DCO? Thanks. |
a0115a2
to
76f527f
Compare
This is somewhat problematic as large parts of [1] https://en.cppreference.com/w/cpp/locale/wstring_convert |
I like the intent of this PR. It would be great to burn artists actual names in a slate for instance. |
@jessey-git I'm not sure exactly why it seems to pass CI for us (including the test where we do use C++20), but Godbolt certainly makes a compelling case that this is problematic. What's the preferred approach these days? Can the implementation we are using in Strutil::utf8_to_unicode be repaired to fix whatever shortcoming inspired this PR? |
For the windows builder it might be because it's using For the linux builders I think it's a combination of not using a warning level that would show the warnings and/or not using Unfortunately I don't quite know what's state-of-the-art in terms of unicode in c++ nowadays. |
It was not me who did it, but I have a feeling that the unreliability of wstring_convert (not existing prior to C++11, and perhaps also knowing that a deprecation was on the way for C++17), is what inspired the unicode decoding from http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ being added to Strutil. The nice thing about that it that it's small and independent of the std or any other dependencies. The down side, of course, is that there appear to be some sequences it doesn't handle properly, which had never been reported prior to this PR. I still think that if this code can be repaired (or substituted with a different implementation with a license acceptable to us), that would be better than relying on std::wstring_convert. But I definitely am in favor of some kind of fix; @progweb is correct that we should render those codes properly. |
@progweb Can you supply any test case that we can try out other solutions (such as whatever made your example above)? I noticed that the code in Strutil is not the only formulation on that web page it came from; maybe it is the only buggy one and the others work properly? But I don't have a good test. |
I'm going to try to fix it with an other implementation... |
Signed-off-by: Nicolas VIVIEN <[email protected]>
Work as expected.
|
Signed-off-by: Larry Gritz <[email protected]>
I took the liberty of adding a test for this and pushing it to your tree. Once we see that it passes CI, I'll do the merge. |
Merged, and I'll backport to the next 2.4 release. Note that I disable the test I added on Windows, because I just couldn't figure out how to embed the unicode on the Windows command line. Seems to work fine for Linux & MacOS. If anybody knows the right way to fix, please suggest and we can re-enable that aspect of the test for Windows as well. |
Text rendering doesn't work with utf-8... I can't render chinese, arabic... or special chars as éèûîà... It seems that the string function to convert utf-8 to unicode is quiet simple. Fixes AcademySoftwareFoundation#3933 [Note]: OpenImageIO is used in the [gpx2video project](https://github.com/progweb/gpx2video). --------- Signed-off-by: Nicolas VIVIEN <[email protected]> Signed-off-by: Larry Gritz <[email protected]> Co-authored-by: Larry Gritz <[email protected]>
Text rendering doesn't work with utf-8...
I can't render chinese, arabic... or special chars as éèûîà...
It seems that the string function to convert utf-8 to unicode is quiet simple.
Ref: #3933
[Note]: OpenImageIO is used in the gpx2video project.