Add deserialize-utf8-lossy feature to always deserialize using lossy UTF-8 conversion #1187
Also consider making this the default to improve interoperability with other drivers.
Thanks for the contribution! I don't think we'll want to make this the default to avoid breaking anyone relying on the existing validation, but it seems quite helpful as an opt-in feature. If you're willing, could you share more information about the situation that motivates this? The BSON spec says that strings are UTF-8, so at least in theory drivers shouldn't be writing values that require this.
Yes, I agree that in a perfect world this would be rejected by all drivers and also the MongoDB server itself. However, at least the Java driver doesn't always produce valid UTF-8 when writing to a collection.
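For readers following along, the difference between the existing strict validation and a lossy conversion can be sketched with the Rust standard library alone. This is only an illustration of the general technique, not the driver's or the BSON library's implementation; the helper names `decode_strict` and `decode_lossy` are hypothetical, and the sample bytes are made up rather than taken from a real collection.

```rust
use std::str::Utf8Error;

/// Hypothetical helper: strict decoding, which rejects any invalid UTF-8.
fn decode_strict(bytes: &[u8]) -> Result<String, Utf8Error> {
    std::str::from_utf8(bytes).map(str::to_owned)
}

/// Hypothetical helper: lossy decoding, which replaces each invalid
/// sequence with U+FFFD (the Unicode replacement character).
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // Made-up input: 0xE9 is "é" in Latin-1, but on its own it is an
    // incomplete multi-byte sequence in UTF-8.
    let bytes = b"caf\xE9";

    // Strict decoding fails on the whole value.
    assert!(decode_strict(bytes).is_err());

    // Lossy decoding keeps the valid prefix and substitutes U+FFFD.
    assert_eq!(decode_lossy(bytes), "caf\u{FFFD}");

    println!("{}", decode_lossy(bytes));
}
```

The trade-off is the one raised above: strict decoding surfaces bad data as an error, while lossy decoding lets the read succeed at the cost of silently altering the string.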
LGTM!
Hi @tyilo, apologies for the delay here! UTF-8 lossy deserialization is a concept that we'd like to keep contained to the Here's the basic idea of what using this type would look like:
You could also use this type with
@isabelatkinson Seems to work great.
@tyilo great to hear. I just merged in the addition to the BSON library, and it will be included in that crate's next release. Going to close this out - thanks for bringing this issue to our attention!
Useful if you need to read from a collection created by a driver for another programming language.
See #799