Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added explicit encoding to all open statements. #407

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

korsbakken
Copy link
Collaborator

Added encoding="utf-8" to all calls to open that open a yaml file or stream, where encoding was not already specified. This is intended to deal with inconsistencies that arise from not specifying or inconsistently specifying encoding on platforms where utf-8 is not necessarily the default text file encoding.

I have only checked for calls to open. There might be other types of calls that could be affected by platform-dependent encoding defaults and that don't specify encoding, such as calls through pyam or pandas methods, code that deals with Excel files. But especially in the latter case, it's maybe less clear that we should enforce UTF-8.

This PR addresses #406 and potentially resolves it. Unfortunately I don't have a suitable Windows system to test on.

Added `encoding="utf-8"` to all calls to `open` that open a text-based file or
stream, where encoding was not already specified. This is intended to deal with
inconsistencies that arise from not speciyfing or inconsistently specifying
encoding on platforms where utf-8 is not necessarily the default text file
encoding.
Copy link
Contributor

@phackstock phackstock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @korsbakken for catching and fixing this oversight.
With regards to Excel, we should be fine since the information about the encoding used should be part of the file itself.
Since we run all of our tests in GitHub on Windows, MacOS and Ubuntu you can just add a test for reading a character that would require utf-8. This way we can verify that it at least works on the GitHub actions version of Windows. It might also be interesting to see if we can reproduce the error in a GH action.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants