Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append to preexisting MSYS env var even if ill-formed #1580

Merged
merged 3 commits into from
Sep 6, 2024

Conversation

EliahKagan
Copy link
Contributor

@EliahKagan EliahKagan commented Sep 6, 2024

Fixes #1574

When the MSYS variable is absent, we treat this the same as when it is present but empty. However, as described in #1574, an MSYS variable that is present but whose value contains an unpaired surrogate would also be replaced entirely, rather than appending to its old value. This changes that, to instead append, retaining whatever was there even if it was ill-formed Unicode.

An alternative change could be to panic when the old value is ill-formed Unicode. The change made here allows and appends to the old value, rather than panicking or keeping and documenting the previous behavior of discarding the old value, because the appended sequence winsymlinks:nativestrict is effective at causing fixture scripts to attempt to create actual symlinks even if the preceding code point is an unpaired Unicode high surrogate.


To avoid confusion related to #1358, I tested without setting GIX_TEST_IGNORE_ARCHIVES, relying on how some fixtures must still run because archive generation is suppressed for them in .gitignore files. That failure would occur in connection with the MSYS variable even when run this way was observed in #1443 and #1444, but I have also verified that.

This PR adds and then removes tests, which were unsuitable for the reasons noted in #1578, and which I removed rather than attempting to fix due to the concerns in #1577 combined with there being little doubt, in this case, that the code using OsString::push works as it seems. If those abandoned tests are not wanted in the history, I have no objection to rebasing to remove those commits, which I would be pleased to do on request.

This gist contains transcripts of the manual verification done to decide on and check over this approach. The scripts run in a normal environment, as well as in an environment in which the old value of MSYS has a trailing unpaired Unicode high surrogate. Failures occur due to the absence of MSYS when the code that sets it is removed, which verifies both that this code does not break setting it, and that the appended space character and winsymlinks option takes effect properly even if it immediately follows an unpaired high surrogate that "tries" to combine with the next code point. Testing with a panic! temporarily added, by the same technique as #1574, confirms that we really are concatenating.

All full test runs have an unrelated failure of jj_realistic_needs_to_be_more_clever locally, which is due to #1575 and can therefore be disregarded in the transcripts of local test runs here. Between tests, I removed generated fixture script output by using the technique discussed in #1435 of running gix clean -xd -m '*generated*' -e, which is shown in the gist transcripts. I did it that way rather than by running the stronger command gix clean -xde so the unpaired surrogate in the MSYS environment variable would not stall the build due to rust-lang/libz-sys#215. The command used is strong enough, but to doubly ensure insufficient cleaning would not cause misinterpreted results, I ran the tests with the expected failure modification last of all the full runs, so that if left-over fixture output was going to cause tests to wrongly pass, then it would happen with the tests I was verifying should fail.

The value of an environment variable as obtained by the facilities
in `std::env` is not always well-formed Unicode. Specifically, on
Windows the values of environment variables, like paths, are
natively UTF-16LE strings except that unpaired surrogate code
points can also occur. An `&OsStr` on Windows may accordingly not
quite be UTF-8.

When the `MSYS` variable is absent, we treat this the same as when
it is present but empty. However, as described in Byron#1574, an `MSYS`
variable that is present but whose value contains an unpaired
surrogate would also be replaced entirely, rather than appending to
its old value.

This changes that, to instead append, retaining whatever was there
even if it was ill-formed Unicode.

An alternative change could be to panic when the old value is
ill-formed Unicode. This commit allows and appends to the old
value, rather than panicking or keeping and documenting the
previous behavior of discarding the old value, because the appended
sequence ` winsymlinks:nativestrict` is effective at causing
fixture scripts to attempt to create actual symlinks even if
the preceding code point is an unpaired Unicode high surrogate.
Because they are not really testing the right thing, due to the
distinction between inherited and `.env()`-specified environment.

See Byron#1578.

While for `configure_command_clears_external_config` this slightly
diminishes but largely leaves intact the value of the test, the
`configure_command_msys` and `configure_command_msys_extends` tests
are really verifying almost nothing, and the latter (for this
reason) cannot pass.

A variant of it for testing the preceding fix to append to
arbitrary environment variable values would likewise not be able to
pass. So that is not added at this time.

Because the `configure_command_clears_external_config` test remains
useful, it is not removed.
Copy link
Owner

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

I should definitely have used push and var_os from the beginning, and will try to remember that (at least by now) OsString has some ways of altering it. In my head, it's still so that OsStr is known to be able to do nothing, so it's hard to work with.

@Byron Byron merged commit d00235a into Byron:main Sep 6, 2024
15 checks passed
@EliahKagan
Copy link
Contributor Author

that (at least by now) OsString has some ways of altering it

It occurs to me that I do not always check when features are introduced. I am hoping that if this feature came in later than the project MSRV, then one of the clippy checks on CI would catch it and fail. (If that is not the case, please let me know.)

@EliahKagan EliahKagan deleted the msys branch September 6, 2024 08:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

An old value of MSYS with unpaired surrogates is silently replaced for fixtures
2 participants