
Recommended URLs do not have stable checksums #341

Open
abentley-ssimwave opened this issue Jan 30, 2023 · 8 comments

Comments

@abentley-ssimwave

The README recommends https://github.com/nelhage/rules_boost/archive/96e9b631f104b43a53c21c87b01ac538ad6f3b48.tar.gz but this does not have a stable binary representation.

We just had this happen to us: the sha256 hash of https://github.com/nelhage/rules_boost/archive/c1d618315fa152958baef8ea0d77043eebf7f573.tar.gz changed from eb25099249f01be30eb0f56a8e22b28db648b4e74b6bf0a1c8bf73c975ce61f to 9e51084241f67207adbcb36d7a99b98091b5d4f7e0d47fd4ec5dfb15e87f28b3

This appears to be a change in the gzip compression.
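To illustrate why a compression change alone moves the checksum, here is a minimal Python sketch. The payload is a made-up stand-in for an uncompressed tar stream, and the two compression levels are only an assumption chosen to force different output; the actual change on GitHub's side was in the gzip implementation, not necessarily the level.

```python
import gzip
import hashlib

# Hypothetical stand-in payload; a real case would be the uncompressed tar stream.
payload = "\n".join(f"line {i}: example source text" for i in range(5000)).encode()

# Same bytes in, two different compression settings, timestamp pinned to 0.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=0)

# The archive hashes disagree even though the decompressed content is identical.
print("archive hashes equal:", hashlib.sha256(fast).hexdigest() == hashlib.sha256(best).hexdigest())
print("contents equal:     ", gzip.decompress(fast) == gzip.decompress(best))
```

A sha256 pinned in a WORKSPACE file covers the compressed bytes, so any difference of this kind breaks the pin even when the source tree is unchanged.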

This happens from time to time, e.g.
https://twitter.com/shs96c/status/1488480089700503558

An effective way to have stable binary contents is to upload files to releases. Note: If no length is shown for a file, GitHub is not guaranteeing a specific binary representation. For example, here: https://github.com/abentley/oaf/releases/tag/v0.1.4 the "Source code" files don't have a stable binary representation, but the remaining ones do (and also have a length).

AFAIK, no files of the form "https://github.com/nelhage/rules_boost/archive" will have a stable binary representation.

@neumann-nico

@abentley-ssimwave thank you for the analysis!
I just had the same problem. Did you find a temporary solution to solve this issue?

@abentley-ssimwave
Author

Yes: download the new file, gunzip it, then gzip it with "gzip -n6". That will reproduce the original bitstream. The change was due to the Git project switching from executing the gzip binary to using a library.

git/git@4f4be00

Note that GitHub has decided to revert this change:
https://github.blog/changelog/2023-01-30-git-archive-checksums-may-change/
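The gunzip-then-`gzip -n6` workaround described above could be scripted roughly as follows. This is a sketch, not a verified reproduction: it assumes a `gzip` binary is on `PATH`, the function names are hypothetical, and whether the output matches the previously pinned hash depends on the local gzip behaving like the one GitHub used.

```python
import gzip
import hashlib
import shutil
import subprocess


def recompress_like_gzip_n6(archive_bytes: bytes) -> bytes:
    """Gunzip in memory, then recompress with the system `gzip -n -6`."""
    if shutil.which("gzip") is None:
        raise RuntimeError("this sketch needs a gzip binary on PATH")
    raw_tar = gzip.decompress(archive_bytes)
    # -n omits the name and timestamp, -6 is the compression level,
    # -c writes the result to stdout instead of a file.
    result = subprocess.run(
        ["gzip", "-n", "-6", "-c"],
        input=raw_tar,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def sha256_hex(data: bytes) -> str:
    """Hex sha256 of a byte string, for comparison with the pinned value."""
    return hashlib.sha256(data).hexdigest()
```

Reading the freshly downloaded archive into bytes, passing it through `recompress_like_gzip_n6`, and comparing `sha256_hex` of the result against the old pinned value would show whether the workaround applies to a given archive.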

@cpsauer
Collaborator

cpsauer commented Jan 31, 2023

Hey guys! Thanks for the heads up. I'm not the main maintainer (@nelhage) who should make release change decisions. But...

It seems like this will be a fairly general problem for Bazel and GitHub-based build/package management at large if changes continue. A lot of Bazel repos (following Abseil) have a live-at-HEAD philosophy, and I know this temporarily broke lots of non-Bazel tools. This is definitely something we should track. That said, latest-commit-is-release does have simplicity advantages, so if GitHub does end up promising stability in the wake of reverting, I'm guessing he may want to stick with the current release model.

@abentley-ssimwave
Author

I appreciate that changing approaches can be painful, but this kind of thing has been going on since at least 2017:
libgit2/libgit2#4343 (comment)

I think Bazel projects should use a deterministic source of tarballs, even if that is a bit more work.

Or perhaps, Bazel could provide a way of hashing the untarred results instead of the tarball. Though I suppose there's some risk of tar or gzip exploits in that case.
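As a rough sketch of what hashing the untarred results could look like (this is not an existing Bazel feature, and `content_digest` is a hypothetical name), the following Python digests file names and contents rather than the compressed bytes. It deliberately ignores modes, symlinks, and empty directories, and it has to parse the archive before anything is verified, which is exactly the exploit risk mentioned above.

```python
import hashlib
import io
import tarfile


def content_digest(tar_gz_bytes: bytes) -> str:
    """sha256 over the names and contents of regular files in a .tar.gz."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(tar_gz_bytes), mode="r:gz") as archive:
        # Sort by name so member order inside the tarball does not matter.
        members = sorted(
            (m for m in archive.getmembers() if m.isfile()),
            key=lambda m: m.name,
        )
        for member in members:
            data = archive.extractfile(member).read()
            name = member.name.encode("utf-8")
            # Length-prefix each field so name/content boundaries are unambiguous.
            digest.update(len(name).to_bytes(8, "big"))
            digest.update(name)
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()
```

Two archives of the same tree that differ only in gzip settings or tar timestamps would get the same digest here, while any change to a file name or file content would change it.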

@abentley-ssimwave
Author

A git tree / commit OID would be a good identifier, if it was sha256...
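For context on why an OID is independent of compression: Git derives it from the object's type, length, and content. The sketch below covers only blobs, the simplest case; trees and commits are hashed the same way over their own serialized forms. The `algorithm` parameter is an assumption for illustration, reflecting that repositories use SHA-1 by default and that SHA-256 is a separate object format.

```python
import hashlib


def git_blob_oid(content: bytes, algorithm: str = "sha1") -> str:
    """Object ID Git would assign to a blob with this content."""
    # Git hashes a small header ("blob <length>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.new(algorithm, header + content).hexdigest()


# Matches `echo 'hello world' | git hash-object --stdin` in a SHA-1 repository.
print(git_blob_oid(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because nothing about tar or gzip enters the hash, the identifier stays the same however the archive is produced; the open concern in the comment is only the strength of SHA-1.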

@cpsauer
Collaborator

cpsauer commented Jan 31, 2023

^ That could still be a pretty good solution if GitHub doesn't offer guarantees, and also for dependencies that haven't conformed. I'd want to check whether Renovate can still auto-update them, but I'd think so.

@Vertexwahn
Collaborator

@cpsauer
Collaborator

cpsauer commented Apr 15, 2023

Good cross-references!
I'll also add GitHub's update: https://github.blog/2023-02-21-update-on-the-future-stability-of-source-code-archives-and-hashes/

I don't think I'm the decision maker on this one, but I do still think that live-at-HEAD (using commits as releases) is handy enough that it'd be good to preserve. Further, it's used by transitive dependencies of this repo, so we'll have to work around it regardless. Perhaps the move is to switch to git archive, indexing off the commit, both internally and externally, if Renovate supports it? What do you think?
