Currently, source archives are downloaded directly from their canonical sources. While this works well enough for the latest packages, download links go down over time, so building packages from an older commit of this repo is almost guaranteed to fail (I tried to build `libllvm` 14 and a few archive links are already gone).
On top of that, some source websites are painfully slow. SourceForge, for example, downloads at around 5 KB/s on my network, which significantly slows down package building.
What if we maintained source archives in a hash-addressable mirror? Every mirrored archive would be retrievable via its SHA-256 checksum. This would solve both issues (dead links and slow downloads).
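As a rough sketch of what "hash-addressable" could mean in practice: the object key is just the archive's SHA-256 digest, so anyone who knows the checksum can compute the URL. The mirror host and the `sha256/<digest>` path layout below are made up for illustration and are not an existing service.

```python
import hashlib

# Hypothetical mirror host and path layout, for illustration only.
MIRROR_BASE = "https://mirror.example.org"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_url(sha256_hex):
    """Map a SHA-256 checksum to the URL the archive would live under."""
    sha256_hex = sha256_hex.lower()
    if len(sha256_hex) != 64 or any(c not in "0123456789abcdef" for c in sha256_hex):
        raise ValueError("expected a 64-character hex SHA-256 digest")
    return f"{MIRROR_BASE}/sha256/{sha256_hex}"
```

Because the key depends only on the content, the same archive referenced by several packages or several upstream URLs would be stored once.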
Such a mirror can be maintained by GitHub Actions similar to how the package registry is maintained automatically right now. When building packages, a flag can be supplied so that the builder tries to fetch the file from the mirror first.
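To make the "mirror first" idea concrete, here is a minimal sketch of the fetch logic, not the builder's actual code. The `use_mirror` parameter stands in for the proposed flag, and `MIRROR_BASE`, `fetch_source`, and `http_get` are hypothetical names. Every download is verified against the expected checksum, so a broken or stale mirror can only cause a fallback, never a bad build.

```python
import hashlib
import urllib.request

# Hypothetical mirror host, for illustration only.
MIRROR_BASE = "https://mirror.example.org"


def http_get(url, timeout=30):
    """Download a URL fully into memory (fine for a sketch)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_source(canonical_url, sha256_hex, use_mirror=False, download=http_get):
    """Try the mirror first (if enabled), then the canonical URL."""
    expected = sha256_hex.lower()
    candidates = []
    if use_mirror:
        candidates.append(f"{MIRROR_BASE}/sha256/{expected}")
    candidates.append(canonical_url)

    errors = []
    for url in candidates:
        try:
            data = download(url)
        except OSError as exc:  # urllib's URLError/HTTPError derive from OSError
            errors.append(f"{url}: {exc}")
            continue
        if hashlib.sha256(data).hexdigest() == expected:
            return data
        errors.append(f"{url}: checksum mismatch")
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

With `use_mirror=False` this behaves like a plain verified download from the canonical source, which is why the mirror can stay purely optional.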
In terms of cost, maintaining such a service shouldn't be too expensive, as there are object storage services that don't charge for egress (e.g. Cloudflare R2), which makes the storage cost predictable (it does not depend on download traffic).
On the upload side, we can leverage Cloudflare Workers for authentication and checksum validation, making the whole thing serverless.
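The checks on the upload side could look roughly like the following. This is plain Python to show the logic only, not Cloudflare Workers code, and the function name, token scheme, and status codes are assumptions on my part.

```python
import hashlib
import hmac


def validate_upload(body, claimed_sha256, token, expected_token):
    """Return (status, object_key) for an upload request.

    The object is stored under the digest of the bytes actually received,
    so an uploader cannot place content under a checksum it does not match.
    """
    # Constant-time comparison avoids leaking the token via timing.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return 401, None

    actual = hashlib.sha256(body).hexdigest()
    if actual != claimed_sha256.lower():
        return 400, None

    return 200, f"sha256/{actual}"
```

Only the CI job that maintains the mirror would hold the token; downloads need no authentication since every client verifies the checksum anyway.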
What do you guys think? I think this is especially helpful given that the registry itself does not maintain old package artifacts - this would make it more viable to build old package versions than it is right now. Such a change should also be harmless as the mirror will only be used as an optimization.