-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vendor Rust crates into source distributions #49
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cool, thank you!
I left some notes about how it might be integrated in an inline comment. Let me know what you think, maybe you had a different idea in mind.
One global comment is I'm trying to log fairly verbosely. Debug messages are usually written to a file. Info messages go to the console as progress messages, indicating that some action (especially anything that might take time) is starting.
src/fromager/vendor_rust.py
Outdated
|
||
|
||
def vendor_rust( | ||
sdist: pathlib.Path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should work out how to integrate this into the flow.
Today we have
- download-source
- prepare-source
- unpack
- patch
- prepare-build
- build-wheel
That leaves the patched source tree with build artifacts in it on the disk, and leaves the original sdist unmodified.
I think we want instead something like
- download-source
- prepare-source
- unpack
- patch
- vendor rust
- re-roll-sdist (to preserve a copy of our patched sources with the vendored rust code, before any build artifacts are written, so we can distribute the sources we use for the build)
- prepare-build
- build-wheel
If that's the flow, then I think for input we'll want this function to take the path to the source root instead of the sdist file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a good plan. I like the idea to re-roll sdist. Eventually we want to store new sdists with downstream patches into a separate repo, too.
I have changed the function. It now takes a project_dir argument and returns a bool vendor_rust(project_dir: pathlib.Path, *, shrink_vendored: bool = True) -> bool
. The project directory is the first directory in the root of a tar ball.
src/fromager/vendor_rust.py
Outdated
# Remove large, unused archive and lib files. They are only | ||
# used on Windows, macOS, and iOS. This makes the vendor bundle up to | ||
# 60% smaller. | ||
for pattern in VENDOR_FILTER_PATTERNS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make the pattern configurable, eventually? Are we likely to want to have it be part of the per-package customization interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored the code and moved the cleanup to a separate function. The cleanup now removes all **/*.a
and **/*.lib
files. It didn't work before because I didn't update the crate's checksum file. That's now fixed.
The removal of pre-compiled archive and lib files serves two purposes:
- Reduce size of vendored crates
- Prevent use of pre-compiled files
If we ever need the files, then we have to figure out how to build the crates from source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, if it's not going to be variable keeping it in the source makes sense. We can always make it configurable later, if we need to.
src/fromager/vendor_rust.py
Outdated
with open(fileame, "r", encoding="utf-8") as f: | ||
cfg = toml.load(f) | ||
|
||
modified = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to keep project-specific knowledge out of the core bits, and provide an override hook or other mechanism for a build system to inject that sort of knowledge based on the packages it wants to build. In this case, for setting the rust level, I could see us setting a global config in the settings file, or I could see a patch being applied to the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood! That makes sense.
I have removed the code that patches the MSRV in Cargo.toml
files. Let's fix this in a separate patch.
I also filed an issue and PR with pydantic-core to lower their MSRV to 1.75. Hopefully their next upstream version can be built without a patch.
src/fromager/vendor_rust.py
Outdated
raise FileNotFoundError(project_dir) | ||
|
||
manifests = list(project_dir.glob("**/Cargo.toml")) | ||
for manifest in manifests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An info log message here would provide good progress output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a bunch of log statement (info and debug) to the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed your suggestions and updated the code.
I also added a minimal Rust project with pyo3 and setuptools-rust to e2e
directory for future testing.
src/fromager/vendor_rust.py
Outdated
# Remove large, unused archive and lib files. They are only | ||
# used on Windows, macOS, and iOS. This makes the vendor bundle up to | ||
# 60% smaller. | ||
for pattern in VENDOR_FILTER_PATTERNS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored the code and moved the cleanup to a separate function. The cleanup now removes all **/*.a
and **/*.lib
files. It didn't work before because I didn't update the crate's checksum file. That's now fixed.
The removal of pre-compiled archive and lib files serves two purposes:
- Reduce size of vendored crates
- Prevent use of pre-compiled files
If we ever need the files, then we have to figure out how to build the crates from source.
src/fromager/vendor_rust.py
Outdated
with open(fileame, "r", encoding="utf-8") as f: | ||
cfg = toml.load(f) | ||
|
||
modified = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood! That makes sense.
I have removed the code that patches the MSRV in Cargo.toml
files. Let's fix this in a separate patch.
I also filed an issue and PR with pydantic-core to lower their MSRV to 1.75. Hopefully their next upstream version can be built without a patch.
src/fromager/vendor_rust.py
Outdated
raise FileNotFoundError(project_dir) | ||
|
||
manifests = list(project_dir.glob("**/Cargo.toml")) | ||
for manifest in manifests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a bunch of log statement (info and debug) to the code.
src/fromager/vendor_rust.py
Outdated
|
||
|
||
def vendor_rust( | ||
sdist: pathlib.Path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a good plan. I like the idea to re-roll sdist. Eventually we want to store new sdists with downstream patches into a separate repo, too.
I have changed the function. It now takes a project_dir argument and returns a bool vendor_rust(project_dir: pathlib.Path, *, shrink_vendored: bool = True) -> bool
. The project directory is the first directory in the root of a tar ball.
7f1499b
to
6bc1bcd
Compare
`vendor_rust()` takes an existing sdist tar ball and creates a new sdist with all Rust crates vendored. This allows offline compilation without downloading any Rust crates at build time. Switch from `tomli` to `toml`. The `tomli` package does not have write support. I have successfully wheel builds with vendored crates for - `cryptography-42.0.7.tar.gz` - `maturin-1.5.1.tar.gz` - `pydantic_core-2.18.4.tar.gz` - `setuptools-rust-1.9.0.tar.gz` with `CARGO_NET_OFFLINE=true python3 -m build -w`. Signed-off-by: Christian Heimes <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
@@ -15,6 +15,9 @@ def build_wheel(ctx, req, sdist_root_dir, build_env): | |||
if not builder: | |||
builder = default_build_wheel | |||
extra_environ = overrides.extra_environ_for_pkg(ctx.envs_dir, req.name, ctx.variant) | |||
# TODO: refactor? | |||
# Build Rust without network access | |||
extra_environ["CARGO_NET_OFFLINE"] = "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should go in default_build_wheel
? Maybe it's better to do it here so any overrides also use the same setting without having to remember to add it themselves.
@Mergifyio refresh |
✅ Pull request refreshed |
vendor_rust()
takes an existing sdist tar ball and creates a new sdist with all Rust crates vendored. This allows offline compilation without downloading any Rust crates at build time.Switch from
tomli
totoml
. Thetomli
package does not have write support.I have successfully wheel builds with vendored crates for
cryptography-42.0.7.tar.gz
maturin-1.5.1.tar.gz
pydantic_core-2.18.4.tar.gz
setuptools-rust-1.9.0.tar.gz
with
CARGO_NET_OFFLINE=true python3 -m build -w
.