Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support remote scripts with uv run #6375

Open
wants to merge 25 commits into
base: main
Choose a base branch
from

Conversation

manzt
Copy link
Contributor

@manzt manzt commented Aug 21, 2024

First off, congratulations on the 0.3 release! The PEP 723 standalone scripts support is awesome, and I can already imagine a long tail of little scripts of my own that would benefit from this functionality.

Background

I really like the Deno CLI's support for running and installing remote scripts.

deno run <url>
deno install --name foo <url>

I can see parallels with uv run and uvx. After mentioning this on Discord, @zanieb suggested I could take a stab at a PR to implement similar functionality for uv.

Summary

This PR attempts to add support for executing remote standalone scripts directly with uv run. While this is already possible by downloading the script (i.e., via curl/wget) and then using uv run, having direct support would be convenient.

The proposed functionality is:

uv run <url>

Another addition/alternative could be to support running scripts via stdin:

curl -sL <url> | uv run -

But that is not implemented in this PR.

Test Plan

I noticed that GitHub and files.pythonhosted.org URLs are used in some of the tests. I've created a personal GitHub Gist with the example from PEP 723 for now to test this functionality.

However, I couldn't figure out how to get the with_snapshot config filter to filter out the tempfile path, so the test is currently failing. Any assistance with this would be appreciated.

Notes

I'm not totally pleased with the implementation of this PR. I think it would be better to handle the case earlier (and probably reuse the cache), and avoid mutation, but since run command requires a local path this was the simplest implementation I could come up with.

I know that performance is paramount with uv so I totally understand if this requires a different approach or something more explicit to avoid "inferring" the path. I'm just taking this as an opportunity to learn a little more Rust and acquaint myself with the code base. cheers!

@zanieb
Copy link
Member

zanieb commented Aug 21, 2024

Awesome! Thanks for putting this up.

@zanieb zanieb self-assigned this Aug 21, 2024
@zanieb zanieb added the enhancement New feature or request label Aug 21, 2024
crates/uv/tests/run.rs Outdated Show resolved Hide resolved
@mgaitan
Copy link

mgaitan commented Aug 23, 2024

Amazing feature. Imagine this with #6283

@zanieb
Copy link
Member

zanieb commented Sep 17, 2024

Sorry for the delay, I've been stretched thin on reviews.

@BurntSushi would you be able to take over review of this?

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before getting into the weeds of how this is implemented, I wanted to pop-up a level and think about this feature more holistically. And I think I do have a significant concern here.

My main concern is that it is technically possible for a valid URL to also be a valid file path. For example:

$ mkdir 'https:'
$ echo wat 'https:/example.com'
$ cat https://example.com
wat

If uv run https://example.com were used, then I believe this PR would result in uv doing an HTTP request instead of using the file path on disk.

Now, I'm not especially worried about this happening in normal usage. The double slash for example is clearly a bit of subterfuge to provoke the ambiguity. So I think the thing where I'm worried is if someone somehow causes a uv run command to run where the end user thinks it's hitting a local file but it's actually hitting a URL. I had a bug similarish to this in ripgrep a while back where end users thought the system gzip executable was being used, but if they cloned a repository containing a gzip.exe on Windows and ran rg from that same directory, then a quirk of Windows would result in executing the gzip.exe in the clone instead.

There are a lot of different possible mitigations to this. Or perhaps this isn't as much of a problem as I think. For example:

  • This could be a feature that requires some opt-in flag to be enabled in our config.
  • The default mode could be to show the Python script that will be executed on stdout first and then provide a prompt confirming it. And then add a --no-confirm (or whatever) flag to make it non-interactive.
  • We could look for a file path matching the URL, and if it exists, either use the local file or report an error.

Maybe there's another idea I haven't thought of.

crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
@zanieb zanieb assigned BurntSushi and unassigned zanieb Sep 17, 2024
@zanieb
Copy link
Member

zanieb commented Sep 17, 2024

I generally have a preference for the "We could look for a file path matching the URL, and if it exists, either use the local file or report an error." option rather than prompting. Opt-in might make sense, but via a --remote flag rather than a persistent configuration option (maybe that's what you meant though).

@zanieb
Copy link
Member

zanieb commented Sep 17, 2024

See also #7396 (comment)

@BurntSushi
Copy link
Member

If we go the route of prioritizing a local file when given a URL when a file exists, then I think that probably satisfies me. I can't think of a way for that to go wrong from a "accidentally executed a remote script instead of the intended local file" perspective.

There may still be an argument in favor of a --remote flag, but more in a vague "otherwise it's too easy to run remote scripts" sense. I don't feel as strongly about that personally.

@manzt What do you think?

@manzt
Copy link
Contributor Author

manzt commented Sep 18, 2024

Thank you for raising this concern. I agree it's a valid issue I hadn't considered.

My preference aligns with @zanieb's suggestion to prioritize checking for a local file path matching the URL first, erroring if a match is found. For context, I checked your example with deno, which always makes the HTTP request (though it operates in a more sandboxed environment than Python, not sure why they decided on this behavior).

Regarding a --remote flag, while I don't have strong arguments for or against it, I appreciate its explicitness. The common use case I envision is sharing links to Gists or GitHub, where adding --remote before execution seems a reasonable safeguard (especially if the readme just has a copy link).

It's worth noting that with #6467, I believe scripts can already be executed via curl <url> | uv run - (though last time I checked the script metadata wasn't parsed from stdin). However, handling this directly in uv does seem really nice UX.

Apologies for the delayed response!

@BurntSushi
Copy link
Member

All righty, let's go with checking for whether a local file exists or not.

@charliermarsh Do you have any opinions here? Do you think uv run https://example.com/foo.py should "just work," or do you think there should be an extra step needed, e.g., uv run --remote https://example.com/foo.py? (Above we discussed mitigating the case where https://example.com/foo.py actually refers to a valid file path, in which case, we would just read the local file instead.)

@zanieb
Copy link
Member

zanieb commented Sep 19, 2024

I sort of prefer "just works"

@BurntSushi
Copy link
Member

Same.

@charliermarsh
Copy link
Member

Same. Whatever we do, it would be nice to align the behavior of requirements.txt files, which can also be either remote or local.

@manzt
Copy link
Contributor Author

manzt commented Sep 19, 2024

Seems like agreement, I’ll start addressing the prior comments and handle the file behavior!

@manzt
Copy link
Contributor Author

manzt commented Sep 25, 2024

Thanks for your patience. I think I was able to achieve the desired behavior and added another test.

I'm not sure how we should plan on testing the remote behavior (right now still depends on a gist with a fixture of my own).

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! This is coming along. :-) I've requested another round of changes and suggested a way forward with respect to testing.

crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/tests/run.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@manzt manzt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Learning a lot.

Just a clarifying question about Path::try_exists.

crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
@manzt
Copy link
Contributor Author

manzt commented Sep 26, 2024

No worries, I opened #7713

@manzt
Copy link
Contributor Author

manzt commented Sep 26, 2024

Test is working now, pointing to #7713. Will switch to hash on main once merged.

crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
BurntSushi pushed a commit that referenced this pull request Sep 26, 2024
Fixes broken test in #6375

Tested here with:

```sh
cargo run run --exclude-newer=2024-03-25T00:00:00Z scripts/uv-run-remote-script-test.py CI
```
@BurntSushi
Copy link
Member

The CI failure looks legitimate. My guess is that the file path is just totally invalid on Windows, which in turn causes Path::new(...).try_exists() to fail and thus bail out of the remote logic. So I think I would change the if to this:

    // Only continue if we are absolutely certain no local file exists.
    //
    // We don't do this check on Windows since the file path would
    // be invalid anyway, and thus couldn't refer to a local file.
    if cfg!(unix) && !matches!(Path::new(target).try_exists(), Ok(false)) {
        return Ok(None);
    }

Also, I think we need to add docs to uv run mentioning this feature. :)

@manzt
Copy link
Contributor Author

manzt commented Sep 27, 2024

Thanks for the tips!

Also, I think we need to add docs to uv run mentioning this feature. :)

I took a stab at updating the docs.

@BurntSushi
Copy link
Member

I think you need to run cargo dev generate-cli-reference. Basically, changes to the docs need to be propagated out to other places.

@manzt
Copy link
Contributor Author

manzt commented Sep 27, 2024

Ah, sorry... wasn't sure what the source of truth was. Thanks!

@manzt
Copy link
Contributor Author

manzt commented Sep 27, 2024

Any thoughts on how to update the test checking for the local file? Is there a way to mark that a test should fail on windows for example?

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on how to update the test checking for the local file? Is there a way to mark that a test should fail on windows for example?

I don't think it should fail though? It looks like it's failing because it can't write a temp file? I'm not sure what the issue is there.

crates/uv/src/lib.rs Outdated Show resolved Hide resolved
docs/reference/cli.md Outdated Show resolved Hide resolved
@manzt
Copy link
Contributor Author

manzt commented Sep 27, 2024

I don't think it should fail though? It looks like it's failing because it can't write a temp file? I'm not sure what the issue is there.

I think on Windows we skip checking for a local file, which would result in requestinghttps://example.com/path/to/main.py

It looks like it's failing because it can't write a temp file?

Right, I'm also not sure of this error... but if we were to write the tempfile I think it would still fail. Maybe I'm missing something.

@BurntSushi
Copy link
Member

I think on Windows we skip checking for a local file, which would result in requestinghttps://example.com/path/to/main.py

OK, so you're referring to the run_url_like_with_local_file_priority test here. I was looking at run_remote_pep723_script.

For run_remote_pep723_script, I think the test should pass.

For run_url_like_with_local_file_priority, right, the thing being tested doesn't make sense on Windows. So I would probably just add a #[cfg(unix)] on that test with a comment explaining why that's there.

@manzt
Copy link
Contributor Author

manzt commented Sep 27, 2024

OK, so you're referring to the run_url_like_with_local_file_priority test here. I was looking at run_remote_pep723_script.

Yes, apologies for the confusion. I added the #[cfg(unix)] with a comment.

@manzt
Copy link
Contributor Author

manzt commented Sep 27, 2024

For run_remote_pep723_script, I think the test should pass.

Agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants