Finish review on initial implementation #15
Comments
So I agree that transparency in what is being run is definitely something to consider, but the problem is that many of the pip options used will just not work locally (and are not actually needed locally; the options are in quite different categories from those you use with Bundler). And for several options we make it so that people don't even need them if running from a run image either. (You mention …)

Lastly, I think in many of the potential failure modes, what needs to be clearest to the end user is the action they need to take, and in some cases their action is "try again, since this isn't your fault". In those cases, including the whole command run actually makes the error message more verbose and potentially dilutes the intended message. As such, I really think this has to be decided on a case-by-case basis. I'll continue to think about this, however.
None of my subprocess commands have their output captured (intentionally). The helper does more than just stream; it also checks the exit code, etc. So I think the generic name of "run" seems to capture all the things it does, whereas "stream" might sound like it only streamed? (If I had two helpers, and one streamed and the other didn't, then yeah.) However, you mentioning this made me realise the rustdocs for …
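For illustration, here's a rough sketch of the shape of such a helper (the names and error variants are hypothetical, not the buildpack's actual code): it streams by inheriting stdio while still checking the exit status.

```rust
use std::process::{Command, ExitStatus};

/// Possible errors from the hypothetical helper below.
#[derive(Debug)]
enum RunCommandError {
    Io(std::io::Error),
    NonZeroExitStatus(ExitStatus),
}

/// Hypothetical `run_command`-style helper: output is not captured (stdout and
/// stderr are inherited, so they stream straight to the user), but the exit
/// status is still checked so callers get an error on failure.
fn run_command(command: &mut Command) -> Result<ExitStatus, RunCommandError> {
    let status = command.status().map_err(RunCommandError::Io)?;
    if status.success() {
        Ok(status)
    } else {
        Err(RunCommandError::NonZeroExitStatus(status))
    }
}

fn main() -> Result<(), RunCommandError> {
    // Example: stream `pip --version` to the user and fail on a non-zero exit.
    run_command(Command::new("pip").arg("--version"))?;
    Ok(())
}
```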
This is similar to the discussion we had on the PR. I would like to dig more into this at some point, which is why there is a ~TODO here: buildpacks-python/src/errors.rs, line 280 in 60be5bb.
I've also now filed GUS-W-12650236 to track.
Unfortunately, the GitHub diff links aren't jumping to the correct place in the PR (it seems every URL is identical?), so I'm having to guess which lines were being referenced. (I believe it's possible to leave a comments-only review on a closed PR, so next time could you leave it on the PR rather than an issue?) If this was in reference to the early return in …

Also, I'm not convinced this type of suggestion belongs in a PR review, since which to use here comes down to personal style or a decision made on a case-by-case basis (and is not a unilateral best practice), and really doesn't affect readability at all. If this were a style enforced by Clippy or rustfmt, then I'd be more than happy to go with that; however, as-is it feels like someone else's personal style guide is being enforced on other repos without us having agreed on it as a team.
Is this about the …? If so, good callout - although, looking now, I think it might still be best as info, since: …
I'm guessing this is about the detect test? If so, I agree the test assertion contains parts of lifecycle (not pack) output, which is subject to change. However, this was an intentional decision/tradeoff so I could test the UX of the message as a whole.
I agree - I've filed GUS-W-12650345.
This wouldn't correctly interleave the stdout and stderr though (i.e. the line ordering would be different unless all the errors happened to be output after the stdout).
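A tiny made-up illustration of that ordering problem:

```rust
fn main() {
    // Pretend the user actually saw this interleaving during the build:
    //   Collecting requests            (stdout)
    //   WARNING: Retrying connection   (stderr)
    //   Installing collected packages  (stdout)
    let pack_stdout = "Collecting requests\nInstalling collected packages\n";
    let pack_stderr = "WARNING: Retrying connection\n";

    // Concatenating the captured streams pushes all stderr lines to the end,
    // so the combined output no longer matches what the user saw.
    let combined = format!("{pack_stdout}{pack_stderr}");
    println!("{combined}");
}
```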
The git test above is testing the success path. The bad requirements test is testing the non-zero exit status path. Removing the latter would mean the non-zero exit status (and resultant error message) was not tested anywhere in the buildpack.
This is an intentional decision to ensure the UX of the full output can be seen/reviewed/tested. Yes, the tests will need updating more often, but that is a price I'll happily pay. I'm not going to enforce that style on other languages' repos if they don't want it, but it's something I want to do for Python. This is also something that came up in one of our team meetings in the last 2-3 weeks - maybe you were away then? I gave several classic buildpack examples then of why this kind of thing can be beneficial.
This is exactly the type of change I want to catch! If the logging style changes (for example the "Error:" prefix in the error headings changes), we should check our messages make sense with the new style/helper wordings etc.
I'm aware there's overlap. The point of the multiple-cases test is to test the UX of the more complicated error message, since it's multi-line and concatenated in a different way.
I have no idea what test these are referring to, given all the links that were pasted are the same URL. In general though, I put a lot of thought (multiple days) into exactly which integration tests are run, to get as much coverage as possible, with as little overlap as possible. I've definitely erred slightly more on the side of coverage than no overlap; however, this is not by accident.
I agree short CI end-to-end times are important, which is why I've put a lot of time into making them quick for both the classic buildpack and CNBs. This is why my classic buildpack integration tests are currently the fastest of any of the classic language buildpacks (…).
I think you might not be taking into account cold/warm cache, and also GitHub Actions queueing time. For example, this Ruby CNB run is not the ~3 mins you mention, but instead ~6.5 mins, since it had a cold cache. And this Python CNB run is not the ~6 mins you mention, but instead ~3 mins, since it had a warm cache. So picking random runs and comparing them isn't going to work - it has to be like for like. (Note: caches are discarded whenever the Rust version changes, and also after 7 days of inactivity.)

If we want to be fair, we should (a) pick the fastest of several runs with a warm cache, and (b) look at the individual job time (not the workflow time, since that includes queuing time spent waiting for a worker to be available). Doing that I found: …
So the Python CNB integration tests are 3 seconds slower (2%) than the Ruby CNB's. However, the Python CNB runs 20 integration tests in that time, rather than the 3 for Ruby - so Python has much better coverage for basically the same CI duration. I see that Ruby doesn't test each major Ruby version (how do you know your S3 archives work, or that Bundler works with each major version you support?), along with several other features that aren't tested at all in integration tests (e.g. cache invalidation). As such, I'd say the Python integration tests are doing more than ok, and maybe it's the Ruby CNB integration tests that need some work to increase coverage :-P
Updated in #16.
Thanks for the consideration. I think this could be an area where a "debug logging" or "give me the firehose of information" option might be useful, though it's not supported by the CNB spec. We could informally add something like that to our buildpacks by convention in the future. One of the debugging techniques I use, if someone says "this used to work, but now it fails and nothing changed", is to "heroku run bash" into their machine (after getting permission) and re-run the exact command from the last deploy.
While you're the primary maintainer of the Python buildpack, it's code the team maintains as well. Not commenting feels to me like having to accept someone else's style. We don't really have these conversations anywhere else, and I honestly don't want to have a meeting for it - at least not yet. I would like to have a place to have these conversations. I proposed merging this in heroku/libcnb.rs#533, and we could add a chapter on styles and have discussions on PRs to that chapter.
Thanks!
So sorry. Though to be quite honest you're very good at divining my intentions. I now see the UI you're talking about, which I couldn't find last night. Sorry about that.
It's in …
I understand the desire. In … In my ideal world, I would get a warning to investigate differences in output format that can be dismissed without needing a code change. The more rigid the tests, the more toil it is to make minor changes. That being said, you're probably the most toil-resistant developer I've met.
I don't believe the buildpacks should be testing binaries. They didn't make them. I hatchet-test them once, when they're created: https://github.com/heroku/docker-heroku-ruby-builder/blob/efecb7ebdd850dab93bcdb44674ee6c40e5eae28/Rakefile#L126-L153. Then I trust they will continue to work. My tooling prevents me from overwriting an existing binary.
Happy to be wrong here. Thanks for investigating/clarifying. I pulled the last CI runs out of convenience, not malice :)
I'm not sure what common cases should be tested. Happy to hear opinions. In general I'm most wary of caching and the provide/require interface, so that's what I've focused on. I've got some GUS backlog work items (if you can believe it) for some tests in classic that aren't represented in the CNB, but they're all edge cases that I'm not sure need to be tested, due to guarantees both Rust and the CNB spec give us.
I wasn't actually done with my review on #3. I threw out my back and wasn't able to get to it on Friday. Here's the rest of it:
I would like to see the command logged to the end user. Above, it's hard-coded to "Running pip install". I would like the logged command to be exactly what we run for the user, so that if they copy and paste it (and of course modify any paths) they would get the exact same result. Since it also looks like the results rely on the `PYTHONUSERBASE` env var, I would like to see that in the output as well. In addition to local debugging, it can also help buildpack maintainers spot accidental discrepancies.
https://github.com/heroku/buildpacks-python/pull/3/files#diff-e1cef540ad7956e8fb544ccfa9d7eef4517a3b12e6fa892b991b078ec8c9bdffR74 - naming nit. The function streams output to the user but is named `run_command`. Maybe `stream_command`?

https://github.com/heroku/buildpacks-python/pull/3/files#diff-ebe18e82a4ab770943fb39f7b7fec5a08762263112e0adb4e5d3727d2fd3d85bR109 - Same as above. We're streaming this command, but not announcing the (exact) command that's being streamed. If we don't want to announce this specific command (since it seems it's a helper command rather than something a user might expect to be run against their code), then perhaps we move to only emitting the command string in the error message. I want the exact command run to be in the logs or the error (or both).
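One rough sketch of how that could look (hypothetical helper name, arguments, and layer path; the real buildpack may format this differently): render the `Command` back into a display string before running it, and echo relevant env vars such as `PYTHONUSERBASE` alongside it.

```rust
use std::process::Command;

// Hypothetical helper: render a Command as the approximate shell invocation,
// so it can be echoed in the build output or included in an error message.
fn command_to_string(command: &Command) -> String {
    std::iter::once(command.get_program())
        .chain(command.get_args())
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    // Illustrative arguments and layer path only - not the buildpack's real invocation.
    let user_base = "/layers/python/dependencies";
    let mut command = Command::new("pip");
    command
        .args(["install", "--user", "--no-input", "-r", "requirements.txt"])
        .env("PYTHONUSERBASE", user_base);

    // In the buildpack this would go through its logging helpers rather than println!.
    println!(
        "Running: PYTHONUSERBASE={} {}",
        user_base,
        command_to_string(&command)
    );
}
```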
Function name is `log_io_error`. The function body says "unexpected error", which might not always be true. Or rather, if we're saying "unexpected error", I'm reading that as synonymous with "not my fault" if I'm a customer reading it. I imagine someone copying and pasting that function without looking too closely, thinking it's for handling all IO errors. For example, a new file format is added and reading it generates a `std::io::Error` due to a permissions problem or a bad symlink the customer checked in. In that case it wouldn't be so unexpected. Rename for clarity? `unexpected_io_error`?

Mentioned in Colin's PR. I would like to avoid early returns when there's only two branches. We can if/else this and eliminate the early return.
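To make the style question concrete, here's a contrived example (not code from the PR) of the same two-branch logic written both ways:

```rust
fn log_info(message: &str) {
    println!("{message}");
}

// Early-return style:
fn detect_early_return(has_python_files: bool) -> bool {
    if !has_python_files {
        log_info("No Python project files found.");
        return false;
    }
    log_info("Python project detected.");
    true
}

// Suggested if/else style - both branches sit at the same level:
fn detect_if_else(has_python_files: bool) -> bool {
    if has_python_files {
        log_info("Python project detected.");
        true
    } else {
        log_info("No Python project files found.");
        false
    }
}

fn main() {
    assert_eq!(detect_early_return(true), detect_if_else(true));
    assert_eq!(detect_early_return(false), detect_if_else(false));
}
```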
I think we should `log_warning` instead of `log_info`.

In a bit of a surprise even to me, I'm going to advocate for less testing. I think testing the build logic inside of `main.rs` should be enough here. One fewer Docker boot at CI test time.
Also we're effectively testing the output of pack here which is subject to change. If you do want to keep this test, I would scope it to only the strings you control.
Same as above. Doesn't need to be a docker test.
Regarding the error message (it's very good): when we say the user is missing files, it would be nice for us to give them an `ls` of what files we see in that directory. So at a glance they could see in one window that we're looking for "requirements.txt" but they have "REQUIREMENTS.txt" (or something).
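A minimal sketch of what that could look like (hypothetical function name; in the buildpack the result would be appended to the error message rather than printed):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical sketch: collect the file names we actually see in the app
// directory, so the "missing requirements.txt" error can show them to the user.
fn visible_files(app_dir: &Path) -> io::Result<Vec<String>> {
    let mut names: Vec<String> = fs::read_dir(app_dir)?
        .filter_map(Result::ok)
        .map(|entry| entry.file_name().to_string_lossy().into_owned())
        .collect();
    names.sort();
    Ok(names)
}

fn main() -> io::Result<()> {
    let files = visible_files(Path::new("."))?;
    // Appended to the error message, so a "REQUIREMENTS.txt" vs
    // "requirements.txt" mix-up is visible at a glance.
    println!("Files found in the app directory: {}", files.join(", "));
    Ok(())
}
```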
Comment: You could manually combine the streams yourself: `let output = format!("{}{}", context.pack_stdout, context.pack_stderr)`. I'm assuming the goal is to get all of the pack output on test failure.

Also, I think you could get rid of this test. Essentially it's testing that a bad requirements.txt file triggers a non-zero `pip install`. The git test above it seems more useful as an integration test.

Testing caching 👍. We can make these less brittle by asserting only the individual lines (or even just parts of lines). I think asserting for "Using cached Python" and "Using cached pip" without the version numbers would be enough to convince me. Maybe a "Using cached typing_extensions" for good measure. All the other values and numbers will cause churn on this file and possibly failures on otherwise unrelated changes (if libherokubuildpack updates its logging style, for example).
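For example, the cache-reuse assertions could be reduced to something like this sketch (assuming the test has the combined pack output as a `String`; the fragments and sample output below are illustrative):

```rust
// Assert only the stable fragments of the cache-reuse output, so version bumps
// or logging-style changes elsewhere don't fail the test.
fn assert_cache_reused(output: &str) {
    assert!(output.contains("Using cached Python"), "output was: {output}");
    assert!(output.contains("Using cached pip"), "output was: {output}");
    assert!(
        output.contains("Using cached typing_extensions"),
        "output was: {output}"
    );
}

fn main() {
    // Made-up sample output with version numbers that the assertions ignore.
    let output = "Using cached Python 3.11.1\nUsing cached pip 23.0.1\nUsing cached typing_extensions 4.4.0\n";
    assert_cache_reused(output);
}
```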
That comment applies to all integration tests. They're really nice and easy to review in this format, but I don't want to have to update 10 files every time I add an Oxford comma (for example).
If we know that one change invalidates things, I would bet multiple changes would as well. I think this is covered in your unit tests ✂️
Unit test should be okay.
Comment: This is a good idea
Unit test should be fine.
Time to test
Not that it's a race, but time to run CI pretty much only ever goes up, and integration tests are historically one of the last things developers are willing to delete. Right now Python is ~6 min for integration tests, while Ruby is ~3 min.
Ideally I would like to aim for <5 min for CI completion, with a maximum of around 10 min. Once you hit 10 min and a random glitch causes the tests to need to be re-run, you're pushing nearly half an hour for a single change, and it absolutely kills (my) productivity.
I think we should be aggressive in testing and safety. I also think we should consider pruning some of these tests now, as this is otherwise the fastest this CI suite will ever execute.