Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate CI dependency installation from pip to uv #3675

Merged
merged 5 commits into from
Mar 7, 2024
Merged

Migrate CI dependency installation from pip to uv #3675

merged 5 commits into from
Mar 7, 2024

Conversation

janosh
Copy link
Member

@janosh janosh commented Mar 7, 2024

hoping the migration to uv, besides being faster than pip, can also fix the CI import errors for matgl and pydantic that started 2 days ago. not optimistic about pydantic which shouldn't be imported by pymatgen or any of its deps anyway but maybe matgl.

related PR: CederGroupHub/chgnet#133

@janosh janosh added ci Continuous integration pkg Package health and distribution related stuff labels Mar 7, 2024
@janosh
Copy link
Member Author

janosh commented Mar 7, 2024

good news: looks like linux CI dependency install time went from 2.5 to 1 min (2.5x speed up) and windows install time went from 9-14 min to 3-5 min (~3x speed up)!

also, matgl import error was indeed automatically fixed by uv and remaining failing tests due to pydantic import error are skipped until materialsvirtuallab/matgl#238 is resolved.

pinging a few people in case they want to migrate their repos to uv as well for (hopefully) similar speedups.

@janosh janosh enabled auto-merge (squash) March 7, 2024 09:04
@janosh janosh added the performance Some functionality is too slow or regressed label Mar 7, 2024
@Andrew-S-Rosen
Copy link
Member

Andrew-S-Rosen commented Mar 7, 2024

Amazing! I have a few upstream issues in uv I need fixed before I can switch. Already tried. But I am hopeful it will be soon!

@DanielYang59
Copy link
Contributor

DanielYang59 commented Mar 7, 2024

Glad to help you with pymatviz. Would get my hands on that soon (well... another promise 😄 ).

Despite of the dramatic speed up, I'm quite curious why the original import issue in the first place because these two should function the same?

@janosh janosh merged commit a4fbeeb into master Mar 7, 2024
22 checks passed
@janosh janosh deleted the uv branch March 7, 2024 09:16
@janosh
Copy link
Member Author

janosh commented Mar 7, 2024

I'm quite curious why the original import issue in the first place because these two should function the same?

uv is meant as a drop-in replacement for pip, yes, but their module resolution works differently. i assume pip failed to install matgl due conflicting version reqs where uv succeeded. but just guessing, didn't look into it

@DanielYang59
Copy link
Contributor

i assume pip failed to install matgl due conflicting version reqs where uv succeeded. but just guessing, didn't look into it

Well...At lease from this run, matgl is successfully installed but still the weird importError. Not idea why.

Successfully installed ..... matgl-1.0.0 ....

@DanielYang59
Copy link
Contributor

A quick update @janosh , it seems uv timed out for some reason here.

Run uv pip install numpy cython --system
Resolved 2 packages in 98ms
Downloaded 2 packages in 419ms
Installed 2 packages in 17ms
 + cython==3.0.9
 + numpy==1.26.4
Built 1 editable in 28.18s
Resolved 194 packages in 2.55s
error: Failed to download distributions
  Caused by: Failed to fetch wheel: torch==2.2.1
  Caused by: Failed to extract source distribution
  Caused by: Failed to download distribution due to network timeout. Try increasing UV_HTTP_TIMEOUT (current value: 300s).

@janosh
Copy link
Member Author

janosh commented Mar 7, 2024

good to know, i'll keep an eye on it


# TODO remove next line installing ase from main branch until FrechetCellFilter is released
pip install git+https://gitlab.com/ase/ase
uv pip install -e '.[dev,optional]' --system
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janosh, sorry for digging out this. But why do we need to install "editable" for CI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't remember exactly. i'm fairly sure i tried non-editable first and it failed for some reason. but that was with an earlier uv and might have changed

Copy link
Contributor

@DanielYang59 DanielYang59 Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could confirm install "non-editable" indeed breaks the tests, and install editable works fine. The pip install process seems to run without any issue, but pytest seems unable to access test files. I don't quite understand this behavior though (thought the only difference being able to change source code).

An off-topic question: why do we need:

jobs:
test:
# prevent this action from running on forks
if: github.repository == 'materialsproject/pymatgen'

By default workflows would not run on forks until enabled by fork owner. Having this would prevent fork from running workflows, which seems unnecessary (some people might prefer to test on their own fork before pushing to upstream)? Or in my case I have to remove this line to run tests on my own fork.

Screenshot 2024-04-10 at 16 57 58

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

editable install basically just symlinks the repo into site-packages, meaning the whole repo contents including test files are symlinked. i think with non-editable install TEST_FILES_DIR points at a non-existent directory.

why do we need:

     if: github.repository == 'materialsproject/pymatgen' 

not sure if this changed but people used to get every CI failure email twice from GitHub, once on their fork and once on open PRs which was very annoying

Copy link
Contributor

@DanielYang59 DanielYang59 Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

editable install basically just symlinks the repo into site-packages, meaning the whole repo contents including test files are symlinked. i think with non-editable install TEST_FILES_DIR points at a non-existent directory.

That's interesting. I might experiment with this later. Because I have a feeling that we don't need install editable when we don't actually edit the source code. And I'm not sure if install editable would have any performance penalty. It's just guessing now but I would test later and let you know :)

not sure if this changed but people used to get every CI failure email twice from GitHub, once on their fork and once on open PRs which was very annoying

I think this is intended (one failure from their own fork, and one from upstream). They should disable workflow on their own fork if they don't want workflow to run on their fork, instead of we disable altogether for them? Otherwise people who want to run workflow on their fork for any reason would need to remove this line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm guessing if anything, --editable offers a performance gain given how fast it is to symlink

Otherwise people who want to run workflow on their fork for any reason would need to remove this line?

not sure that's an issue? people usually just fork to make PRs in which CI runs on this repo. if they want to maintain a fork long-term, removing this line is not much work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm guessing if anything, --editable offers a performance gain given how fast it is to symlink

Not sure, but I would test later and let you know.

people usually just fork to make PRs in which CI runs on this repo.

Yes. But my point is... By default workflow would NOT run on forks (I assume people would enable workflow on their fork only when they have a good reason to), such that people could easily toggle workflow on and off easily on their fork (and have more control over their fork).

But if we disable workflow for them, they can not easily enable it. (I guess click a button would be easier than changing a line and make a commit? Also if they want to to push their change to upstream, they would have to revert it.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous integration performance Some functionality is too slow or regressed pkg Package health and distribution related stuff
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants