conda-forge::python install not working #626

Open
kchaung-lilly opened this issue Sep 6, 2024 · 13 comments
@kchaung-lilly

In the GUI, there is no version drop-down menu for conda-forge::python. Additionally, the only version offered, conda-forge::python=3.13.0rc1, causes the build to fail.

example build id: 0352622f5438358c_1

@pditommaso
Collaborator

Thanks for reporting. Tagging @ewels for visibility.

@ewels
Member

ewels commented Oct 1, 2024

Quoting discussion with @vladsavelyev :


This is a known issue with the Anaconda search API. Some packages are missing there, python being one of them: https://api.anaconda.org/search?name=python&organisation=conda-forge

The Anaconda website search bar, however, works, https://anaconda.org/search?q=python, so we put in a workaround to additionally scrape it. Unfortunately, it returns only the most recent version for a package, which is what you're seeing here.

I reported this problem to anaconda, their response:

Thank you so much for contacting us, we are currently in the process of enhancing our API search feature to ensure it meets the highest standards of performance and accuracy.

We appreciate your patience as we work on improvements, and we look forward to providing you with a more robust experience soon.

In the meantime the results may not be accurate, please let us know if you need anything else.


I'd like to add that the Wave CLI should work fine. We may look into adding a free-text input to the web UI to allow users to request arbitrary package names without needing to invoke the Anaconda search APIs. I'm not sure that there's much more that we can do here for now I'm afraid.

@stevekm

stevekm commented Oct 1, 2024

Doesn't the Wave CLI require the use of a Seqera Platform user account token? That is one of the main reasons I have avoided using it for custom container builds. Also because I am not sure how you are meant to search for available packages with it, so I don't know what versions of packages exist (or even the packages themselves).

@pditommaso
Collaborator

pditommaso commented Oct 1, 2024

Wave does work without a Seqera Platform user account token. *However*, to push to your own repository, the Platform token is needed to authorise the push of the image.

@ewels
Member

ewels commented Oct 2, 2024

As Paolo says, a token is not needed to use the Wave CLI to get an image on Seqera Containers. But you also can't push custom container builds to Seqera Containers. You can get temporary custom container builds from Wave though. Ok, this is getting complicated. Here's a list:

  • ✅ Wave CLI to get a new Seqera Containers image from conda, no auth
  • ✅ Wave CLI to get a custom container from Dockerfile, ephemeral, no auth
  • ❌ Custom containers from Dockerfiles on Seqera Containers
  • ❌ Pushing container builds (frozen) to custom registries without auth

No-auth requests also have a lower rate limit. Does that all make sense?

Also because I am not sure how you are meant to search for available packages with it, so I don't know what versions of packages exist (or even the packages themselves)

Yeah if you're using the CLI then that's up to you. You have to find them yourself first, as you would normally with vanilla Conda. You can see package lists for Bioconda, conda-forge and search basically all conda channels here.

@ewels
Member

ewels commented Oct 3, 2024

@mahesh-panchal found what I guess is a similar story with Perl:

https://wave.seqera.io/view/builds/25c1e5978bf53020_3

channels:
- conda-forge
- bioconda
dependencies:
- bioconda::samtools=1.21
- conda-forge::perl=5.32.1.1

The following package could not be installed
└─ perl 5.32.1.1*  does not exist (perhaps a typo or a missing channel).
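
A failure like this could in principle be caught before a build is submitted, by checking the pinned version against the versions a channel actually lists. The following is only a sketch, not how Wave or Seqera Containers works: `parse_spec`, `pin_matches` and `check_pin` are hypothetical helper names, the version list passed in is illustrative, and the matching rule is a deliberate simplification of conda's real match-spec semantics.

```python
from typing import List, NamedTuple, Optional


class PackageSpec(NamedTuple):
    channel: Optional[str]
    name: str
    version: Optional[str]


def parse_spec(spec: str) -> PackageSpec:
    """Split 'channel::name=version' into parts; channel and version are optional."""
    channel, _, rest = spec.rpartition("::")
    name, _, version = rest.partition("=")
    return PackageSpec(channel or None, name, version or None)


def pin_matches(pin: str, version: str) -> bool:
    """Simplified assumption: a pin matches the identical version, or any
    version that extends it with a further dotted component."""
    return version == pin or version.startswith(pin + ".")


def check_pin(spec: str, available: List[str]) -> List[str]:
    """Return the available versions that satisfy the pin in `spec`."""
    parsed = parse_spec(spec)
    if parsed.version is None:
        return sorted(available)
    return [v for v in available if pin_matches(parsed.version, v)]


if __name__ == "__main__":
    # Illustrative version list, not fetched from any channel.
    listed = ["5.26.2", "5.32.1"]
    print(check_pin("conda-forge::perl=5.32.1.1", listed))
    print(check_pin("conda-forge::perl=5.32", listed))
```

An empty result for a pin would indicate that the version shown in the UI does not correspond to anything the solver can resolve, which is the situation the error above describes.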

@ewels
Member

ewels commented Oct 11, 2024

Hi all,

I did a bit of investigative work to see how conda search does it, as that does find the correct Python versions, so it clearly can't be using the Anaconda API.

I just got a license for Proxyman so that I could poke at the SSL requests coming from the command in the terminal. I was expecting to see some undocumented magic for what it's querying, but it's actually exceptionally simple. Running the command simply downloads all packages.

My local conda config looks like this:

$ conda config --show channels
channels:
  - conda-forge
  - bioconda
  - defaults

Then running conda search python hits the following endpoints:

No headers, no query, no POST body, nothing. Just a simple GET request to each (ok it has an If-None-Match header so maybe it has a local cache somewhere, but I imagine that these will be breaking the cache every few minutes / hours with new packages).
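
For reference, a conditional GET of this kind can be sketched with the Python standard library. This is an illustration of the `If-None-Match` mechanism only, not conda's actual code: the function names are hypothetical, and the URL used in the usage note is an assumption based on the standard `repodata.json` channel layout.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def build_conditional_request(url: str, etag: Optional[str]) -> urllib.request.Request:
    """Build a plain GET, adding If-None-Match when a cached ETag is known."""
    req = urllib.request.Request(url, method="GET")
    if etag:
        req.add_header("If-None-Match", etag)
    return req


def fetch_if_changed(url: str, etag: Optional[str]) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (body, new_etag), or (None, etag) if the server answers 304."""
    req = build_conditional_request(url, etag)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read(), resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: the cached copy is still current.
            return None, etag
        raise
```

Calling `fetch_if_changed("https://conda.anaconda.org/conda-forge/noarch/repodata.json", cached_etag)` would download the full file only when the server's copy differs from the cached one.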

These 4 requests sum to 222MB download, which is probably some of why it's so slow to run. Each is a huuuuge JSON file of presumably every package in the conda channel, which I guess is then searched locally.
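
The local search step can be sketched as follows, assuming the downloaded file follows the usual `repodata.json` layout, in which the `packages` and `packages.conda` keys map file names to records carrying `name` and `version` fields. The function name and the sample data are hypothetical, and this is not conda's own implementation.

```python
from typing import List

# Hypothetical, heavily truncated stand-in for a parsed repodata.json.
SAMPLE_REPODATA = {
    "packages": {
        "python-3.12.6-h0000000_0.tar.bz2": {"name": "python", "version": "3.12.6"},
    },
    "packages.conda": {
        "python-3.13.0-h206b6c5_100_cp313.conda": {"name": "python", "version": "3.13.0"},
        "python_abi-3.13-5_cp313.conda": {"name": "python_abi", "version": "3.13"},
    },
}


def versions_in_repodata(repodata: dict, name: str) -> List[str]:
    """Collect every distinct version of `name` from both package sections."""
    found = set()
    for section in ("packages", "packages.conda"):
        for record in repodata.get(section, {}).values():
            # Exact name match, so 'python' does not pick up 'python_abi'.
            if record.get("name") == name:
                found.add(record["version"])
    # Plain string sort; real version ordering needs a proper version parser.
    return sorted(found)


if __name__ == "__main__":
    print(versions_in_repodata(SAMPLE_REPODATA, "python"))
```

Because the whole index is scanned locally, every published version is visible, which is why this approach finds versions that the search API misses, at the cost of the large download.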

Will discuss and compare to the logic that we have back-end for the Seqera Containers search to see if we can learn anything from this technique.

@mahesh-panchal

See how rattler is doing it. It's behind Pixi, Mamba, py-Rattler.

@ewels
Member

ewels commented Oct 11, 2024

pixi search python
❯ pixi search python
Using channels: conda-forge

python h206b6c5_100_cp313
-------------------------

Name                python
Version             3.13.0
Build               h206b6c5_100_cp313
Size                12873785
License             Python-2.0
Subdir              osx-arm64
File Name           python-3.13.0-h206b6c5_100_cp313.conda
URL                 https://conda.anaconda.org/conda-forge/osx-arm64/python-3.13.0-h206b6c5_100_cp313.conda
MD5                 b09a725400f670179c355b975e2854cc
SHA256              a126d434dbe34ce188a46364966aeeb5a4c9c5a8547a3fec8aa095031e206c9a

Dependencies:
 - __osx >=11.0
 - bzip2 >=1.0.8,<2.0a0
 - libexpat >=2.6.3,<3.0a0
 - libffi >=3.4,<4.0a0
 - libmpdec >=4.0.0,<5.0a0
 - libsqlite >=3.46.1,<4.0a0
 - libzlib >=1.3.1,<2.0a0
 - ncurses >=6.5,<7.0a0
 - openssl >=3.3.2,<4.0a0
 - python_abi 3.13.* *_cp313
 - readline >=8.2,<9.0a0
 - tk >=8.6.13,<8.7.0a0
 - tzdata
 - xz >=5.2.6,<6.0a0

Much less network traffic - did a bunch of HEAD calls and downloaded two files, both compressed .zst files. But effectively the same repodata.json files by the look of it.

@pditommaso
Collaborator

They use a sharded index https://prefix.dev/blog/sharded_repodata.

@vladsavelyev

Interesting, wonder if we could switch to using the sharded index for conda-forge which the pixi team built at https://fast.prefix.dev/conda-forge.

@ewels
Member

ewels commented Oct 12, 2024

The problem is that, until Anaconda supports sharded indexes for all conda channels, we'd have to implement and maintain a dual method to work with and without sharding, depending on channels. Which sounds messy :/
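
To make the "dual method" concrete, here is a rough sketch of what per-channel dispatch might look like. Everything in it is an assumption for illustration: the mapping, the function name, and in particular the `repodata_shards.msgpack.zst` index file name, which is taken from the sharded repodata proposal rather than from anything confirmed in this thread. Only the `fast.prefix.dev` URL comes from the discussion above.

```python
from typing import Dict

# Channels assumed to offer a sharded index, with their base URLs.
SHARDED_BASE_URLS: Dict[str, str] = {
    "conda-forge": "https://fast.prefix.dev/conda-forge",
}

# Fallback host serving the classic monolithic repodata.json.
DEFAULT_BASE_URL = "https://conda.anaconda.org"


def index_url(channel: str, subdir: str) -> str:
    """Pick the sharded index where one is known, else plain repodata.json."""
    sharded = SHARDED_BASE_URLS.get(channel)
    if sharded is not None:
        # Assumed file name for the shard index.
        return f"{sharded}/{subdir}/repodata_shards.msgpack.zst"
    return f"{DEFAULT_BASE_URL}/{channel}/{subdir}/repodata.json"


if __name__ == "__main__":
    print(index_url("conda-forge", "linux-64"))
    print(index_url("bioconda", "noarch"))
```

The URL choice is the easy part. The two branches return different formats, so each would need its own download and parsing path, which is the maintenance burden described above.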

How difficult would it be to run the pixi search CLI on the server back end, @vladsavelyev, rather than reimplementing it ourselves?

@mahesh-panchal

mahesh-panchal commented Oct 16, 2024

Another one where the version isn't parsed correctly for rstudio:
https://wave.seqera.io/view/builds/bd-123dcc2296d1a143_1

Edit: Not sure what's going on there. rstudio is part of the r channel (defaults), but if you search in Seqera Containers it comes up.
