Plans for using Simple Repository API #62

Closed
2 of 5 tasks
ryanking13 opened this issue Apr 17, 2023 · 9 comments

@ryanking13
Member

ryanking13 commented Apr 17, 2023

I've recently started working on changing micropip to use the Simple API (PEP 503, PEP 691) instead of the legacy JSON API. I think switching to the Simple API will help people create alternative package registries that follow the standard, and we can also benefit from other PyPA tools.
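
To illustrate the difference between the two endpoints, here is a minimal sketch that builds both URLs for a project. The URL patterns are PyPI's public ones and the name normalization follows PEP 503; the helper names (`normalize`, `legacy_json_url`, `simple_url`) are hypothetical and are not part of micropip.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def legacy_json_url(name: str, index: str = "https://pypi.org") -> str:
    """URL of the legacy JSON API for a project (hypothetical helper)."""
    return f"{index}/pypi/{name}/json"


def simple_url(name: str, index: str = "https://pypi.org/simple") -> str:
    """URL of the Simple API project page (hypothetical helper)."""
    return f"{index}/{normalize(name)}/"
```

Any registry that serves the Simple API layout could be targeted by passing a different `index` value.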

@ryanking13
Member Author

ryanking13 commented Apr 20, 2023

For now, this is blocked by pypi/warehouse#12214: some stale responses do not contain the CORS headers added in pypi/warehouse#13222.

@rth
Member

rth commented Apr 25, 2023

Thanks a lot for looking into this @ryanking13 !

Use devpi or pypi/warehouse to host wheels for test

See also the discussion in pyodide/pyodide#3049, though I haven't really made much progress on it, aside from determining that warehouse is not suitable for this purpose according to its devs: it's the code for PyPI and too complex to use for other applications.

@rth
Member

rth commented Apr 25, 2023

some stale responses does not contain CORS headers added in
Change micropip's default endpoint to pypi.org/simple

Would it be too much work to support both? I mean, it would be great to support other hosting solutions that could use the Simple API even independently from PyPI. For instance,

to name a few.

@ryanking13
Member Author

ryanking13 commented Apr 26, 2023

Would it be too much work to support both?

Do you mean supporting both JSON API and Simple API? I think it is not that hard, but I think most private hosting solutions would use Simple API (except for pypiserver), so I was thinking that it should be okay to support only simple API which is now a standard.

@rth
Member

rth commented Apr 26, 2023

I think it is not that hard, but I think most private hosting solutions would use Simple API (except for pypiserver), so I was thinking that it should be okay to support only simple API which is now a standard.

If that outdated cache issue has a workaround for PyPI sure we can only keep Simple API.

But so are we talking about HTML Simple API or Json Simple API? It's pretty horrible to have to parse HTML files to extract links to then parse other HTML files and parse more links. I mean maybe for native installers it doesn't matter, but on a web page every bit of overhead matters when loading the page.
So that's why if we can avoid HTML Simple API being the default it would probably be better. Although I do understand that if third-party services use it, we have to support it.

@ryanking13
Member Author

ryanking13 commented Apr 27, 2023

If that outdated cache issue has a workaround for PyPI sure we can only keep Simple API.

Right, if the cache issue is not resolved for a long time, we may need to provide a JSON API as a fallback... let me see how hard it would be to support both APIs.
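
A fallback between the two APIs could be sketched roughly as follows. This is only an illustration under assumptions: `fetch_with_fallback` and the idea of passing an ordered list of fetcher callables are hypothetical and do not reflect micropip's actual design.

```python
from typing import Callable, Sequence


def fetch_with_fallback(name: str, fetchers: Sequence[Callable[[str], dict]]) -> dict:
    """Try each fetcher in order and return the first successful result.

    Each fetcher is a hypothetical callable that takes a project name and
    either returns parsed project metadata or raises an exception.
    """
    errors = []
    for fetch in fetchers:
        try:
            return fetch(name)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(exc)
    raise LookupError(f"all index APIs failed for {name!r}: {errors}")
```

With this shape, a Simple API fetcher would go first in the list and a legacy JSON API fetcher second, so the fallback only runs when the preferred API fails.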

But so are we talking about HTML Simple API or Json Simple API?

Both. I found that there is already a good parser (https://github.com/brettcannon/mousebender) that can parse both the HTML and JSON responses. We can add an Accept: application/vnd.pypi.simple.v1+json header that tells the server we prefer a JSON response, but it is possible to handle an HTML response as well.
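
The content negotiation described above can be sketched with the standard library alone. This is a minimal illustration, not mousebender's API or micropip's implementation: `ACCEPT`, `parse_project_page`, and `_AnchorCollector` are hypothetical names, and the example only extracts file names and URLs. The media types and the `files` key come from PEP 691; the anchor-per-file HTML layout comes from PEP 503.

```python
import json
from html.parser import HTMLParser

JSON_TYPE = "application/vnd.pypi.simple.v1+json"
HTML_TYPES = ("application/vnd.pypi.simple.v1+html", "text/html")

# Prefer JSON, but tell the server that HTML is acceptable as a fallback.
ACCEPT = f"{JSON_TYPE}, application/vnd.pypi.simple.v1+html;q=0.2, text/html;q=0.01"


class _AnchorCollector(HTMLParser):
    """Collect (filename, url) pairs from the <a> tags of a PEP 503 page."""

    def __init__(self):
        super().__init__()
        self.files = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            filename = "".join(self._text).strip()
            self.files.append({"filename": filename, "url": self._href})
            self._href = None


def parse_project_page(body: str, content_type: str) -> list[dict]:
    """Return the files listed on a project page, for either response format."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == JSON_TYPE:
        files = json.loads(body)["files"]
        return [{"filename": f["filename"], "url": f["url"]} for f in files]
    if media_type in HTML_TYPES:
        parser = _AnchorCollector()
        parser.feed(body)
        parser.close()
        return parser.files
    raise ValueError(f"unsupported content type: {content_type!r}")
```

A client would send `ACCEPT` as the `Accept` request header and pass the response body together with the `Content-Type` response header to `parse_project_page`.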

@ryanking13
Member Author

https://discuss.python.org/t/pep-658-is-now-live-on-pypi/26693

PEP 658 yay :)

@ryanking13
Member Author

ryanking13 commented May 17, 2023

It seems that PyPI's JSON-based Simple API (PEP 691) now returns CORS headers correctly, while the HTML-based Simple API (PEP 503) still doesn't. PyPI probably purged all cached JSON responses recently because of PEP 658.

test script:

import random
import time

import requests

# List of the most-downloaded PyPI packages over the last 30 days.
top_pypi_packages = "https://hugovk.github.io/top-pypi-packages/top-pypi-packages-30-days.min.json"

packages = requests.get(top_pypi_packages).json()
rows = packages["rows"]
# Check 100 randomly chosen packages (random.choices samples with replacement).
for idx, package in enumerate(random.choices(rows, k=100)):
    name = package["project"]

    # PEP 691: JSON-based Simple API
    resp = requests.get(f"https://pypi.org/simple/{name}/", headers={"Accept": "application/vnd.pypi.simple.v1+json"})
    headers = resp.headers

    # Make sure the request succeeded before inspecting its headers.
    assert resp.ok
    assert headers["Content-Type"] == "application/vnd.pypi.simple.v1+json"
    if headers.get("Access-Control-Allow-Origin") != "*":
        print(f"({idx}) Fail (json): {name}")

    # PEP 503: HTML-based Simple API
    resp = requests.get(f"https://pypi.org/simple/{name}/", headers={"Accept": "text/html"})
    headers = resp.headers

    assert resp.ok
    assert headers["Content-Type"] == "text/html"
    if headers.get("Access-Control-Allow-Origin") != "*":
        print(f"({idx}) Fail (html): {name}")

    # Pause between packages to avoid hammering PyPI.
    time.sleep(1)

Result:

(1) Fail (html): backoff
(2) Fail (html): better-exceptions
(3) Fail (html): pypdf2
(9) Fail (html): flask-swagger-ui
(23) Fail (html): sqlalchemy-mate
(25) Fail (html): scipy
(30) Fail (html): httpcore
(36) Fail (html): ngram
(37) Fail (html): ordered-set
(39) Fail (html): azure-mgmt-billing
(41) Fail (html): awscli-local
(43) Fail (html): azure-mgmt-datalake-analytics
(60) Fail (html): pamela
(65) Fail (html): ipympl
(78) Fail (html): pytzdata
(79) Fail (html): typish
(81) Fail (html): django-ckeditor
(85) Fail (html): pebble
(87) Fail (html): azure-common
(89) Fail (html): scandir
(91) Fail (html): oscrypto
(92) Fail (html): pydocstyle
(94) Fail (html): azure-mgmt-redhatopenshift
(95) Fail (html): cligj
(96) Fail (html): spacy-loggers

This is good news, and I think I can continue on #65, since we will avoid using the HTML API by default (though we will still need to support it and test it locally).

@ryanking13
Member Author

Closing as completed. I'll open a separate issue for PEP 658.
