We must replace uwsgi by something else #937
Comments
Before jumping into the context, the following 3 terms are very different despite having similar names:
Without going deeper into various utils, the top results that could be alternatives to uWSGI include:
Looking at Tutor, I found that gunicorn was used before uWSGI. The migration to uWSGI was done when upgrading to Koa (728ef96). I am not sure whether we can still revert to gunicorn in the current state. Compared with uWSGI, gunicorn is slow (there is a comparison at https://emptyhammock.com/projects/info/pyweb/gunicorn.html, although it might not be entirely accurate).
mod_wsgi is mainly an Apache module. There were some references to an nginx mod_wsgi, but I was not able to find any recent information on it.
CherryPy is both a web server and a minimalistic web framework. Waitress, on the other hand, is a WSGI-only server that runs on CPython on Unix and Windows and is also known to run on PyPy on Unix (Python 3.7+).
I will look into each item and explore any other alternatives as well.
|
I'm thinking that a possible alternative would be to revert back to gunicorn, but launch a separate container with a web server (such as caddy or nginx) to serve static assets. I hate the fact that we would have to launch a new container just for that (and for every other web app...), but it's the only solution that I'm seeing right now. |
An alternative that I'm looking at for our WSGI server uses is nginx-unit, but I haven't done any testing of that yet. |
Hi folks. I was pointed to this ticket after I posted a forums question about the performance implications of serving static assets with uWSGI.
I feel like throughput benchmarks aren't really that useful for us. Even the slowest thing there shows gunicorn with 4 worker processes giving a throughput of roughly 3200 requests per second, so ~800 req/s per worker, which means the overhead it's imposing for spawning a request and reading/writing 70K is something like 1.25 ms. I think that even our fastest web transactions run in the 30-50 ms range, with more common ones running 150+ ms, and a number of painful courseware calls running multiple seconds. Whether the WSGI server is imposing 1.25 ms of overhead or 0.4 ms of overhead won't really be noticeable in that context.
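To make the arithmetic above concrete, here is a rough sketch of the same back-of-the-envelope calculation. The function names are made up for illustration, and the 2000 ms figure is only an assumed stand-in for "multiple seconds"; the other numbers come from the paragraph above.

```python
def per_request_overhead_ms(total_rps: float, workers: int) -> float:
    """Approximate time one worker spends per request, in milliseconds.

    Assumes each worker handles requests one at a time, so a worker
    serving N req/s spends about 1000 / N ms on each request.
    """
    per_worker_rps = total_rps / workers
    return 1000.0 / per_worker_rps


def overhead_ratio(overhead_ms: float, transaction_ms: float) -> float:
    """Server overhead relative to the application's own transaction time."""
    return overhead_ms / transaction_ms


# Benchmark figure quoted above: ~3200 req/s across 4 gunicorn workers.
overhead = per_request_overhead_ms(3200, 4)

# Transaction times from the discussion: 30 ms (fastest), 150 ms (common),
# and 2000 ms (an assumed value for a slow courseware call).
for transaction_ms in (30, 150, 2000):
    ratio = overhead_ratio(overhead, transaction_ms)
    print(f"{transaction_ms:>5} ms transaction: overhead is {ratio:.2%} of it")
```

Under these assumptions the server overhead is a few percent of even the fastest transactions, which is why the choice between 1.25 ms and 0.4 ms is hard to notice in practice.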
I'm still really ignorant about Docker things. Would the idea be that we'd have one static asset server container running Caddy, and that there's some shared volume where each Django service writes its static files to a different sub-directory? Or a volume per service, with the Caddy static assets container having an entry for each?
My main hesitation with this is that gunicorn is so ubiquitous for serving Python apps that a lot of tooling will accommodate it out of the box. For instance, New Relic would give you things like worker availability, utilization, restart events, etc. I imagine many other APMs do the same. Gunicorn is also going to be more familiar to Python developers, and it is easier to find solutions for common problems via StackOverflow and the like. |
@ormsbee Hi, thanks for the context. I have not had time to continue this issue. There is some missing context for me which some experimentation (with gunicorn and uwsgi) will help provide. |
I'll add another long-term alternative to uwsgi: granian. It's implemented in Rust using the hyper library. Its selling points are highly consistent response times (much smaller 99th percentile deviations, because the networking I/O stack is on the Rust side), and the ability to have a single server do ASGI and WSGI (along with its own RSGI standard that it's trying to push). I don't think it's appropriate for us at this time. It's a single-developer effort, and it has no support for "restart the worker after X requests", which we unfortunately need because of memory leaks. I merely mention it as something to keep an eye on in the longer term. |
Do you have an idea of what is causing these memory leaks? Note: here's the upstream granian issue: emmett-framework/granian#34 |
The last time someone investigated this, a lot of it owed to circular references in the XBlock runtimes and modulestore. I haven't looked into it since the major runtime simplification or old mongo removal PRs landed. |
For the record, the uWSGI configuration that currently ships with Tutor for the LMS/CMS containers does not make use of the "restart the worker after X requests" flag (max-requests). Tutor users are not complaining, so... I guess we don't need it? |
I hope that means the problem has gotten a lot better. I don't know what value it's set to for 2U these days, but we definitely had issues with it in the past and that's why there's a slot for it in the configuration repo. I suppose it's also possible that it's an issue with gunicorn specifically that's causing this... In any case, FYI to @timmc-edx, @robrap, @dianakhuang in case this is of interest for how 2U deploys. |
Thanks @ormsbee. I think we still have this set for edxapp in Stage, Prod and Edge. I don't think we use it for any of our other workers. However, it's likely to remain this way, given the old "if it ain't broke, don't fix it", along with all of our other priorities. We've had too many recent incidents from changes (mostly DD related) that seem like they shouldn't cause any issues, but surprise surprise. |
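For reference, if gunicorn were used, the "restart the worker after X requests" behaviour discussed above maps to its `max_requests` setting, with `max_requests_jitter` to keep workers from all restarting at once. The following is only a sketch of a gunicorn config file (gunicorn config files are plain Python). Every value here is a hypothetical placeholder, not taken from Tutor or any 2U deployment.

```python
# gunicorn.conf.py: a minimal sketch with assumed, untuned values.

# Address and port to listen on (placeholder).
bind = "0.0.0.0:8000"

# Number of worker processes (placeholder; 4 matches the benchmark
# figure quoted earlier in the thread, nothing more).
workers = 4

# Restart a worker after it has handled this many requests, to bound
# the effect of slow memory leaks. 0 (gunicorn's default) disables it.
max_requests = 1000

# Add a random 0..N offset to max_requests for each worker so that
# workers do not all restart at the same moment.
max_requests_jitter = 100
```

It could then be loaded with something like `gunicorn -c gunicorn.conf.py myproject.wsgi:application`, where the module path is again a placeholder.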
FYI this issue was featured on episode 401 of the Python Bytes podcast: https://www.youtube.com/watch?v=XKI5gtnKMus |
What about uvicorn? I used it in some fastapi projects. It supports async and apparently, it's fast! |
My main concern with uvicorn is that it doesn't support WSGI. Its native WSGI implementation is deprecated, and they point you to a2wsgi instead. Maybe that will work smoothly, but running WSGI apps doesn't seem to be an area of focus for them, and we're probably going to be running in that mode for a while to come. |
uwsgi is now in maintenance mode: https://uwsgi-docs.readthedocs.io/en/latest/
So we should take the opportunity of the Quince release to remove uwsgi from the code base.