Stop responding in a few days in wsgi mode #383

vooon · 2024-08-20T08:33:16Z

I've set up the OpenStack Placement API to run under Granian, but it stops responding after a day or two.
Unfortunately I don't see anything in the logs; the process looks alive and the port is listed in netstat.
But if I run curl -v http://192.168.50.91:18778, it just hangs.

I see the same problem with other components as well: Heat API, Barbican...
But Placement is very simple compared to the others: it does not use eventlet or RabbitMQ, just WebOb and SQLAlchemy.
So it is probably easier to debug.

I suppose the problem is somewhere in the WSGI handling, since the Skyline API server (which is written with FastAPI and so is deployed in ASGI mode) has been working without problems for months.


gi0baro · 2024-08-20T16:22:08Z

@vooon can you also provide the full granian parameters/config you're using?

vooon · 2024-08-20T16:43:49Z

@gi0baro it's similar to the other services, since they all use the same template; only the app factory differs, here placement.wsgi:init_application:

/usr/bin/granian /etc/granian/openstack_placement_api.py:application \
	--host 192.168.50.91 \
	--port 18778 \
	--interface wsgi \
	--workers 2 \
	--threads 4 \
	--log-level debug

gi0baro · 2024-08-22T10:45:49Z

My guess is that something is blocking the Python threads, so Granian runs out of worker threads to process requests (it could also be that the Rust runtime gets blocked, but I would expect connection refused errors or timeouts in that case).
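One way to check whether Python threads are stuck is to dump their stacks while the service is hung. The following is a minimal sketch using the standard library's faulthandler module; the helper name enable_stack_dumps and the choice of SIGUSR1 are assumptions for illustration, not something Granian provides, and whether that signal is free to use in a Granian worker has not been verified here. It is not available on Windows.

```python
import faulthandler
import signal
import sys


def enable_stack_dumps(signum=signal.SIGUSR1, file=sys.stderr):
    """Dump the traceback of every Python thread to `file` when `signum` arrives.

    Hypothetical helper: the name and the default signal are illustrative choices.
    """
    # all_threads=True prints every thread's stack, not only the thread
    # that happens to receive the signal.
    faulthandler.register(signum, file=file, all_threads=True)
```

The helper could be called once at import time in the wrapper module (here /etc/granian/openstack_placement_api.py). When the service hangs, sending the signal to a worker process (for example kill -USR1 with the worker's PID) should print where each Python thread is waiting, assuming the interpreter itself is still able to handle signals.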

I'd suggest configuring --backpressure: with your configuration you can end up with 512 threads per worker interacting with Python code, which I guess is just too much. Also, as per the documentation, you won't benefit at all from --threads 4, so I would just remove it.
In the end, I would change your run command to something like this (where N is the maximum Python concurrency you expect):

/usr/bin/granian /etc/granian/openstack_placement_api.py:application \
	--host 192.168.50.91 \
	--port 18778 \
	--interface wsgi \
	--workers 2 \
	--backpressure N \
	--log-level debug
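To illustrate what capping Python concurrency at N means, here is a conceptual sketch in plain Python. It is not Granian's implementation; the ConcurrencyLimiter class, slow_handler, and the numbers used are all hypothetical. It only shows the general idea of admitting at most N handlers at a time while the remaining requests wait.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyLimiter:
    """Allow at most `limit` handlers to run at once (illustrative only)."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0  # handlers currently running
        self.peak = 0    # highest concurrency observed

    def run(self, handler, *args):
        # Callers block here until one of the `limit` slots is free.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return handler(*args)
            finally:
                with self._lock:
                    self.active -= 1


def slow_handler(i):
    # Stand-in for a request handler that blocks, e.g. on a database call.
    time.sleep(0.01)
    return i


limiter = ConcurrencyLimiter(limit=4)  # plays the role of N
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(lambda i: limiter.run(slow_handler, i), range(32)))
```

Even though 16 threads submit work, limiter.peak never exceeds 4. A cap like this does not remove whatever is blocking the handlers, but it bounds how many threads can be stuck in Python code at once.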

gi0baro · 2024-09-04T10:29:30Z

Closing this as stale

gi0baro added the wsgi Issue related to WSGI protocol label Aug 20, 2024

gi0baro closed this as completed Sep 4, 2024
