
http/web.py performance issues #305

Closed
tleb opened this issue Jul 24, 2024 · 6 comments
tleb (Member) commented Jul 24, 2024

Issue for tracking http/web.py performance work.

On the official instance, a directory load (without caching) takes about 500ms. A small file load (without caching) takes about 750ms. That is a lot of time spent just reading data. I expect low-hanging fruit.

  1. See whether these sorts of timescales are also seen in a local environment (a timing sketch follows this list).
  2. Profile http/web.py to see where time is spent.
  3. Make decisions based on that.
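For step 1, something like the following could sample uncached response times against a local instance (a minimal sketch; the URL is a placeholder, not taken from this issue):

```python
import time
import urllib.request

# Placeholder URL: point it at a local Elixir instance and an uncached page.
URL = "http://localhost/linux/latest/source/Makefile"

def time_request(url: str) -> float:
    """Return wall-clock seconds for one full request, body included."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return time.perf_counter() - start

samples = [time_request(URL) for _ in range(5)]
print(f"min={min(samples) * 1000:.0f}ms "
      f"avg={sum(samples) / len(samples) * 1000:.0f}ms")
```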
@tleb tleb changed the title Performance issues http/web.py performance issues Jul 24, 2024
@tleb tleb added the bug label Jul 24, 2024
fstachura (Collaborator) commented
I decided to investigate on #304. I know this may be controversial, but a) the major culprits seem to be in code that didn't change much, and b) since that revision is split into more functions, the flamegraph contains more interesting information. It's also easier to choose what to measure manually, since the code is refactored into more functions.

Workstation: 16GB DDR4 (~50% free), NVMe, i5-8250U (most cores were idle).

I used two methods:

cProfile and flamegraphs

The Elixir modification that uses cProfile for profiling can be seen here.

Note that the /usr/local/elixir/profiles/ directory needs to be writable by www-data for this to work.

Flamegraphs can be created from prof files using flameprof and FlameGraph.
Example: flameprof --format=log 30931.prof | perl ~/apps/src/FlameGraph/flamegraph.pl > 30931.svg.
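The linked modification isn't reproduced here, but the core idea can be sketched as follows (a sketch, assuming the handle_request entry point mentioned below; the run_profiled wrapper name is illustrative):

```python
import cProfile
import os

PROFILE_DIR = "/usr/local/elixir/profiles/"  # must be writable by www-data

def run_profiled(handle_request, *args, **kwargs):
    """Run one CGI request under cProfile and dump a per-process .prof file."""
    profiler = cProfile.Profile()
    result = profiler.runcall(handle_request, *args, **kwargs)
    # One file per PID, e.g. 30931.prof, ready to feed into flameprof/FlameGraph.
    profiler.dump_stats(os.path.join(PROFILE_DIR, f"{os.getpid()}.prof"))
    return result
```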

I created three flamegraphs, one for the biggest identifier, one for a relatively big source file and one for a big directory tree:
https://fbstc.org/scrap/uboot-u32-ident.svg
https://fbstc.org/scrap/uboot-lib-kconfig.svg
https://fbstc.org/scrap/uboot-include-tree.svg

In all three, a lot happens before web.py - I think these are imports. In the source flamegraph, guess_lexer_for_filename takes the majority of the time spent in web.py.
Of course these are just three random flamegraph samples - not enough information to make a fully informed decision. Still, some patterns can be seen. I guess one could ask whether the flamegraphs would look similar on production.

My own profiler

I wrote my own simple profiler that logs the execution time of selected blocks and functions. The statistical methodology is probably dubious (results are averages of percentages of total request time), but I mainly needed this as a sanity check for the flamegraphs (some information in there was suspicious).

The Elixir version that uses this profiler can be seen here.

Results from a short, pseudo-random ("clicking around") request sample can be seen here.

Results are divided into categories based on route (and on file type in the case of the source route).
See the calls to measure_function and measure_block in web.py for the meaning of the measure point names.
Numbers represent the average percentage of total script execution time (measured from the first import to the end of handle_request) over all requests in that category.
The report was generated using analyze_simple_profiler_logs.py.
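The profiler itself is only linked above, but as a rough idea of what measure_block and measure_function could look like (a sketch under assumptions, not the actual implementation):

```python
import functools
import time
from contextlib import contextmanager

# Per-request accumulator: measure point name -> total seconds.
timings: dict[str, float] = {}

@contextmanager
def measure_block(name):
    """Time a named block: `with measure_block("lexer"): ...`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - start

def measure_function(func):
    """Decorator variant: time every call of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with measure_block(func.__qualname__):
            return func(*args, **kwargs)
    return wrapper
```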

Conclusions

We could of course dig deeper and make more accurate measurements, but I think the conclusions are clear and actionable:

  1. CGI forces us to redo all imports for each request. Not only is this redundant, the imports take a significant share of execution time. AFAIK this shouldn't be a problem with WSGI/uWSGI/ASGI because the architecture is different: the application is initialized once in a long-running process, which then just listens for and handles requests.
  2. guess_lexer_for_filename takes more time than the code formatting itself. It seems this is because Pygments imports lexers that don't match the filename until it finds the matching one (see the sketch after this list). Maybe this could be reported and fixed if we investigate a bit more. But again, imports are cached, so this shouldn't be a problem once we move to WSGI.
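A rough way to observe this cost (run in a fresh interpreter each time, since lexer modules imported during guessing stay cached in the process; the file name and code sample are arbitrary):

```python
import time
from pygments.lexers import get_lexer_for_filename, guess_lexer_for_filename

SAMPLE = 'config FOO\n\tbool "foo"\n\tdefault y\n'

def timed(label, fn):
    start = time.perf_counter()
    lexer = fn()
    print(f"{label}: {(time.perf_counter() - start) * 1000:.1f}ms -> {lexer.name}")

# guess_lexer_for_filename loads and scores every candidate lexer whose
# filename patterns match, triggering their module imports along the way;
# get_lexer_for_filename resolves a single lexer by filename. Only the
# first call in a process pays the import cost.
timed("guess_lexer_for_filename", lambda: guess_lexer_for_filename("Kconfig", SAMPLE))
timed("get_lexer_for_filename", lambda: get_lexer_for_filename("Kconfig"))
```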

tleb (Member, Author) commented Jul 24, 2024

Ah, so even guess_lexer_for_filename, which looks like slow code, is actually slow imports as well. Let's wait for the switch to WSGI, then we'll do a second round of profiling. Cool insight you found here, thanks!

tpetazzoni commented
Thanks for all the investigation. What is keeping us from switching over to WSGI?

fstachura (Collaborator) commented
What is keeping us from switching over to WSGI?

Elixir still uses global state, and we need to get rid of that first. Most, if not all, of it should be gone with the recent web.py refactorings.
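For context on why global state matters here, a minimal WSGI sketch (illustrative, not Elixir's actual code) showing where the one-time vs. per-request split falls:

```python
# Module-level imports and setup run once per long-lived worker process,
# unlike CGI where every request pays for them again.
import pygments  # noqa: F401  # heavy import, paid only at worker startup

def application(environ, start_response):
    # Per-request state must be local (or passed down), never global:
    # one worker serves many requests, so globals leak across them.
    path = environ.get("PATH_INFO", "/")
    body = f"requested: {path}\n".encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

Any WSGI server (e.g. gunicorn or uWSGI) can then be pointed at the `application` callable.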

tpetazzoni commented
Amazing! Then I'd say keep up the good work in this direction, so that we can move to WSGI in the near future.

tleb (Member, Author) commented Nov 8, 2024

I'd say WSGI and the follow-up perf improvements (00a160a, d946818, 7df3543) solved this issue. The prod server hasn't crashed since then, and response times are much lower than before (to the point that it can be hard to tell whether pages come from the cache or are generated live when using the website).
