Suggestion: provide doc about DB config for read performance #195

Open
PikachuEXE opened this issue Aug 21, 2024 · 6 comments

@PikachuEXE

If users don't know how to improve read performance via configuration (beyond the gem's own config), it's hard for them to even try it out.

I'm using PostgreSQL, but there's no documentation for MySQL either (let's include docs for both?).

@andyatkinson
Member

andyatkinson commented Sep 18, 2024

@PikachuEXE I'm working on a basic blog post integrating Solid Cache with a Rails + Postgres app. What would you like to see on read performance?

For example, do you mean for solid_cache_entries, identifying any missing indexes? I did a quick check of the schema and indexes, but haven't yet studied the queries much.

Did you have other read operation slowness or optimizations in mind?

@PikachuEXE
Author

I think this gem is still quite new to people, and there are mainly two setups (at first):

  • the same DB instance for both application data and cache
  • a dedicated DB instance for the cache

Case (1) could be more difficult to "optimize" (I generally avoid doing that even with Redis), so I mainly expect some info for case (2): what kind of DB configuration (or which factors should be considered) would help improve read performance. Info on write performance is less important, but welcome.
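
For example, a hypothetical starting point for the dedicated-cache case could be a postgresql.conf excerpt like the one below. The parameter names are real Postgres settings, but the values are only illustrative and assume roughly an 8 GB instance on SSD storage, not a tuned recommendation.

shared_buffers = 2GB                # ~25% of RAM; keeps hot cache entries in memory
effective_cache_size = 6GB          # ~75% of RAM; nudges the planner toward index scans
random_page_cost = 1.1              # assumes SSD storage, making index lookups cheaper
work_mem = 16MB                     # per-operation sort/hash memory; cache lookups are simple, so modest is fine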

The schema and indexes shouldn't be a concern for users (they're provided by the gem and shouldn't differ between users).

I asked about this on Reddit before, and you can check some of the answers there for the kind of information I expect.
Though I wouldn't agree with using unlogged tables for solid_cache.

@AxelTheGerman

how to make the read performance faster

Why? Are there any performance issues out of the box? If it's really about tweaking and maximizing, then it might depend a whole lot on your individual setup, e.g. DB version, CPU/RAM/disk performance, number of instances, usage patterns, etc.

@andyatkinson
Member

@AxelTheGerman Agreed, it's best to hold off on database tuning until there are significant data and query volumes for the Solid Cache data. Even then, it will depend on the available system resources and how they're allocated, including memory, CPU, etc.

@PikachuEXE With that said, I've published the post I mentioned earlier, covering the basics of Solid Cache with PostgreSQL; it does get into some generic server instance tuning guidance, considering both read and write optimizations. I don't intend it to serve directly as the documentation this issue asks for, in part because it's specific to PostgreSQL, but I wanted to share it for you or anyone else interested in exploring what's possible with these kinds of optimizations. I'm happy to receive any feedback or questions.
https://andyatkinson.com/solid-cache-rails-postgresql

I'm eager to see how Solid Cache is adopted, and to learn about big deployments with large amounts of data and query volume in the cache.

@PikachuEXE
Author

@andyatkinson
Great article for beginners (as gem users) :)
Though there are some places with what seem to be mistakes (but I could be wrong).

As discussed earlier, we could disable write ahead logging (WAL) for the solid_cache_entries table to reduce arguably unnecessary write IO related to cache entries. If Postgres were to restart, this means solid_cache_entries will be truncated, so keep that in mind. Keep the table logged if you need to be guaranteed this data will exist following a restart.

From https://www.postgresql.org/docs/current/sql-createtable.html

If specified, the table is created as an unlogged table. Data written to unlogged tables is not written to the write-ahead log (see Chapter 30), which makes them considerably faster than ordinary tables. However, they are not crash-safe: an unlogged table is automatically truncated after a crash or unclean shutdown. The contents of an unlogged table are also not replicated to standby servers. Any indexes created on an unlogged table are automatically unlogged as well.

I think "If Postgres were to restart" is not clear enough to convey that this only happens after a crash or unclean shutdown (not a normal shutdown).

Keep the table logged if you need to be guaranteed this data will exist following a restart.
ALTER TABLE solid_cache_entries SET UNLOGGED;

Do you mean ALTER TABLE solid_cache_entries SET LOGGED; here?
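
For reference, a quick sketch for checking which mode the table is currently in and toggling it (relpersistence in pg_class is 'p' for a permanent/logged table and 'u' for an unlogged one):

SELECT relname, relpersistence FROM pg_class WHERE relname = 'solid_cache_entries';
ALTER TABLE solid_cache_entries SET UNLOGGED;  -- faster writes, but truncated after a crash or unclean shutdown
ALTER TABLE solid_cache_entries SET LOGGED;    -- back to the crash-safe default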

I think more possible optimizations could be mentioned, such as a longer checkpoint distance and synchronous_commit = off (from https://www.cybertec-postgresql.com/en/reduce-wal-by-increasing-checkpoint-distance/#increasing-checkpoint-distances).
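
For example, a hypothetical postgresql.conf excerpt with those settings; the values here are only illustrative, not tuned recommendations:

synchronous_commit = off            # commits return before the WAL flush; a crash can lose the most recent commits
max_wal_size = 4GB                  # allow more WAL between checkpoints, i.e. a longer checkpoint distance
checkpoint_timeout = 30min          # time-based checkpoints happen less often
checkpoint_completion_target = 0.9  # spread checkpoint I/O over most of the interval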

@andyatkinson
Member

Hi @PikachuEXE. I rushed this out a bit, so I appreciate you reading it and providing corrections and additions. I've incorporated your feedback and credited you at the end of the post. I'll take one more pass over it, but here's a summary of changes:

  • Clarified the wording showing how to switch a logged table (the default) to unlogged
  • Replaced the use of "restart", which was too broad and misleading, with the specific language from the docs: it's a "crash or unclean shutdown" event that causes an unlogged table to be truncated after restarting
  • Added a note about changing synchronous_commit for a specific transaction, from on to local, which only guarantees local durability and could improve the transactions-per-second rate (see the sketch below). What's nice about that as a generic recommendation is that it can be scoped to only the transactions that write to solid_cache_entries, vs. other settings that are system-wide, like checkpoint frequency. In general, I limited recommendations to ones for the specific Solid Cache table, or to specific transactions that write to or read from it.
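
A minimal sketch of what that per-transaction scoping could look like, assuming synchronous standbys are configured (so that local actually differs from the default on); the write itself is just a placeholder:

BEGIN;
SET LOCAL synchronous_commit = local;  -- applies to this transaction only; reverts at COMMIT/ROLLBACK
-- ... INSERT/UPDATE/DELETE against solid_cache_entries here ...
COMMIT;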

Thanks!
