
All target table reads & writes blocked because of autovacuum on repack temporary table #371

Open
schmave opened this issue Nov 9, 2023 · 3 comments

Comments

@schmave

schmave commented Nov 9, 2023

I'm running pg_repack 1.4.7 on Postgres 14 hosted on Amazon RDS.

I see other issues related to autovacuum, but I don't think any of them are the same as this. Thanks for taking a look.

While pg_repack was repacking a table A in my database the other night, all reads and writes to the table were failing. I looked at the running processes and saw that this query had been running for over half an hour:

postgres | 6881 | | autovacuum: VACUUM ANALYZE repack.table_36175838 (to prevent wraparound)

After I canceled that process with select pg_cancel_backend(6881);, reads and writes on the original table A were able to proceed.
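For anyone who hits the same situation, here is a sketch of how the stuck worker can be found and canceled. It uses only standard Postgres views and functions (`pg_stat_activity`, `pg_cancel_backend`); the pid 6881 is from my incident above.

```sql
-- List autovacuum workers and how long their transactions have been running.
SELECT pid, now() - xact_start AS runtime, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%'
ORDER BY xact_start;

-- Cancel the offending backend (6881 was the autovacuum pid in my case).
SELECT pg_cancel_backend(6881);
```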

In investigating the outage after the fact, I saw the following in the logs. I believe the "canceling autovacuum task" error was caused by my call to pg_cancel_backend.

2023-11-08 09:37:27 UTC::@:[6881]:ERROR:  canceling autovacuum task
2023-11-08 09:37:27 UTC::@:[6881]:CONTEXT:  while scanning block 384003 of relation "repack.table_36175838"
	automatic vacuum of table "cs1.repack.table_36175838"
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG:  process 28599 acquired AccessExclusiveLock on relation 116814165 of database 16401 after 2233899.660 ms
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:CONTEXT:  SQL statement "DROP TABLE IF EXISTS repack.table_36175838 CASCADE"
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:STATEMENT:  SELECT repack.repack_drop($1, $2)
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG:  duration: 2233917.744 ms  execute <unnamed>: SELECT repack.repack_drop($1, $2)
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:DETAIL:  parameters: $1 = '36175838', $2 = '4'
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG:  duration: 2233917.741 ms  plan:
	Query Text: SELECT repack.repack_drop($1, $2)
	Result  (cost=0.00..0.01 rows=1 width=4) (actual rows=1 loops=1)
	  Buffers: shared hit=1016 read=15 dirtied=28
	  I/O Timings: read=8.343

Here's an example of a query not being able to run because of locks held by process 28599:

2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:LOG:  process 29360 still waiting for AccessShareLock on relation 36175838 of database 16401 after 1000.045 ms at character 1034
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:DETAIL:  Process holding the lock: 28599. Wait queue: 28809, 10941, 29360.
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:STATEMENT:  SELECT * FROM A WHERE (A.b IS NULL AND A.c = 602146)
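A wait queue like the one in the DETAIL line above can also be inspected live. This is a hedged sketch using `pg_blocking_pids()` (standard in Postgres 9.6 and later), which resolves the blocking relationships without parsing the log:

```sql
-- For each waiting backend, show which pids are blocking it.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       state,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```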

Should pg_repack prevent autovacuum from running on the temporary table, or automatically kill any process that prevents it from acquiring the lock needed to drop that table? Maybe I'm doing something wrong. Any advice is much appreciated.
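One precaution I'm considering, though I'm not sure it's the right fix: since this was an anti-wraparound vacuum, the table's transaction-ID age can be checked before running pg_repack, so the repack isn't scheduled right when a forced VACUUM is due. This uses only standard catalog columns and settings; `'A'` stands in for the real table name:

```sql
-- A wraparound VACUUM is forced once age(relfrozenxid) exceeds
-- autovacuum_freeze_max_age (default 200 million transactions).
SELECT relname,
       age(relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
FROM pg_class
WHERE relname = 'A';  -- the table about to be repacked
```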

@za-arthur
Collaborator

From your logs I see that the lock contention happened between repack.repack_drop() and the autovacuum worker, and that repack.repack_drop() blocked all other reads and writes.

At this stage pg_repack should already have swapped the original and temporary tables, so it isn't clear which client was reading from or writing to the temporary table. As I understand it, pg_repack itself shouldn't read or write the temporary table at this point.

Is another application trying to read the temporary table?

@za-arthur
Collaborator

pg_repack 1.5.0 includes fix #298. It addresses a slightly different issue (previously pg_repack held the lock on the original table for a long time), but it might help in your case.

@schmave
Author

schmave commented Nov 29, 2023

Thanks for taking a look at this! I don't think any application was reading the temporary table, though I could be mistaken. Glad to hear that #298 might help here.
