I'm running pg_repack 1.4.7 on Postgres 14 hosted on Amazon RDS.
I see other issues related to autovacuum, but I don't think any of them are the same as this. Thanks for taking a look.
While pg_repack was repacking table A in my database the other night, all reads and writes to the table were failing. I looked at the running processes and saw that this query had been running for over half an hour:
postgres | 6881 | | autovacuum: VACUUM ANALYZE repack.table_36175838 (to prevent wraparound)
After I canceled that process with select pg_cancel_backend(6881);, reads and writes on the original table A were able to proceed.
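For context, this is roughly the kind of query I used to spot the autovacuum worker (just a sketch; the pg_stat_activity columns you select may vary):
-- Sketch: list long-running autovacuum workers (backend_type is available on Postgres 10+)
SELECT pid, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker'
ORDER BY runtime DESC;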
While investigating the outage after the fact, I saw this in the logs. I believe the "canceling autovacuum task" error was caused by me running pg_cancel_backend.
2023-11-08 09:37:27 UTC::@:[6881]:ERROR: canceling autovacuum task
2023-11-08 09:37:27 UTC::@:[6881]:CONTEXT: while scanning block 384003 of relation "repack.table_36175838"
automatic vacuum of table "cs1.repack.table_36175838"
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG: process 28599 acquired AccessExclusiveLock on relation 116814165 of database 16401 after 2233899.660 ms
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:CONTEXT: SQL statement "DROP TABLE IF EXISTS repack.table_36175838 CASCADE"
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:STATEMENT: SELECT repack.repack_drop($1, $2)
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG: duration: 2233917.744 ms execute <unnamed>: SELECT repack.repack_drop($1, $2)
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:DETAIL: parameters: $1 = '36175838', $2 = '4'
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG: duration: 2233917.741 ms plan:
Query Text: SELECT repack.repack_drop($1, $2)
Result (cost=0.00..0.01 rows=1 width=4) (actual rows=1 loops=1)
Buffers: shared hit=1016 read=15 dirtied=28
I/O Timings: read=8.343
Here's an example of a query not being able to run because of locks held by process 28599:
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:LOG: process 29360 still waiting for AccessShareLock on relation 36175838 of database 16401 after 1000.045 ms at character 1034
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:DETAIL: Process holding the lock: 28599. Wait queue: 28809, 10941, 29360.
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:STATEMENT: SELECT * FROM A WHERE (A.b IS NULL AND A.c = 602146)
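To see which sessions were waiting on which, I find something like this useful (a sketch built on the standard pg_blocking_pids() function, not anything pg_repack-specific):
-- Sketch: show waiting queries together with the pids blocking them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;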
Should pg_repack prevent autovacuum on the temporary table or automatically kill any processes that prevent it from acquiring a lock to delete the table? Maybe I'm doing something wrong. Any advice is much appreciated.
From your logs I see that the lock contention happened between repack.repack_drop() and the autovacuum worker, and that repack.repack_drop() blocks all other reads and writes while it waits.
At this stage pg_repack should already have swapped the original and temporary tables, so it isn't clear which client is reading from or writing to that temporary table. From my understanding, pg_repack itself shouldn't try to read or write the temporary table at this point.
Does another application try to read the temporary table?
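If it helps, one way to check would be something like this (just a sketch; adjust the relation name to your repack table):
-- Sketch: find sessions whose current query references the repack temporary table
SELECT pid, state, query
FROM pg_stat_activity
WHERE query ILIKE '%table_36175838%'
  AND pid <> pg_backend_pid();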
pg_repack 1.5.0 includes fix #298. It resolves a slightly different issue (previously pg_repack locked the original table for a long time), but it might help in your case.
Thanks for taking a look at this! I don't think that there is any application reading the temporary table, though I could be mistaken about that. Glad to hear that #298 might be helpful here.