Start  trial

    Start trial

      In this blog, Nishchay Kothari from Fujitsu's Center of Excellence team discusses what steps to take in case your system becomes read-only due to transaction wraparound.

      As a PostgreSQL DBA, we commonly hear and use the term "dead rows" and "bloat." In this blog post, I'd like to talk about these concepts as well as "transaction wraparound," one of the most dangerous situations in PostgreSQL. I'll also outline how to handle this problem with an example.

      Enterprise-grade database systems with high traffic are bound to the risk of transaction wraparound. I will walk you through the steps to rectify, to help you manage PostgreSQL with confidence.

      Dead rows and bloat

      The old version of any row that is affected by an operation like UPDATE or DELETE in PostgreSQL is marked internally so that queries don't return that row. Therefore, these rows are referred to as "dead rows", and the space occupied by dead rows is called "bloat." A database program that often executes a lot of UPDATE/DELETE operations can quickly become very large and require sporadic maintenance work.

      Types of vacuuming

      Typically, vacuuming, a type of periodic database cleaning, is carried out in PostgreSQL. It comes in two different forms; one of them is plain VACUUM (without FULL), which merely frees up space for other uses. Plain vacuum does not incur exclusive locks, allowing it to run concurrently with regular reading and writing of the table. This type of vacuum operation will keep the additional space accessible for reuse inside the same table, rather than returning it to the operating system. The operating system receives the excess space back when using the VACUUM FULL form, which also rewrites the whole contents of the table onto a new disk file with no extra space. This form must be processed with an ACCESS EXCLUSIVE lock on each table, which makes it slower.

      How it works Vacuum Vacuum full
      Method Free up dead rows for reuse Rewrite the table with no dead rows
      Access Exclusive Lock No Yes
      Free space is made available for Inside the same table The Operating System

      Transaction wraparound

      Now, let’s learn about the transaction and transaction wraparound. Every row that is updated in the database in PostgreSQL receives a transaction ID (Txid) from the transaction control mechanism. These IDs regulate which rows are shown to other active transactions.

      The issue with transaction wraparound is Multi-Version Concurrency Control (MVCC). The ability to distinguish between two transactions based on their Txids is essential for MVCC. Txids in Postgres are just 32-bit integers. Accordingly, there are only roughly four billion (232) possible Txids.

      Four billion may seem excessive and difficult to achieve, but I must admit that for database systems with really high write-intensive workloads, four billion transactions are achievable in a matter of weeks.

      Therefore, PostgreSQL stops allowing WRITE operations and switches the database to READONLY mode if there are 2,000,000,000 unvacuumed transactions.

      Any one or more of the following conditions could contribute to transaction ID wraparound.

      1. Autovacuum is set to turned off
      2. Long-running transactions
      3. Heavy DML operations forcing the cancellation of autovacuum worker processes
      4. Many sessions or connections holding locks for very long durations.

      I was able to successfully simulate a transaction wraparound issue on my test machine. I'm getting the following entries in my PostgreSQL log files, which means that PostgreSQL has stopped accepting DML statements and has switched to READONLY mode.

      2022-09-16 08:43:38.265 +08 [74098] WARNING: database "postgres" must be vacuumed within 3000000 transactions (10184)
      2022-09-16 08:43:38.265 +08 HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
              You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
      2022-09-16 08:43:48.640 +08 [5764] ERROR: database is not accepting commands to avoid wraparound data loss in database "postgres" (10182)
      2022-09-16 08:43:48.640 +08 [5764] HINT: Stop the postmaster and vacuum that database in single-user mode.
              You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

      Therefore, I have established a connection to the issue database "postgres" and run some SELECT & CREATE commands to make sure.

      [fsepuser@primary log]$ psql
      psql (14.0)
      Type "help" for help.

      postgres=# select datname from pg_database;
        datname
      ----------
      postgres
      templatel
      template0
      (3 rows)

      postgres=# create table test (id numeric);
      ERROR: database is not accepting commands to avoid wraparound data loss in database "postgres" (10182)
      HINT: Stop the postmaster and vacuum that database in single-user mode.
      You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
      postgres=#
      postgres=# select usename from pg_user;
      usename
      ----------
      fsepuser
      (1 row)

      postgres=#

      The database "Postgres" is currently confirmed to be in READONLY mode because of transaction wraparound.

      How to fix transaction wraparound

      Let me now walk you through the methods to remedy this scenario. We must first take the database offline, connect in single-user mode, and then execute VACUUM FULL on each database. However, we must first complete the steps listed below.

      1. First of all, we need to find the name of table which is responsible for wraparound by using the query below.

        SELECT c.relnamespace::regnamespace as schema_name, c.relname as table_name,
        greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age,
        2^31-1000000-greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as remaining
        FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
        WHERE c.relkind IN ('r', 'm') ORDER BY 4;

      2. Dropping the table is the only option to fix the issue if you discover any tables in the output of the above query that are part of the 'pg_temp_' schema, as PostgreSQL does not allow you to VACUUM temporary tables generated in other sessions.
      3. Another scenario is that PostgreSQL is unable to freeze any transaction IDs that were established after the oldest presently active transaction began. This is due to the way the MVCC operates. Transactions may occasionally get so old that VACUUM is unable to clean them up for the entire 2 billion transaction ID wraparound limit, which results in the system no longer accepting new DML.

        2022-09-16 05:49:07.514 +08 [64931] WARNING: oldest xmin is far in the past (11833)
        2022-09-16 05:49:07.514 +08 [64931] HINT: Close open transactions soon to avoid wraparound problems.
        You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

      4. You have probably noticed that all the screenshots' HINTs tell you to COMMIT or ROLLBACK previously created prepared transactions. Therefore, using the query below, we must check for orphan or outdated prepared transactions.

        SELECT age(transaction),* FROM pg_prepared_xacts ;

        Using the gid from the above query, rollback any prepared transactions you find. The rollback query is below.

        ROLLBACK PREPARED gid;

      5. Once the above processes have been completed, we must bring the databases offline, connect in single-user mode, and conduct the VACUUM FULL option on each database.

        [fsepuser@primary log]$ pg_ctl -D /database/instl stop -mf
        waiting for server to shut down...... done
        server stopped
        [fsepuser@primary log]$

      6. Now, you need to start the database in single-user mode as below.

        [fsepuser@primary ~]$ /opt/fsepv14server64/bin/postgres --single -D /database/instl postgres
        2022-09-16 12:06:45.628 +08 [84585] WARNING: database with OID 14728 must be vacuumed within 3000000 transactions (10185)
        2022-09-16 12:06:45.628 +08 [84585] HINT: to avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

        PostgreSQL stand-alone backend 14.0
        backend>

      7. After successfully entering into single-user mode, we need to execute VACUUM FULL for each database.

        backend> Vacuum Full;
        2022-09-16 12:09:02.711 +08 [84585] WARNING: database "postgres" must be vacuumed within 3000000 transactions (10184)
        2022-09-16 12:09:02.711 +08 [84585] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
        2022-09-16 12:09:03.860 +08 [84585] WARNING: database "postgres" must be vacuumed within 2999999 transactions (10184)
        2022-09-16 12:09:03.860 +08 [84585] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

        2022-09-16 12:09:08.052 +08 [84585] WARNING: database "postgres" must be vacuumed within 3000000 transactions (10184)
        2022-09-16 12:09:08.052 +08 [84585] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
        2022-09-16 12:09:08.058 +08 [84585] WARNING: database "postgres" must be vacuumed within 3000000 transactions (10184)
        2022-09-16 12:09:08.058 +08 [84585] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
        2022-09-16 12:09:08.065 +08 [84585] WARNING: database "template1" must be vacuumed within 3000000 transactions (10184)
        2022-09-16 12:09:08.065 +08 [84585] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

      8. Now, if you can see that PostgreSQL asking to do the same VACUUM FULL for the template1 database as well. Let’s do the same for template1.

        [fsepuser@primary ~]$ /opt/fsepv14server64/bin/postgres --single -D /database/instl template1
        2022-09-16 12:14:14.897 +08 [84775] WARNING: database with OID 1 must be vacuumed within 2999860 transactions (10185)
        2022-09-16 12:14:14.897 +08 [84775] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

        PostgreSQL stand-alone backend 14.0
        backend> vacuum full; 2022-09-16 12:14:24.965 +08 [84775] WARNING: database "template1" must be vacuumed within 2999860 transactions (10184)
        2022-09-16 12:14:24.965 +08 [84775] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
        2022-09-16 12:14:25.026 +08 [84775] WARNING: database "template1" must be vacuumed within 2999859 transactions (10184)
        2022-09-16 12:14:25.026 +08 [84775] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

      9. Now it’s asking to execute same operation on template0.

        2022-09-16 12:14:25.696 +08 [84775] WARNING: database "template1" must be vacuumed within 2999795 transactions (10184)
        2022-09-16 12:14:25.696 +08 [84775] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
        2022-09-16 12:14:25.703 +08 [84775] WARNING: database "template0" must be vacuumed within 2999794 transactions (10184)
        2022-09-16 12:14:25.703 +08 [84775] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

      10. Let’s do the same for template0.

        [fsepuser@primary ~]$ /opt/fsepv14server64/bin/postgres --single -D /database/instl template0
        2022-09-16 12:16:18.247 +08 [84863] WARNING: database with OID 14727 must be vacuumed within 2999794 transactions (10185)
        2022-09-16 12:16:18.247 +08 [84863] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

        PostgreSQL stand-alone backend 14.0
        backend> Vacuum Full; 2022-09-16 12:16:36.346 +08 [84863] WARNING: database "template0" must be vacuumed within 2999794 transactions (10184)
        2022-09-16 12:16:36.346 +08 [84863] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

        2022-09-16 12:16:37.036 +08 [84863] WARNING: database "template0" must be vacuumed within 2999730 transactions (10184)
        2022-09-16 12:16:37.036 +08 [84863] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
        2022-09-16 12:16:37.042 +08 [84863] WARNING: database "template0" must be vacuumed within 2999729 transactions (10184)
        2022-09-16 12:16:37.042 +08 [84863] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
                You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

        If you noticed, PostgreSQL is no longer requesting to perform VACUUM FULL on any additional databases because I only have three databases (postgres, template1, and template0), and I have successfully finished VACUUM FULL on each of them.
        After that we can start the PostgreSQL as below.

        [fsepuser@primary ~]$ pg_ctl -D /database/instl start
        waiting for server to start....2022-09-16 04:19:11.580 GMT [84987] WARNING: The license will expire in 66 days.
        2022-09-16 12:19:11.594 +08 [84987] LOG: redirecting log output to logging collector process
        2022-09-16 12:19:11.594 +08 [84987] HINT: Allure log output will appear in directory 'log'.
        done
        server started
        [fsepuser@primary ~]$ psql
        psql (14.0)
        Type "help" for help. postgres=#

        Now, let’s confirm if the problem is fixed or not by executing DML statements.

        [fsepuser@primary ~]$ psql
        psql (14.0)
        Type "help" for help.

        postgres=# select usename from pg_user;
         useename
        ----------
         fsepuser
        (1 row)

        postgres=# create table test (id numeric);
        CREATE TABLE
        postgres=#
        postgres=# insert into test values (1);
        INSERT 0 1
        postgres=# select * from test ;
        id ----   1
        (1 row) postgres=#

        As you can see that the issue is fixed now and PostgreSQL able to perform DML operations effectively like before the issue.

      How to avoid transaction wraparound

      In PostgreSQL, there are various important methods to use in order to prevent transaction wraparound. First and foremost, you must keep track of the age of your database's oldest transaction ID (XID). This is possible using the pg_class catalog table. You can determine if the XID is approaching the wraparound limit by keeping track of its age.

      Regular maintenance procedures, such as vacuuming and database analyzing, should be carried out in order to prevent transaction wraparound. Vacuuming helps to recover space that had been occupied by obsolete or deleted tuples, minimizing bloat and preventing transaction IDs from reaching the wraparound threshold. Analyzing updates the optimizer statistics, ensuring that query plans are correct.

      Additionally, consider modifying the autovacuum parameters to some optimal settings in your PostgreSQL configuration. The autovacuum processes are controlled by those settings and helping in controlling transaction IDs.

      Another preventive method is to configure a monitoring system to deliver notifications when the XID age reaches a certain limit. This allows you to respond quickly and prevents you from approaching the wraparound limit.

      Conclusion

      Although it is ideal to prevent transaction wraparound from occurring by setting up regular maintenance and analysis of your system, unexpected events or new requirements may arise as your business grow and evolve. With the example above, I illustrated how you can clean up your PostgreSQL system in case transaction wraparound occurs. I hope now you will not be afraid of transaction wraparound, knowing that you are able to fix it effectively.

      Fujitsu Enterprise Postgres
      leverages and extends the strength and reliability of PostgreSQL with additional enterprise features.
      Compare the list of features.
      We also have a series of technical articles for PostgreSQL enthusiasts of all stripes, with tips and how-to's.
      Explore PostgreSQL Insider >
      Subscribe to be notified of future blog posts
      If you would like to be notified of my next blog posts and other PostgreSQL-related articles, fill the form here.

      Topics: PostgreSQL, PostgreSQL support, Fujitsu Enterprise Postgres

      Receive our blog

      Search by topic

      see all >

      Read our latest blogs

      Read our most recent articles regarding all aspects of PostgreSQL and Fujitsu Enterprise Postgres.

      Receive our blog

      Fill the form to receive notifications of future posts

      Search by topic

      see all >