Upgrading a PostgreSQL replication cluster has historically involved downtime or complex workarounds, especially when using physical replication. With PostgreSQL 17, the upgrade process becomes more flexible thanks to major improvements in logical replication. In my session, I talked about how these changes make it significantly easier to perform rolling upgrades with minimal or no service disruption, even in production environments.

Let's dive into the topic
Below, you'll find the slides from my presentation at PGConf India 2025.

How to upgrade replication clusters without downtime
Shlok Kumar Kyal
Application Developer
Fujitsu

Agenda
- Introduction
- Upgrading replication clusters until PostgreSQL 16
- New Logical Replication Features in PostgreSQL
- Upgrading replication clusters until PostgreSQL 17

Part 1
Introduction

Introduction – what is upgrade?
- In PostgreSQL, upgrade refers to the process of moving database to a newer version
- The PostgreSQL community releases a new major version every year, but supports only for five years
- Every major version adds a lot of features, tools, etc.
- Major releases of PostgreSQL cannot understand old data directory
- System catalog may be changed
- WAL format may be changed
- etc.

pg_upgrade – the way to upgrade your instance
- The built-in upgrade tool
- 7x faster than normal dump and restore
- Avoids reading data by SQL commands
- Assumptions of this tool
- System catalogs are changed for every releases
- Table file format is preserved

pg_upgrade - weaknesses
- Requires the instance to be stopped
- Replication slot cannot be migrated, for PG16 and earlier
- Breaks the streaming replication cluster
- Streaming replication requires major version of instances are the same

Understanding replication in PostgreSQL

Understanding replication in PostgreSQL
Streaming replication | Logical replication | |
Naming of instances | Primary/Standby | Publisher/Subscriber |
What content they send | Exact WAL records | Replication messages - extracted information from WAL |
Who initially synchronizes data | pg_basebackup | Done automatically |
Replication target | Whole of database cluster | Per database |
What downstream can do | Read-only queries | Both read and write queries |
Environments | OS and major versions must be same | Can be different |
Streaming replication: Backup purpose - Primary/standby becomes same state
Logical replication: Backup and other purposes - Nodes can be different state

Logical replication - Usage
- The publication must be defined on an upstream node
postgres=# CREATE PUBLICATION pub FOR ALL TABLES; CREATE PUBLICATION postgres=# SELECT * FROM pg_publication; oid | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot -------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------ 16396 | pub | 10 | t | t | t | t | t | f (1 row)
- Then a downstream node subscription subscribes to the publication
postgres=# CREATE SUBSCRIPTION sub CONNECTION 'user=postgres dbname=postgres port=5431' PUBLICATION pub; NOTICE: created replication slot "sub" on publisher CREATE SUBSCRIPTION postgres=# SELECT oid, subdbid, subname, subconninfo FROM pg_subscription; oid | subdbid | subname | subconninfo -------+---------+---------+----------------------------------------- 16402 | 5 | sub | user=postgres dbname=postgres port=5431 (1 row)

Logical replication - Replication slots
- Provides a way to ensure the instance does not remove WAL files
- Two types:
- Streaming replication slot
- Logical replication slot
- Logical slots contain an output plugin, used by logical decoding

Part 2
Upgrading replication clusters till PostgreSQL 16

Challenges of upgrading replication clusters
- Major releases can change the layout of the system catalogs
(addition of columns, changed column type, etc). - Major releases can change WAL records
(addition of new WAL record, modification of WAL record, etc) - Data files cannot be used by the upgraded instance
- Streaming replication clusters does not work after one of the instances is upgraded

Why use Logical Replication for upgrades?
- Logical replication works across major versions, so even if one of the instances (Publisher or Subscriber) is upgraded, logical replication can continue
- WAL format changes do not affect logical replication
- Continues to identify and replicate changes even after the upgrade
- It helps in reducing the downtime

Steps to upgrade streaming replication cluster (PostgreSQL 16)

Steps to upgrade streaming replication cluster | PostgreSQL 16
- We want to upgrade the cluster from PostgreSQL 13 to PostgreSQL 16
- Let's say primary is in node1 and standby is in node2
- Any concurrent activities are allowed, as much as possible
- Make sure wal_level=logical in primary

Steps to upgrade streaming replication cluster | PostgreSQL 16
Here we want to upgrade clusters from PostgreSQL 13 to PostgreSQL 16

Steps to upgrade streaming replication cluster | PostgreSQL 16
- Create replication slots for all databases
SELECT pg_create_logical_replication_slot…
- Create publications for all databases
CREATE PUBLICATION FOR ALL TABLES
- Promote to primary
$ pg_ctl –D node2 promote
- Advance replication slots
SELECT pg_replication_slot_advance…

Steps to upgrade streaming replication cluster | PostgreSQL 16
- Create subscriptions for all databases
CREATE SUBSCRIPTION subXXX…
- Stop and upgrade
$ pg_ctl –D node2 stop $ pg_upgrade –d node2 …
- Truncate all tables
TRUNCATE XXX …

Steps to upgrade streaming replication cluster | PostgreSQL 16
- Re-create subscriptions
DROP SUBSCRIPTION subXX… CREATE SUBSCRIPTION subXXX…
- Stop node 1
$ pg_ctl –D node1 stop

Steps to upgrade streaming replication cluster | PostgreSQL 16
- Run pg_basebackup
$ pg_basebackup –D node1_upgraded –R…
- Drop subscriptions
DROP SUBSCRIPTION subXX…

The problems with PostgreSQL 16 upgrade
- The logical replication slots must be re-created
- The replication slot LSN should be adjusted
- The subscription-related information will not be preserved
- The subscriptions should be dropped
- The table data should be truncated
- The subscriptions should be re-created, depending on the data size
- This process is complex and can be time-consuming

Steps to upgrade logical replication cluster (PostgreSQL 16)

Steps to upgrade logical replication cluster | PostgreSQL 16
Here we want to upgrade clusters from PostgreSQL 13 to PostgreSQL 16

Steps to upgrade logical replication cluster | PostgreSQL 16
- Disable all subscriptions
ALTER SUBSCRIPTION sub1_node1_node2 DISABLE
- Stop and upgrade
$ pg_ctl –D node1 stop $ pg_upgrade –d node1 …
- Re-create and advance replication slots
SELECT pg_create_logical_replication_slot… SELECT pg_replication_slot_advance…
- Stop and upgrade
$ pg_ctl –D node2 stop $ pg_upgrade –d node2 …

Steps to upgrade logical replication cluster | PostgreSQL 16
- Define tables that were created in node1 during upgrade
CREATE TABLE…
- Truncate all tables
TRUNCATE XXX…
- Re-create subscriptions
DROP SUBSCRIPTION sub1_node1_node2 CREATE SUBSCRIPTION sub1_node1_node2

The problems with PostgreSQL 16 upgrade
- The logical replication slots are not preserved
- The replication slots should be recreated after the upgrade
- The subscription-related information will not be preserved
- The subscriptions should be dropped
- The table data should be truncated
- The subscriptions should be re-created, depending on the data size

Part 3
New Logical Replication features in PostgreSQL 17

Migration of Logical Replication slots
- Logical replication slot info will be copied from the old cluster
- Slot information from the old cluster is fetched
- New logical replication slots are created on the new cluster
- After the upgrade, just the subscription connection strings should be updated to point to the updated Publisher instance
- Logical replication can continue seamlessly

Preserve subscription states during upgrade
- Previously, only the subscription metadata information was preserved
- Without the list of relations and their state, it's impossible to re-enable the subscriptions without missing some records
- Now the pg_subscription_rel information and replication origin will be preserved
- To preserve the subscription relations and replication origin, pg_dump restores the content of pg_subscription_rel from the old cluster and also updates the replication origin using new SQL APIs

pg_createsubscriber – a new tool in PostgreSQL 17
- Converts physical standby into logical Subscriber
- Confirms the standby is caught up at the certain point,
- Then defines subscriptions on the standby
- How? – introduces a new server application
- Must be executed on the standby server
$ pg_createsubscriber [option...] { -d | --database } dbname { -D | --pgdata } datadir { -P | --publisher-server } connstr
- Must be executed on the standby server

pg_createsubscriber - Motivation
- On primary:
CREATE PUBLICATION pub FOR ALL TABLES;
- To create a logical replica:
CREATE SUBSCRIPTION SUB CONNECTION …… WITH (COPY_DATA = ON);

Difficulties preparing the new Subscriber
- Takes a long time
- Initial synchronization runs COPY command, per table
- Estimated execution time is proportional to number of tables
- Requires additional disk resources
- Replication slots will be created while copying data
- Generated WAL files are preserved
- They may fill up disk - PANIC!

pg_createsubscriber - performance comparison
- Compares the elapsed time while synchronizing 10 tables
- Logical replication: elapsed time from CREATE SUBSCRIPTION to end of synchronization (4 tablesync workers run in parallel)
- pg_createsubscriber: elapsed time from setup of standby to end of pg_createsubscriber command execution time
wal_level = logical shared_buffers = 40GB max_worker_processes = 32 max_parallel_maintenance_workers = 24 max_parallel_workers = 32 synchronous_commit = off checkpoint_timeout = 1d max_wal_size = 24GB min_wal_size = 15GB autovacuum = off
$ cat /proc/meminfo | grep MemTotal MemTotal: 792237412 kB $ grep processor /proc/cpuinfo | wc -l 120

pg_createsubscriber - how it works
- Verify if this can be a Publisher
wal_level == logical max_walsender >= num of target DBs…
- Verify if this can be a Subscriber
max_worker_processes > num of targer DBs
- Create publications and replication slots for all tables
CREATE PUBLICATION FOR ALL TABLES SELECT pg_create_logical_replication_slot
- Set recovery_target_lsn and recovery_target_action
recovery_target_lsn = 'YYY' recovery_target_action = promote
- Restart standby
$ pg_ctl –D standby restart

pg_createsubscriber - how it works
- Create subscriptions
CREATE SUBSCRIPTION pg_createsubscriber_$oid_$pid… WITH (slot_name = XXX, copy_data = off, enabled = off);
- Update replication process
SELECT pg_replication_origin_advance
- Enable subscriptions
ALTER SUBSCRIPTION pg_createsubscriber_$oid_$pid ENABLE;

pg_createsubscriber - how it works
- Update system_identifier
$ pg_resetwal –D node1 … $ pg_resetwal –D node2 …

Part 4
Upgrading clusters from PostgreSQL 17 and onwards

Steps to upgrade streaming replication cluster | PostgreSQL 17
Here we want to upgrade clusters from PostgreSQL 17 to PostgreSQL 18+

Steps to upgrade streaming replication cluster | PostgreSQL 17
- Create logical replica
$ pg_createsubscriber –D node2 …

Steps to upgrade streaming replication cluster | PostgreSQL 17
- Stop node 2 and upgrade
$ pg_ctl –D node2 stop $ pg_upgrade –d node2 …

Steps to upgrade streaming replication cluster | PostgreSQL 17
- Stop node 1
$ pg_ctl –D node1 stop

Steps to upgrade streaming replication cluster | PostgreSQL 17
- Back up the cluster
$ pg_basebackup –D node1_upgraded –R …
- Drop subscriptions
DROP SUBSCRIPTION subXX …

Steps to upgrade logical replication cluster | PostgreSQL 17
Here we want to upgrade clusters from PostgreSQL 17 to PostgreSQL 18+

Steps to upgrade logical replication cluster | PostgreSQL 17
- Disable all subscriptions
ALTER SUBSCRIPTION sub1_node1_node2 DISABLE
- Stop node 1 and upgrade
$ pg_ctl –D node1 stop $ pg_upgrade –d node1 …
- Stop node 2 and upgrade
$ pg_ctl –D node2 stop $ pg_upgrade –d node2 …

Steps to upgrade logical replication cluster | PostgreSQL 17
- Define tables that were created in node1 during upgrade
CREATE TABLE…
- Enable and refresh subscriptions for all tables
ALTER SUBSCRIPTIONS sub_node1_node2 ENABLE ALTER SUBSCRIPTIONS sub_node1_node2 REFRESH PUBLICATION

Summary
- Upgrading replication clusters had many challenges
- Some features have been committed:
- Preserving logical replication slots information during upgrade
- Preserving subscription information during upgrade
- pg_createsubscriber has been added to convert the streaming replication cluster to a logical replication cluster
- Together, these features remove most downtime while upgrading the streaming replication cluster
- Logical replication clusters can be upgraded now without the need to copy the table data again

References
- pg_upgrade documentation
- Streaming Replication Protocol
https://www.postgresql.org/docs/current/protocol-replication.html
- Logical Replication
https://www.postgresql.org/docs/current/logical-replication.html

Conclusion
Upgrading PostgreSQL replication clusters, historically a complex process fraught with challenges such as re-creating logical replication slots and re-initializing subscriptions for major version changes, has seen significant advancements. While earlier versions up to PostgreSQL 16 required intricate manual steps that could lead to considerable downtime, the introduction of new logical replication features in PostgreSQL 17 and beyond has streamlined this critical operation.
Key improvements like the automatic migration of logical replication slots, preservation of subscription states during upgrades, and the powerful pg_createsubscriber tool now enable near-zero-downtime upgrades for streaming replication clusters and eliminate the need for redundant data copying in logical replication scenarios.
These innovations collectively ensure a more seamless, efficient, and robust upgrade path for PostgreSQL users, solidifying its reliability for mission-critical environments, ensuring continuous service, maintaining business continuity, enhancing user experience, safeguarding revenue, and protecting brand reputation.