
Background
Traditionally, setting up logical replication involves running the CREATE SUBSCRIPTION command, which initiates table synchronization (tablesync) processes using the COPY command. A separate tablesync process is launched for each table, and the number of tables synchronized in parallel is controlled by the GUC parameter max_sync_workers_per_subscription. You can check our blog post Logical Replication Tablesync Workers for more details.
For large databases, this initial synchronization phase can be resource-intensive and time-consuming due to increased WAL retention and the overhead of running multiple COPY operations.
To overcome these limitations, users can initially set up a physical replica using a base backup. Physical replication efficiently handles the initial data synchronization through WAL streaming, which is generally faster and more suitable for large datasets. Once the physical replica catches up with the primary, it can be converted into a logical replica to continue replicating incremental changes across PostgreSQL versions. However, converting a physical replica to a logical replica involves several manual steps, such as promoting the standby, creating PUBLICATION/SUBSCRIPTION pairs for each database, and correctly initializing replication progress to avoid duplicate changes. These steps are error-prone and time-consuming, making the overall process tedious.
The pg_createsubscriber tool was introduced to simplify and speed up the process of converting a physical replica to a logical replica.
What does the tool do?
When pg_createsubscriber is executed, it sets up logical replication for all tables in the specified database(s) by automatically creating the necessary publication and subscription objects. For each specified database, a PUBLICATION (FOR ALL TABLES) is created on the publisher server, and a corresponding SUBSCRIPTION is created on the subscriber server. This tool must be executed on the target server (i.e., the standby being converted). Ensure that the postgres server process is stopped on the target server before running this tool.
After successful execution, the target server is brought to a state equivalent to a freshly initialized logical replica.
The basic syntax of the tool is as follows:
pg_createsubscriber [option...] { -d | --database }dbname
{ -D | --pgdata }datadir { -P | --publisher-server }connstr
Note: pg_createsubscriber does not create any logical replication objects for databases that are not specified during its execution. Additionally, replication objects related to physical replication are dropped, and configurations related to physical replication are modified. As a result, neither logical nor physical replication is active for these databases.
Benefits
Speed up the process of logical replica creation
pg_createsubscriber combines the efficiency of physical replication with the flexibility of logical replication by transitioning an already-synchronized physical standby into a logical subscriber, without re-copying data.
It also speeds up the process of converting a physical replica to a logical replica by automating the process of promoting standby, creating a PUBLICATION/SUBSCRIPTION pair for each database, and setting replication progress.
Performance comparisons
To demonstrate the performance difference between setting up standard logical replication and using pg_createsubscriber, I considered a workload in which the upstream contains 10 tables. The size of each table is shown on the X-axis, while the Y-axis represents the time taken to complete the initial synchronization of all 10 tables in the following two cases:
- For a standard logical replication setup, the setup time is measured from the execution of the CREATE SUBSCRIPTION command to the completion of table synchronization. In this case, the GUC max_sync_workers_per_subscription is set to 4, allowing up to four tables to be synchronized in parallel.
- For setting up a logical replica using pg_createsubscriber, the total setup time includes the entire process, from the initiation of the standby creation to the completion of the pg_createsubscriber execution.
Performance comparison logical replication with pg_createsubscriber
The results show that as the size of the tables increases, setting up a logical replica is faster when a physical replica of the source server is created first and then converted into a logical replica using pg_createsubscriber.
Use cases
Setting up Logical Replication with large databases
Setting up logical replication on an instance containing large volumes of data can be time-consuming due to the resource-intensive initial synchronization phase. This process involves increased WAL retention and the overhead of multiple COPY operations, which can significantly impact performance. To overcome this, the pg_createsubscriber tool can be used to speed up the setup and reduce the load during initial synchronization.
Upgrading a PostgreSQL instance
PostgreSQL versions are supported for a period of five years from their release. Each new version introduces features, tools, and performance improvements. To take advantage of these enhancements, users need to upgrade their PostgreSQL instances. One common way to perform upgrades is by using the pg_upgrade tool. However, pg_upgrade requires the instance to be stopped, which can lead to significant downtime.
To minimize downtime, users can instead set up a logical replica and perform the upgrade on the replica. Since logical replication works across major versions, changes from the primary can continue to be synced during the upgrade process. Once the upgrade is complete, the logical replica can be promoted to act as the primary node, significantly reducing downtime.
Because PostgreSQL databases can contain large volumes of data, setting up a logical replica using traditional steps may be time-consuming. To accelerate this process, users can create a physical standby replica and then use the pg_createsubscriber tool to convert it into a logical replica efficiently.
Upgrading Streaming Replication cluster without downtime
Streaming replication in PostgreSQL provides several important benefits, including high availability, read scalability, and disaster recovery. However, performing a major version upgrade in a streaming replication setup can be challenging due to the following reasons:
- Physical replication requires both the primary and standby servers to run the same operating system and PostgreSQL major version.
- PostgreSQL supports in-place upgrades using pg_upgrade, but this process breaks physical replication.
- Reinitializing physical replication from scratch after an upgrade can introduce significant downtime.
A more effective approach is to temporarily convert the standby into a logical replica using the pg_createsubscriber utility. This enables cross-version logical replication, allowing replication to continue during the upgrade and significantly reducing downtime during the upgrade. After the upgrade is completed, a new physical replica of the upgraded instance can be created. For detailed steps, refer to my blog post How to upgrade replication clusters without downtime.
Internals of pg_createsubscriber
Here is a flowchart which describes the basic flow of the tool:
Basic flow of pg_createsubscriber
Let’s go through each step in detail.
Step 1: Verification of source and target
The process begins with a verification step, where the tool ensures that both the source and target servers are in the expected state. It first checks whether the target server is already running; if it is, an error is raised and the process is halted, as the target must be shut down before proceeding.
Once confirmed, the target server is started using specific command-line options required for the migration. Following this, the tool performs a series of prerequisite checks on both the source and target servers. These checks include validation of important GUC parameters such as max_wal_senders, max_replication_slots, wal_level, and other settings essential for replication. If any of these prerequisites are not met, the process is stopped to prevent potential issues. This verification step is crucial because if a problem arises later during subscription creation and the conversion fails, the target instance may no longer be able to become a standby server again.
Step 2: Create publications and replication slots on source
Once validated, the tool creates a PUBLICATION FOR ALL TABLES and a replication slot for each specified database on the source server. It follows a default naming pattern of pg_createsubscriber_%u_%x, where %u is the database OID and %x is a random integer. The replication slots follow the same naming pattern and are critical for maintaining data consistency. The LSN (Log Sequence Number) of the last replication slot is recorded. This LSN is used both as the recovery_target_lsn during the recovery phase and as the starting point for logical replication.
Step 3: Recovery of standby
Next, the tool writes the necessary recovery parameters to the target server’s data directory and restarts it. These parameters instruct the server to recover up to the specified LSN and then promote itself to a new primary.
Additional recovery parameters are added to avoid unexpected behavior during the recovery process, such as the end of the recovery as soon as a consistent state is reached (WAL should be applied until the replication start location), and multiple recovery targets that can cause a failure.
Once the restart is complete and recovery has reached the specified LSN, the target server is promoted.
Step 4: Create subscriptions and set replication origin on target
Following recovery, a SUBSCRIPTION is created on the target server for each specified database. These subscriptions use the replication slots created earlier in step 2, are configured to skip initial data copy, and remain disabled initially.
At this point, a pause allows pg_createsubscriber to safely advance the replication origin to the LSN where recovery is completed. This is done by calling pg_replication_origin_advance() for each subscription, using the corresponding origin name and starting LSN, to ensure that no duplicate changes are replicated.
Once the replication origins are advanced, the subscriptions are enabled, and logical replication begins precisely from the point where recovery ended.
Step 5: Update system identifiers
Finally, pg_resetwal is used to update the system identifier of the target server. This step requires the server to be stopped and ensures that the new logical replica is correctly identified as an independent system. Through this sequence of operations, pg_createsubscriber provides a reliable and resource-efficient method for transitioning from physical to logical replication.
Note
If pg_createsubscriber fails due to incorrect configuration or other reasons, the objects created during its execution—such as publications and replication slots—are automatically cleaned up. However, the target server is not cleaned up, as the objects on the target are created only after it is promoted. In such cases, the standby cannot be recovered, and a new standby must be created from scratch.
Detailed example
Let's understand it with the help of an example:
Suppose we have a streaming replication setup between two PostgreSQL nodes: Node1 and Node2.
- Node1 is the primary, running on port 5432.
- Node2 is the standby, running on port 5433.
The primary (Node1) hosts three databases: db1, db2, and db3.
Our objective is to convert Node2 from a physical standby into a logical replica for databases db1 and db2, while leaving db3 unchanged.
To begin this process, we first stop Node2. Then, on Node2, we execute the pg_createsubscriber command :
pg_createsubscriber -d db1 -d db2 -D ../node2
-P 'host=<node1_connstr> port=5432' -p 5433
Step 1: Verification of source and target
When the pg_createsubscriber command is executed, several operations take place in the background. First, on Node2, the system verifies that essential GUC parameters, such as max_active_replication_origins, max_logical_replication_workers, and max_worker_processes, are configured with the required values. These settings ensure that Node2 can handle logical replication and manage replication workers efficiently.
Similarly, on Node1, the system checks that the necessary GUC parameters, including max_wal_senders, max_replication_slots, and wal_level, are correctly configured to support logical replication and the creation of replication slots. These parameters are essential for enabling logical replication on the primary node.
Step 2: Create publications and replication slots on source
Following this, publications and replication slots are created for db1 and db2 on Node1. Each publication follows the default naming pattern pg_createsubscriber_%u_%x, where %u represents the database OID, and %x is a random integer. In this example, the publications created are named pg_createsubscriber_16387_5362dd23 and pg_createsubscriber_16388_f8f6b89c. The corresponding replication slots are also created using the same naming convention. The Log Sequence Number (LSN) of the last created replication slot is recorded and referred to as consistent_lsn. In this case, the consistent_lsn is '0/30006D0', which will be used as the recovery target for Node2.
Step 3: Recovery of standby
On Node2, recovery-related GUC parameters such as recovery_target_timeline, recovery_target_inclusive, recovery_target_action, and recovery_target_lsn are set. These parameters ensure that Node2 recovers to the consistent_lsn and is promoted from its standby state to a primary. After applying these settings, Node2 is restarted, and it recovers to the specified LSN, completing the promotion process.
Step 4: Create subscriptions and set replication origin on target
Once recovery is complete, subscriptions are created on db1 and db2 on Node2. These subscriptions are initially created in a disabled state to prevent premature replication. They are configured to use the replication slots that were created on Node1 in the previous step. After this, the replication origins on Node2 are advanced to the consistent_lsn ('0/30006D0'), ensuring that no data changes are replayed. After advancing the replication origins, the subscriptions are enabled, and logical replication is activated.
Step 5: Update system identifiers
As part of the cleanup process, any unused replication slots, such as primary replication or failover slots, are deleted. This ensures a clean replication environment, free from redundant slots that are no longer needed. Additionally, the system identifiers of both Node1 and Node2 are updated to reflect their new roles, avoiding conflicts in future operations.
To verify, let's check the catalog tables of Node1 for publications and replication slots and the catalog tables for Node2 for subscriptions.
On Node1:
db1=# select slot_name, slot_type, database from pg_replication_slots;
-[ RECORD 1 ]---------------------------------
slot_name | pg_createsubscriber_16387_5362dd23
slot_type | logical
database | db1
-[ RECORD 2 ]---------------------------------
slot_name | pg_createsubscriber_16388_f8f6b89c
slot_type | logical
database | db2
db1=# select pubname, puballtables, pubinsert, pubupdate, pubdelete, pubtruncate from pg_publication;
-[ RECORD 1 ]+-----------------------------------
pubname | pg_createsubscriber_16387_5362dd23
puballtables | t
pubinsert | t
pubupdate | t
pubdelete | t
pubtruncate | t
db2=# select pubname, puballtables, pubinsert, pubupdate, pubdelete, pubtruncate from pg_publication;
-[ RECORD 1 ]+-----------------------------------
pubname | pg_createsubscriber_16388_f8f6b89c
puballtables | t
pubinsert | t
pubupdate | t
pubdelete | t
pubtruncate | t
On Node2:
postgres=# select subdbid, subname, subslotname, subpublications from pg_subscription;
-[ RECORD 1 ]---+-------------------------------------
subdbid | 16387
subname | pg_createsubscriber_16387_5362dd23
subslotname | pg_createsubscriber_16387_5362dd23
subpublications | {pg_createsubscriber_16387_5362dd23}
-[ RECORD 2 ]---+-------------------------------------
subdbid | 16388
subname | pg_createsubscriber_16388_f8f6b89c
subslotname | pg_createsubscriber_16388_f8f6b89c
subpublications | {pg_createsubscriber_16388_f8f6b89c}
Summary
PostgreSQL 17 introduces a new tool, pg_createsubscriber, which simplifies and accelerates the creation of a logical replica by converting an existing physical standby server. This greatly streamlines the process of upgrading a streaming replication cluster.
For full documentation of the tool, refer to the PostgreSQL documentation PostgreSQL Server Applications > pg_createsubscriber, and for the commit you can check the GitHub page.
For the future
Stay tuned for our upcoming blog on pg_createsubscriber, where we’ll explore its latest features and enhancements in PostgreSQL 18.