Fujitsu Logo
ENQUIRE

    In this blog post I conclude my discussion on the new mechanisms available in PostgreSQL to handle logical replication conflicts by explaining how to use the ALTER SUBSCRIPTION SKIP command.

    In my previous blog post, I’ve written about the logical replication conflict and one mechanism introduced to avoid it - pg_replication_origin_advance.
    In this post, I’ll explain another measure to handle conflicts, using ALTER SUBSCRIPTION SKIP command

    This feature has a set of internal safeguards and inspections specific to the conflict handling. As a result, we don’t have the risk to skip non-failing transactions on conflict.

    Before we start

    My previous blog post has described what logical replication conflict is, how it happens, and how useful PostgreSQL 15 new features developed by the community are for its resolution. In parallel, there was a separate discussion for handling conflicts. The OSS community actively pursued the completion of this project, and we have built ALTER SUBSCRIPTION SKIP feature1 in the PostgreSQL code base successfully. As the documentation clarifies, this feature works to skip applying all changes of a remote transaction. This may sound similar to the functionality of pg_replication_origin_advance. Therefore, in this blog, I’d like to focus on ALTER SUBSCRIPTION SKIP to bring the differences into the open.

    In that post, I explained more background and related new features. If you haven’t already, it may be worthwhile to have a look at the article first. I will be applying ALTER SUBSCRIPTION SKIP to a scenario similar to the one used in the previous post for pg_replication_origin_advance.

    Disclaimer: Please note that this work is still not released, and the community can either decide to change its design or completely revert it.

    The ALTER SUBSCRIPTION SKIP command

    The purpose of this command is to skip a conflicting transaction finished at its specified LSN on the subscriber. The user can specify finish LSN by ‘lsn’ skip_option to indicate the failed transaction. The apply worker receives changes from the publisher and judges whether those should be skipped or not on the node. For this command input, we can use the error context’s finish LSN directly. Thus, no editing of an exported LSN on the server log is required, unlike the argument of pg_replication_origin_advance. In this sense, using ALTER SUBSCRIPTION SKIP is simpler. We’ll have a look at one demonstration below.

    Observing conflict and resolution by skipping a transaction (ALTER SUBSCRIPTION SKIP command)

    1 On the publisher side, create a table and a publication.

    postgres=# CREATE TABLE tab (id integer);
    CREATE TABLE
    postgres=# INSERT INTO tab VALUES (5);
    INSERT 0 1
    postgres=# CREATE PUBLICATION mypub FOR TABLE tab;
    CREATE PUBLICATION

    We now have one record for initial table synchronization.

    2 On the subscriber side, create a table with a unique constraint and a subscription.

    postgres=# CREATE TABLE tab (id integer UNIQUE);
    CREATE TABLE
    postgres=# CREATE SUBSCRIPTION mysub CONNECTION '…' PUBLICATION mypub WITH (disable_on_error = true);
    NOTICE: created replication slot "mysub" on publisher
    CREATE SUBSCRIPTION

    We have created a subscription with disable_on_error enabled. This subscription definition will cause the initial table synchronization in the background, which will succeed without any issues.

    As a result, the value 5 will be inserted on the subscriber.

    3 On the publisher side, execute three transactions in succession after the table synchronization.

    postgres=# BEGIN; -- Txn1
    BEGIN
    postgres=*# INSERT INTO tab VALUES (1);
    INSERT 0 1
    postgres=*# COMMIT;
    COMMIT
    postgres=# BEGIN; -- Txn2
    BEGIN
    postgres=*# INSERT INTO tab VALUES (generate_series(2, 8));
    INSERT 0 7
    postgres=*# COMMIT;
    COMMIT
    postgres=# BEGIN; -- Txn3
    BEGIN
    postgres=*# INSERT INTO tab VALUES (9);
    INSERT 0 1
    postgres=*# COMMIT;
    COMMIT
    postgres=# SELECT * FROM tab;
    id
    ----
    5
    1
    2
    3
    4
    5
    6
    7
    8
    9
    (10 rows)

    On the publisher, we could execute these transactions successfully. However, Txn2 includes duplicate data, which was already inserted at the time of the initial table synchronization. Thus, on the subscriber, this violates the unique constraint on the table and throws an error.

    4 On the subscriber side, check the current replication status.

    postgres=# SELECT subname, subenabled FROM pg_subscription;
     subname | subenabled
    ---------+------------
     mysub   | f
    (1 row)

    postgres=# SELECT * FROM tab;
    id
    ----
    5
    1
    (2 rows)

    Now, we’ll have a look at the current status on the subscriber. According to the behavior of  the disable_on_error option2, the subscription got disabled automatically by the error. Only Txn1 (and data replicated by the initial table synchronization) has been replicated on the subscriber. Here we can’t see the results of Txn2 and Txn3 because of the conflict. Txn3 will only get replayed after we resolve the conflict.

    5  On the subscriber's log, we see the error message of this conflict and log of disable_on_error option.

    ERROR: duplicate key value violates unique constraint "tab_id_key"
    DETAIL: Key (id)=(5) already exists.
    CONTEXT: processing remote data for replication origin "pg_16389" during "INSERT" for replication target relation "public.tab" in transaction 730 finished at 0/1566D10
    LOG: logical replication subscription "mysub" has been disabled due to an error

    In the subscriber’s server log, we can get the finish LSN for the argument of ALTER SUBSCRIPTION SKIP command. I’ll utilize this finish LSN to skip Txn2 as below.

    6 On the subscriber side, execute the ALTER SUBSCRIPTION SKIP command and enable the subscription.

    postgres=# ALTER SUBSCRIPTION mysub SKIP (lsn = '0/1566D10');
    ALTER SUBSCRIPTION
    postgres=# SELECT subname, subskiplsn, subenabled FROM pg_subscription;
     subname | subskiplsn | subenabled
    ---------+------------+------------
     mysub   |  0/1566D10 | f
    (1 row)

    postgres=# ALTER SUBSCRIPTION mysub ENABLE;
    ALTER SUBSCRIPTION
    postgres=# SELECT * FROM tab;
     id
    ----
     5
     1
     9
    (3 rows)

    postgres=# SELECT subname, subskiplsn, subenabled FROM pg_subscription;
     subname | subskiplsn | subenabled
    ---------+------------+------------
     mysub   |  0/0   | t
    (1 row)

    Here, I set the skip LSN and re-activated the subscription. Immediately, we see the replicated value of Txn3, without Txn2’s values. We have skipped the whole transaction. Along with this, the subskiplsn has been cleared, after we succeeded in skipping the changes.

    7 On the subscriber log, we can see the command’s log message of successful completion.

    LOG: start skipping logical replication transaction finished at 0/1566D10
    CONTEXT: processing remote data for replication origin "pg_16389" during "BEGIN" in transaction 730 finished at 0/1566D10
    LOG: done skipping logical replication transaction finished at 0/1566D10
    CONTEXT: processing remote data for replication origin "pg_16389" during "COMMIT" in transaction 730 finished at 0/1566D10

    We can see the logs of this success from the server log also.

    Advanced safeguards of ALTER SUBSCRIPTION SKIP

    This feature has a set of internal safeguards and inspections specific to the conflict handling. As a result, we don’t have the risk to skip non-failing transactions on conflict.

    Compare this with the example of misuse of pg_replication_origin_advance that I explained in my previous blog post How to handle logical replication conflicts in PostgreSQL for comparison. There, I touched on one aspect that this function could skip irrelevant successful transactions. But, this doesn’t happen with this ALTER SUBSCRIPTION SKIP feature. We achieve this by some internal checks below.

    Firstly, the specified LSN must be greater than the origin LSN which points to the location up to which data has been replicated on the subscriber.

    Further, this feature requires an exact match between the finish LSN of the argument and that of the first transaction which the subscriber gets from the publisher. Otherwise (after enabling the subscription), PostgreSQL will try to apply the failed transaction again and cause the same conflict instantly. This won’t happen in PostgreSQL 15, because of the improvement in the error message, which now includes the finish LSN. Since the correct finish LSN is available, we just need to specify it.

    Also, when we set a skip LSN without a conflict and succeed in applying a transaction, then the skip LSN is just cleared with a warning message. This makes sense since the incoming transaction turned out to be non-problematic, and a user may have set the LSN by mistake. The skip LSN may be set again at another proper time when a new conflict happens, and the user decides to adopt this measure.

    If you’re interested in details about the internal working, you can check the commit log3.

    Wrapping up

    For handling of logical replication conflicts, the community has dedicated much time and effort to creating this new convenient approach. It’s great that we have a choice to enjoy a simpler and safer method.

    In addition, several ideas have been already proposed to enhance this feature from various aspects in the community. For instance, there is an idea to add other types of skip_option to ALTER SUBSCRIPTION SKIP command, like indicating relations or commands (INSERT, DELETE,..) to be skipped. Another proposed idea is to store error-related information such as finished LSN to the system catalog or to record data the subscriber skipped somewhere in the server logs or tables.

    All of those enhancements sound attractive and I’m looking forward to what kind of new features we will have in the coming versions of PostgreSQL.

     


    References in this post:

     

     

    Subscribe to be notified of future blog posts
    If you would like to be notified of my next blog posts and other PostgreSQL-related articles, fill the form here.
    We also have a series of technical articles for PostgreSQL enthusiasts of all stripes, with tips and how-to's.

     

    Explore PostgreSQL Insider >

    Topics: PostgreSQL, PostgreSQL Community, Logical Replication

    Receive our blog

    Receive notification of PostgreSQL-based articles for business and technical audiences.

    Search by topic

    see all >

    Read our latest blogs

    Read our most recent articles regarding all aspects of PostgreSQL and Fujitsu Enterprise Postgres.