Fujitsu Logo
ENQUIRE

    Fujitsu has worked with the PostgreSQL community to add a new feature that filters rows for logical replication. This allows users to create customized logical replication solutions that replicate row data only when it matches a specified filter expression.

    In this blog post, I will give a brief introduction to the new PostgreSQL 15 feature of publication row filters. This feature was added by the Fujitsu OSS team in close collaboration with other members of the PostgreSQL Open-Source community.

    There are several reasons why a user might choose to define logical replication row filters, such as reducing network traffic, reducing noise or eliminating bad data, replicating only data relevant to subscriber, and hiding sensitive information.

    Before we start

    Typically, when using logical replication PUB/SUB, all data changes from the published tables will be replicated to the appropriate subscribers. With the new row filter feature, the user can now limit the replication so that only row data that matches a given filter expression will be replicated.

    Feature overview

    Row filters are optionally specified per table at the time of CREATE PUBLICATION. A row filter WHERE clause is added for each table that you want to be row-filtered. Each publication might have zero or more row filters.

    Modified CREATE PUBLICATION syntax

    Here is a simplified CREATE PUBLICATION syntax, highlighting the new row filter WHERE clause added for PostgreSQL 15:

    CREATE PUBLICATION <pub-name> FOR TABLE <table-list>
    <table-list> ::= <table-list-item> [, <table-list>]
    <table-list-item> ::= <table-name> [ WHERE (<row-filter-expression>) ]

    Row filter expressions

    The row filter WHERE clause permits only simple expressions, including basic functions and logical operators. It can only refer to columns of the table that it belongs to. If a publication publishes any UPDATE or DELETE operations, then the row filter expression must contain only columns that are covered by the table’s "replica identity". If a publication publishes only INSERT operations, the row filter expression can use any column.

    Example 1

    CREATE PUBLICATION cheap_widgets
    FOR TABLE widget WHERE (price < 1.99);

    Example 2

    CREATE PUBLICATION sales_people
    FOR TABLE employees WHERE (role <> 'manager' AND dept = 'sales');

    Example 3

    CREATE PUBLICATION rate_usa    FOR TABLE exchange_rates WHERE (country = 'us');
    CREATE PUBLICATION rate_uk     FOR TABLE exchange_rates WHERE (country = 'uk');
    CREATE PUBLICATION rate_france FOR TABLE exchange_rates WHERE (country = 'fr');

    Applying row filters

    Row filters are applied before deciding to publish the change. If the row filter expression evaluates to NULL or false, then the row will not be replicated.

    Row filters are ignored for TRUNCATE TABLE commands.

    Filtered INSERTs are replicated as INSERTs.

    Filtered DELETEs are replicated as DELETEs.

    Filtered UPDATEs are more complicated because the objective is that the subscriber-side replicated table data must end up obeying the row filter expression as defined on the publisher. To decide what action is necessary to achieve this means whenever an UPDATE is processed, the row filter expression must be evaluated for both the old and the new row data (that is, before and after the UPDATE).

    • If the old row doesn't satisfy the row filter expression (we assert this row does not exist on the subscriber), but the new row does, then for data consistency the new row should be added to the subscriber — the UPDATE is transformed to an INSERT.
    • If the old row satisfies the row filter expression (we assert this row was already sent to the subscriber), but the new row doesn't, then for data consistency the old row should be removed from the subscriber — the UPDATE is transformed to a DELETE.
    UPDATE transformations
    Old row New row Replication method
    - - Don't replicate anything
    - Match filter INSERT the new row on the subscriber
    Match filter - DELETE the old row on the subscriber
    Match filter Match filter Replicate the UPDATE on the subscriber

    Combining multiple row filters

    Sometimes a subscription may subscribe to multiple publications where the same table was published with different row filter WHERE clauses. If this happens, then those expressions will be OR-ed together so that rows satisfying any of the expressions will be replicated.

    This also means that if one of subscribed publications for the table has no row filter, all other row filters for that table will be ignored.

    Initial table synchronization

    When the user elects to use the (default) copy_data = true subscription parameter, then only the pre-existing data satisfying the row filters of the publication(s) will be copied.

    Enhancements to psql commands

    As part of this new feature, the PostgreSQL 15 psql commands were also modified to provide useful information about the row filter expressions.

    The display table command (\d) now shows the WHERE clause on any publication the table is a member of. The display publication command (\dRp+) now shows the WHERE clause of any table.


    Example:

    postgres=# CREATE TABLE data(id int, rgb text);
    CREATE TABLE
    postgres=# CREATE PUBLICATION pub_data_all FOR TABLE data;
    CREATE PUBLICATION
    postgres=# CREATE PUBLICATION pub_data_blue FOR TABLE data WHERE (rgb = 'B');
    CREATE PUBLICATION
    postgres=# CREATE PUBLICATION pub_data_red FOR TABLE data WHERE (rgb = 'R');
    CREATE PUBLICATION
    postgres=# \d data
                    Table "public.data"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     id     | integer |           |          |
     rgb    | text    |           |          |
    Publications:
        "pub_data_all"
        "pub_data_blue" WHERE (rgb = 'B'::text)
        "pub_data_red" WHERE (rgb = 'R'::text)

    postgres=# \dRp+
                              Publication pub_data_all
      Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
    ----------+------------+---------+---------+---------+-----------+----------
     postgres | f          | t       | t       | t       | t         | f
    Tables:
        "public.data"

                             Publication pub_data_blue
      Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
    ----------+------------+---------+---------+---------+-----------+----------
     postgres | f          | t       | t       | t       | t         | f
    Tables:
        "public.data" WHERE (rgb = 'B'::text)

                              Publication pub_data_red
      Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
    ----------+------------+---------+---------+---------+-----------+----------
     postgres | f          | t       | t       | t       | t         | f
    Tables:
        "public.data" WHERE (rgb = 'R'::text)

    postgres=#

    Effect

    There are several reasons why a user might choose to define logical replication row filters:

    • To reduce network traffic (increase performance) by replicating only a small subset of a large data table.
      Example:

    CREATE PUBLICATION pub_data_2205
    FOR TABLE bank_transactions WHERE (year = 2022 AND month = 'May');

    • To reduce noise or eliminate bad data.
      Example:

    CREATE PUBLICATION pub_data_good
    FOR TABLE sensor WHERE (value IS NOT NULL AND value > 0);

    •  To provide only the data that is relevant to a subscriber node.
      Example:

    CREATE PUBLICATION pub_data_new_york
    FOR TABLE weather WHERE (state = 'NY');

    • To act as a form of security by hiding (not replicating) sensitive information.
      Example:

    CREATE PUBLICATION pub_data_salary
    FOR TABLE employees WHERE (role <> 'manager');

    For the future

    The addition of row filtering was a major feature for PostgreSQL 15 logical replication. This feature gives another tool for the user’s toolbox, allowing them to create more sophisticated or tailored logical replication solutions. The Fujitsu OSS team will continue to help enhance and add more capabilities to PostgreSQL logical replication.

    A related feature to row filters, which was also added in PostgreSQL 15, was logical replication column lists. This will be the topic of an upcoming blog from one of my colleagues.

    img-man-using-laptop-with-overlaid-file-and-magnifying-glass-icons-01Where you can find more information

    Subscribe to be notified of future blog posts
    If you would like to be notified of my next blog posts and other PostgreSQL-related articles, fill the form here.
    We also have a series of technical articles for PostgreSQL enthusiasts of all stripes, with tips and how-to's.

     

    Explore PostgreSQL Insider >

    Topics: PostgreSQL, PostgreSQL Community, Logical Replication

    Receive our blog

    Receive notification of PostgreSQL-based articles for business and technical audiences.

    Search by topic

    see all >

    Read our latest blogs

    Read our most recent articles regarding all aspects of PostgreSQL and Fujitsu Enterprise Postgres.