Start  trial

    Start trial

      Fujitsu has worked with the PostgreSQL community to add a new feature that filters rows for logical replication. This allows users to create customized logical replication solutions that replicate row data only when it matches a specified filter expression.

      In this blog post, I will give a brief introduction to the new PostgreSQL 15 feature of publication row filters. This feature was added by the Fujitsu OSS team in close collaboration with other members of the PostgreSQL Open-Source community.

      There are several reasons why a user might choose to define logical replication row filters, such as reducing network traffic, reducing noise or eliminating bad data, replicating only data relevant to subscriber, and hiding sensitive information.

      Before we start

      Typically, when using logical replication PUB/SUB, all data changes from the published tables will be replicated to the appropriate subscribers. With the new row filter feature, the user can now limit the replication so that only row data that matches a given filter expression will be replicated.

      Feature overview

      Row filters are optionally specified per table at the time of CREATE PUBLICATION. A row filter WHERE clause is added for each table that you want to be row-filtered. Each publication might have zero or more row filters.

      Modified CREATE PUBLICATION syntax

      Here is a simplified CREATE PUBLICATION syntax, highlighting the new row filter WHERE clause added for PostgreSQL 15:

      CREATE PUBLICATION <pub-name> FOR TABLE <table-list>
      <table-list> ::= <table-list-item> [, <table-list>]
      <table-list-item> ::= <table-name> [ WHERE (<row-filter-expression>) ]

      Row filter expressions

      The row filter WHERE clause permits only simple expressions, including basic functions and logical operators. It can only refer to columns of the table that it belongs to. If a publication publishes any UPDATE or DELETE operations, then the row filter expression must contain only columns that are covered by the table’s "replica identity". If a publication publishes only INSERT operations, the row filter expression can use any column.

      Example 1

      CREATE PUBLICATION cheap_widgets
      FOR TABLE widget WHERE (price < 1.99);

      Example 2

      CREATE PUBLICATION sales_people
      FOR TABLE employees WHERE (role <> 'manager' AND dept = 'sales');

      Example 3

      CREATE PUBLICATION rate_usa    FOR TABLE exchange_rates WHERE (country = 'us');
      CREATE PUBLICATION rate_uk     FOR TABLE exchange_rates WHERE (country = 'uk');
      CREATE PUBLICATION rate_france FOR TABLE exchange_rates WHERE (country = 'fr');

      Applying row filters

      Row filters are applied before deciding to publish the change. If the row filter expression evaluates to NULL or false, then the row will not be replicated.

      Row filters are ignored for TRUNCATE TABLE commands.

      Filtered INSERTs are replicated as INSERTs.

      Filtered DELETEs are replicated as DELETEs.

      Filtered UPDATEs are more complicated because the objective is that the subscriber-side replicated table data must end up obeying the row filter expression as defined on the publisher. To decide what action is necessary to achieve this means whenever an UPDATE is processed, the row filter expression must be evaluated for both the old and the new row data (that is, before and after the UPDATE).

      • If the old row doesn't satisfy the row filter expression (we assert this row does not exist on the subscriber), but the new row does, then for data consistency the new row should be added to the subscriber — the UPDATE is transformed to an INSERT.
      • If the old row satisfies the row filter expression (we assert this row was already sent to the subscriber), but the new row doesn't, then for data consistency the old row should be removed from the subscriber — the UPDATE is transformed to a DELETE.
      UPDATE transformations
      Old row New row Replication method
      - - Don't replicate anything
      - Match filter INSERT the new row on the subscriber
      Match filter - DELETE the old row on the subscriber
      Match filter Match filter Replicate the UPDATE on the subscriber

      Combining multiple row filters

      Sometimes a subscription may subscribe to multiple publications where the same table was published with different row filter WHERE clauses. If this happens, then those expressions will be OR-ed together so that rows satisfying any of the expressions will be replicated.

      This also means that if one of subscribed publications for the table has no row filter, all other row filters for that table will be ignored.

      Initial table synchronization

      When the user elects to use the (default) copy_data = true subscription parameter, then only the pre-existing data satisfying the row filters of the publication(s) will be copied.

      Enhancements to psql commands

      As part of this new feature, the PostgreSQL 15 psql commands were also modified to provide useful information about the row filter expressions.

      The display table command (\d) now shows the WHERE clause on any publication the table is a member of. The display publication command (\dRp+) now shows the WHERE clause of any table.


      Example:

      postgres=# CREATE TABLE data(id int, rgb text);
      CREATE TABLE
      postgres=# CREATE PUBLICATION pub_data_all FOR TABLE data;
      CREATE PUBLICATION
      postgres=# CREATE PUBLICATION pub_data_blue FOR TABLE data WHERE (rgb = 'B');
      CREATE PUBLICATION
      postgres=# CREATE PUBLICATION pub_data_red FOR TABLE data WHERE (rgb = 'R');
      CREATE PUBLICATION
      postgres=# \d data
                      Table "public.data"
       Column |  Type   | Collation | Nullable | Default
      --------+---------+-----------+----------+---------
       id     | integer |           |          |
       rgb    | text    |           |          |
      Publications:
          "pub_data_all"
          "pub_data_blue" WHERE (rgb = 'B'::text)
          "pub_data_red" WHERE (rgb = 'R'::text)

      postgres=# \dRp+
                                Publication pub_data_all
        Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
      ----------+------------+---------+---------+---------+-----------+----------
       postgres | f          | t       | t       | t       | t         | f
      Tables:
          "public.data"

                               Publication pub_data_blue
        Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
      ----------+------------+---------+---------+---------+-----------+----------
       postgres | f          | t       | t       | t       | t         | f
      Tables:
          "public.data" WHERE (rgb = 'B'::text)

                                Publication pub_data_red
        Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
      ----------+------------+---------+---------+---------+-----------+----------
       postgres | f          | t       | t       | t       | t         | f
      Tables:
          "public.data" WHERE (rgb = 'R'::text)

      postgres=#

      Effect

      There are several reasons why a user might choose to define logical replication row filters:

      • To reduce network traffic (increase performance) by replicating only a small subset of a large data table.
        Example:

      CREATE PUBLICATION pub_data_2205
      FOR TABLE bank_transactions WHERE (year = 2022 AND month = 'May');

      • To reduce noise or eliminate bad data.
        Example:

      CREATE PUBLICATION pub_data_good
      FOR TABLE sensor WHERE (value IS NOT NULL AND value > 0);

      •  To provide only the data that is relevant to a subscriber node.
        Example:

      CREATE PUBLICATION pub_data_new_york
      FOR TABLE weather WHERE (state = 'NY');

      • To act as a form of security by hiding (not replicating) sensitive information.
        Example:

      CREATE PUBLICATION pub_data_salary
      FOR TABLE employees WHERE (role <> 'manager');

      For the future

      The addition of row filtering was a major feature for PostgreSQL 15 logical replication. This feature gives another tool for the user’s toolbox, allowing them to create more sophisticated or tailored logical replication solutions. The Fujitsu OSS team will continue to help enhance and add more capabilities to PostgreSQL logical replication.

      A related feature to row filters, which was also added in PostgreSQL 15, was logical replication column lists. This will be the topic of an upcoming blog from one of my colleagues.

      img-man-using-laptop-with-overlaid-file-and-magnifying-glass-icons-01Where you can find more information

      Subscribe to be notified of future blog posts
      If you would like to be notified of my next blog posts and other PostgreSQL-related articles, fill the form here.
      We also have a series of technical articles for PostgreSQL enthusiasts of all stripes, with tips and how-to's.

       

      Explore PostgreSQL Insider >

      Topics: PostgreSQL, PostgreSQL community, Logical replication

      Receive our blog

      Fill the form to receive notifications of future posts

      Search by topic

      see all >

      Read our latest blogs

      Read our most recent articles regarding all aspects of PostgreSQL and Fujitsu Enterprise Postgres.

      Receive our blog

      Fill the form to receive notifications of future posts

      Search by topic

      see all >