
      The PostgreSQL Development Conference 2024 was held earlier this year for the community to nurture the further expansion of PostgreSQL. Fujitsu's OSS team was delighted to give 2 talks in this year's exciting line-up that highlights topics on PostgreSQL development and community growth, featuring stories from users, developers, and community organizers.

      I attended the PostgreSQL Development Conference 2024, which was held in Vancouver, Canada. In this article, I will mainly introduce the content of the session I gave.

      This was my second time attending an international conference, following last year, where I introduced my work on logical replication. I once again realized that PostgreSQL's scalability holds infinite possibilities.

      What is the PostgreSQL Development Conference (PGConf.dev)?

      Port of Vancouver near the conference venue

      PGConf.dev is an international conference where PostgreSQL developers and community managers gather to give talks and hold discussions.

      It stands out from general technical events by placing a strong emphasis on fostering meaningful interactions between PostgreSQL developers and community managers. Unlike conferences that are saturated with corporate advertising, PGConf.dev provides a platform where professionals can engage in insightful discussions, share innovative ideas, and collaborate on advancing the PostgreSQL ecosystem.

      PGConf.dev is the successor to the PostgreSQL Conference (PGCon), which was held until last year, and it continues the tradition of bringing together experts and enthusiasts in the field. This year, it was held at Simon Fraser University (SFU) in Vancouver, Canada. The venue not only offered an environment conducive to learning and collaboration but also added a touch of academic charm to the conference setting.

      Three of us from Fujitsu (Amit Kapila, Zhijie Hou, and I) attended and gave two talks.

      My session: New features added to logical replication

      My talk was mainly about logical replication.

      Logical replication is a mechanism that extracts changes made to data and replicates them to another PostgreSQL server. A well-known replication feature of PostgreSQL is streaming replication, but this feature requires that the physical representation of data be consistent between nodes, so replication cannot be performed on heterogeneous operating systems or between different major versions of PostgreSQL. Logical replication relaxes these restrictions, making it possible to build a more flexible system.

      Differences between streaming replication and logical replication

      | Item                   | Streaming replication                  | Logical replication                                  |
      |------------------------|----------------------------------------|------------------------------------------------------|
      | Instance terminology   | Primary / standby                      | Publisher / subscriber                               |
      | Type of content sent   | Exact WAL records                      | Replication messages, information extracted from WAL |
      | Initial synchronization| pg_basebackup                          | Automatic                                            |
      | Replication target     | DB cluster                             | Database                                             |
      | What downstream can do | Read-only queries                      | Read and write queries                               |
      | Environments           | OS and major versions must be the same | Can be different                                     |
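
      To make the publisher / subscriber terminology concrete, here is a minimal sketch of a logical replication setup built on the standard CREATE PUBLICATION and CREATE SUBSCRIPTION statements. The table, publication, and subscription names and the connection strings are assumptions chosen for illustration. The functions only print the SQL, so it can be reviewed before being piped to psql.

```shell
#!/bin/sh
# Hypothetical names: table "orders", publication "pub_orders",
# subscription "sub_orders", and every connection string below.

# SQL to run on the publisher: publish changes made to one table.
publisher_sql() {
    printf 'CREATE PUBLICATION pub_orders FOR TABLE orders;\n'
}

# SQL to run on the subscriber. The table must already exist there,
# because logical replication does not copy table definitions.
# $1 is the connection string the subscriber uses to reach the publisher.
subscriber_sql() {
    printf "CREATE SUBSCRIPTION sub_orders CONNECTION '%s' PUBLICATION pub_orders;\n" "$1"
}

# To apply the statements (not executed here):
#   publisher_sql | psql "host=pub.example.com dbname=shop"
#   subscriber_sql "host=pub.example.com dbname=shop" \
#       | psql "host=sub.example.com dbname=shop"
```

      Creating the subscription is what triggers the automatic initial synchronization listed in the table.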

      Starting with PostgreSQL 17, a new server application for creating logical standbys (subscribers) will be added, and pg_upgrade will be usable without destroying the logical replication configuration. I explained these new features of the upcoming version.

      Resolving the issues of setting up new logical replication in large-scale environments

      Logical replication is under active development, but it still has some problems.

      One of them is that it is difficult to set up new logical replication in a large-scale environment. In logical replication, a COPY statement is first issued for all target tables to perform initial data synchronization. Therefore, initial data synchronization may take a long time depending on the number of tables and the amount of data involved.

      In addition, since logical replication needs to keep the WAL generated during synchronization, if the synchronization time is too long, the WAL storage disk may become full and the server process may crash.

      Known challenges in creating a subscriber

      • Takes a long time
        • Initial synchronization runs COPY command per table
        • Estimated execution time is proportional to the number of tables
      • Requires additional disk resources
        • Replication slots are created while copying data
        • Generated WAL files are preserved
        • They may fill up disk, which is very problematic
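
      The disk usage concern above can be watched by comparing a slot's restart_lsn with the current WAL position. The sketch below does the arithmetic in plain shell, assuming the usual textual LSN form of two hexadecimal numbers separated by a slash. The function names and sample values are illustrative, and on a live server the built-in pg_wal_lsn_diff() function gives the same difference.

```shell
#!/bin/sh
# Convert an LSN written as "HI/LO" (two hexadecimal numbers)
# into an absolute byte position: HI * 2^32 + LO.
lsn_to_bytes() {
    hi=${1%/*}
    lo=${1#*/}
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes of WAL between the current position ($1) and a slot's
# restart_lsn ($2), i.e. roughly what the slot forces the server to keep.
retained_wal_bytes() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Sample values only. On a publisher they could be read with:
#   psql -Atc "SELECT pg_current_wal_lsn()"
#   psql -Atc "SELECT restart_lsn FROM pg_replication_slots"
retained_wal_bytes "1/10" "0/FFFFFFF0"   # prints 32
```

      A value that keeps growing during a long initial synchronization is the signal that the WAL disk may fill up.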

       

      Therefore, we focused on read replicas (asynchronous physical standbys) that may exist in the system, and developed pg_createsubscriber, a server application that converts physical standbys into subscribers.

      Since the problem is caused by having to copy a large amount of data from scratch, the time required for initial synchronization can be reduced by first letting streaming replication bring a standby up to date, and then building logical replication on that node, which is already following the changes. The problem of large amounts of retained WAL is solved by not performing initial data synchronization with the COPY statement in the first place.

      New server application pg_createsubscriber in PostgreSQL 17

      • Converts physical standby into logical subscriber
        • Confirms the standby is caught up at the correct point, then
        • Defines subscriptions on the standby
      • Done by introducing a new server application
          • Must be executed on the standby server

            $ pg_createsubscriber [option...] { -d | --database         } dbname
                                              { -D | --pgdata           } datadir
                                              { -P | --publisher-server } connstr

      • Pushed on HEAD
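
      As an illustration of the synopsis above, the following sketch assembles a pg_createsubscriber command line with the three required options plus --dry-run, which checks the setup without modifying the target data directory. The data directory, connection string, and database name are assumptions. The script prints the command instead of running it so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical values; adjust them to the actual environment.
STANDBY_DATADIR=/var/lib/pgsql/17/standby
PUBLISHER_CONNSTR='host=primary.example.com port=5432 user=repl'
DBNAME=appdb

# Print the command line rather than execute it.
# It is meant to be run on the standby server, as noted above.
createsubscriber_cmd() {
    printf 'pg_createsubscriber --dry-run --database %s --pgdata %s --publisher-server "%s"\n' \
        "$DBNAME" "$STANDBY_DATADIR" "$PUBLISHER_CONNSTR"
}

createsubscriber_cmd
```

      Once the dry run reports no problems, the same command without --dry-run performs the actual conversion.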

      Simplifying the upgrade process for logical replication clusters

      PostgreSQL 17 also solves another issue that logical replication had: it is not (practically) compatible with pg_upgrade.

      When building a logical replication environment, objects such as replication slots and replication origins are generated. These are necessary to record the replication status, such as the WAL transmission status and application status, but because they are node-specific information, they are not migrated by upgrades using pg_upgrade. Therefore, after upgrading a node that participates in logical replication, users had to manually recreate these internal objects.

      For this reason, we have improved pg_upgrade to read and recreate these internal objects. This feature makes it possible to resume replication automatically even after upgrading a node that participates in logical replication.
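
      A rough sketch of how such an upgrade might be checked is shown below. It prints a pg_upgrade --check command and two catalog queries for confirming afterwards that replication slots and subscriptions are still present. All paths and version numbers are hypothetical. The PostgreSQL documentation notes that migrating logical replication slots requires the old cluster to be version 17 or later, which is why the sketch upgrades from 17.

```shell
#!/bin/sh
# Hypothetical installation and data directory paths.
OLD_BIN=/usr/pgsql-17/bin
NEW_BIN=/usr/pgsql-18/bin
OLD_DATA=/var/lib/pgsql/17/data
NEW_DATA=/var/lib/pgsql/18/data

# Print a compatibility check command; --check does not change any data.
upgrade_check_cmd() {
    printf '%s/pg_upgrade --check --old-bindir %s --new-bindir %s --old-datadir %s --new-datadir %s\n' \
        "$NEW_BIN" "$OLD_BIN" "$NEW_BIN" "$OLD_DATA" "$NEW_DATA"
}

# Query for the upgraded publisher: are the logical slots still there?
verify_publisher_sql() {
    printf 'SELECT slot_name, plugin, active FROM pg_replication_slots;\n'
}

# Query for the upgraded subscriber: are the subscriptions still defined?
verify_subscriber_sql() {
    printf 'SELECT subname, subenabled FROM pg_subscription;\n'
}

upgrade_check_cmd
```

      The two query functions can be piped to psql on the respective nodes once the real upgrade has finished.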

      Wrapping up

      This was my second time attending an international conference, following last year, where I introduced my work on logical replication.

      Because PGConf.dev focuses on interaction between developers, I was able to discuss solutions to logical replication issues with the participating developers, and I was happy to have insightful discussions with my fellow professionals there. I once again realized that PostgreSQL's scalability holds infinite possibilities.

      Our team at Fujitsu hopes to continue to actively propose ideas and participate in discussions at conferences, further contributing to the development of PostgreSQL.

      Further information

      For details about the talks at PGConf.dev 2024, please see below.

       

      Topics: PostgreSQL community, PostgreSQL development, Logical replication, PostgreSQL event

      Hayato Kuroda
      Software Development Engineer, Fujitsu OSS PostgreSQL team
      Kuroda has been working on the development of PostgreSQL and Fujitsu Enterprise Postgres.

      He has experience in enhancements for ECPG, postgres_fdw, and logical replication modules. He continues to attend Postgres conferences to share his expertise with the community.