<img height="1" width="1" style="display:none;" alt="" src="https://px.ads.linkedin.com/collect/?pid=2826169&amp;fmt=gif">
Start  trial

    Start trial

      PostgreSQL's new load balancing feature in libpq enhances read query scaling and optimizes database performance.

      Before we start…

      It is widely known that PostgreSQL is a popular open-source relational database management system, and before I share the details about the support for load balancing in libpq, I would just like to mention that PostgreSQL was voted DBMS of the Year 2023.

      A huge contributing factor for this impressive achievement is surely the array of improvements added to PostgreSQL 16.

      This version comes with a series of improvements such as query parallelism, bulk data loading, and logical replication, along with numerous features for both developers and administrators. These include expanded SQL/JSON syntax, new monitoring statistics for workloads, and increased flexibility in setting access control rules for management of policies across large fleets. You can view the full list in the PostgreSQL website here.

      Introducing load balancing in libpq

      Here, I'll discuss one of the most eagerly awaited features of this release: support for load balancing in libpq, which will facilitate PostgreSQL read query scaling.

      Now, let's dive in….

      What is libpq?

      libpq is the official C application programming interface to PostgreSQL. It provides a set of library functions that allow client applications to communicate with a PostgreSQL server. Popular utilities like psql, pgbench, and many more use libpq to communicate with PostgreSQL.

      Other application interfaces, such as C++, Perl, Python, TCL, and ECPG, are built on libpq and inherit some of its behaviour and environment configuration.

      What is load balancing?

      To put it simply, load balancing is the process of distributing the connections between multiple read replicas while handling a large number of read requests.

      It is possible to achieve load balancing in two primary ways:

      • At the client level by using the JDBC, which is a very popular Java API, to establish the connection and execute the query with the database.
      • By using a middleware load balancer like HAproxy or pgpool-II, which distribute the requests across multiple read replicas.

      To use libpq to distribute the load across multiple servers, a new variable called load_balance_hosts has been introduced in PostgreSQL 16. So, effective load balancing across multiple PostgreSQL instances becomes possible by this functionality.

      This feature ensures that PostgreSQL deployments are more robust and efficient than ever before, whether it is through distributing read operations or ensuring that no single instance is overloaded. There are 2 possible values for load_balance_hosts, as follows:

      • disable: No load balancing is performed across the multiple hosts.
      • random: Hosts and addresses are tried in random order.

        This value is mostly useful when opening multiple connections at the same time, possibly from different machines. This way connections can be load balanced across multiple PostgreSQL servers.

      Putting it to the test

      Let's test this functionality with 3 nodes of PostgreSQL 16 and with the default postgres database as below. This describes the current support model and soon we will release a new support model with different support options.

      Host name IP address Postgres port Database user Role
      test_node1 192.168.10.107 27500 test_user Primary
      test_node2 192.168.10.102 27501 test_user Standby1
      test_node3 192.168.10.100 27502 test_user Standby2

      Test scenario1: load_balance_hosts=disable

      Let's try to establish the connections with all three active hosts provided in the connection string.

      img-dgm-load-balancing-in-libpq-01-step-1

      [postgres@test-nodel ~]$ psql 'host=192.168.10.107, 192.168.10.102, 192.168.10.100
                                     port=27500,27501,27502 dbname=postgres user=postgres'
                                    -c "SELECT inet server addr(), inet server port();"
       inet_server addr | inet_server port
      ------------------+------------------
      192.168.10.107    |            27500
      (1 row)

      The result is quite as expected: PostgreSQL only made a successful connection to the first available destination - test_node1 (192.168.10.107:27500).

      Now, let's stop the PostgreSQL service on test_node 1 (host 192.168.10.107) and observe the connection attempt behavior.

      img-dgm-load-balancing-in-libpq-01-step-2

      [postgres@test-nodel ~]$ pg_ctl -D /opt/pgsqll6 stop
      waiting for server to shut down.... done
      server stopped
      [postgres@test-nodel ~]$ psql 'host=192.168.10.107,192.166.10.102, 192.168.10.100
                                    port=27500,27501,27502 dbname=postgres user=postgres'
                                     -c "SELECT inet_server_addr(), inet_server_port();"
       inet_server addr | inet_server port
      ------------------+------------------
      192.168.10.102    |            27501
      (1 row)

      Again, the result is quite as expected: PostgreSQL only made a successful connection to the next available destination - test_node2 (192.168.10.102:27501).

      Finally, let's stop the PostgreSQL service on test_node2 (host 192.168.10.102) and try to connect again.

      img-dgm-load-balancing-in-libpq-01-step-3

      [postgres@test-node2 ~]$ pg_ctl -D /opt/pgsqll6 stop
      waiting for server to shut down.... done
      server stopped
      [postgres@test-nodel ~]$ psql 'host=192.168.10.107,192.166.10.102, 192.168.10.100
                                    port=27500,27501,27502 dbname=postgres user=postgres'
                                     -c "SELECT inet_server_addr(), inet_server_port();"
       inet_server addr |  inet_server port
      -------------------+-------------------
      192.168.10.100     |             27502
      (1 row)

      We can see that we manage a successful connection with the only remaining host - test_node3 (192.168.10.100:27502).

      Test scenario 2: load_balance_hosts=random

      As we set load_balance_hosts to random, the connections are randomly assigned to the available destinations, resulting in connection distribution across the multiple available destinations.

      [postgres@test-nodel ~]$ psql 'host=192.168.10.107,192.168.10.102,192.168.10.100
                                     port=27500,27501,27502 dbname=postgres user=postgres
                                     load_balance_hosts=random' -c "SELECT inet_server_addr(), inet_server port();"
      inet_server_ addr | inet server port
      -------------------+------------------
      192.168.10.102    |            27501
      (1 row)

      [postgres@test-nodel ~]$ psql 'host=192.168.10.107,192.168.10.102,192.168.10.100
                                     port=27500,27501,27502 dbname=postgres user=postgres
                                     load_balance_hosts=random' -c "SELECT inet_server_addr(), inet_server port();"
      inet_server_ addr | inet server port
      -------------------+------------------
      192.168.10.107    |            27500
      (1 row)

      [postgres@test-nodel ~]$ psql 'host=192.168.10.107,192.168.10.102,192.168.10.100
                                     port=27500,27501,27502 dbname=postgres user=postgres
                                     load_balance_hosts=random' -c "SELECT inet_server_addr(), inet_server port();"
      inet_server_ addr | inet server port
      -------------------+------------------
      192.168.10.100    |            27502
      (1 row)

      If we stop the PostgreSQL service on test_node1 (host 192.168.10.107), then connections will be randomly assigned to the remaining available hosts - test_node2 (192.168.10.102) and test_node3 (192.168.10.100).

      Summary

      By distributing database connections across multiple servers, developers and database administrators will be able to reduce the load on individual servers particularly replicas and improve system availability and performance.

       

      Topics: PostgreSQL, Database performance, Load balancing, libpq

      Receive our blog

      Search by topic

      Posts by Tag

      See all
      Fujitsu Enterprise Postgres
      The hybrid multi-cloud Postgres backed by Fujitsu
      Nishchay Kothari
      Technical Consultant, Fujitsu Enterprise Postgres Center of Excellence
      Nishchay Kothari is an outstanding technical consultant with over 13 years of expertise in relational database management systems (RDBMS). Nishchay has experience with a wide range of database technologies, including PostgreSQL, SQL Server, and Oracle.
      Nishchay has positioned himself as a go-to resource for organizations wanting to optimize their database infrastructure and architectural solutions driven by his passion for addressing complicated technological challenges.

      Receive our blog

      Fill the form to receive notifications of future posts

      Search by topic

      see all >