True to tradition, this latest release of PostgreSQL comes with lots of improvements and new capabilities. So let me take you through some of them.
The evolution of PostgreSQL - versions and major milestones
Before diving into the exciting new features included in PostgreSQL 17, I would like to take a moment to look back at how PostgreSQL has evolved over the years, and how each release builds a foundation for the next one.
After being inherited from a university project, PostgreSQL gained numerous enterprise features between versions 6.0 and 9.6 that allowed it to be used for larger and more critical applications. Areas of improvement include streaming replication and synchronous replication for failover and high availability, support for the JSON and JSONB data types, index-only scans, and parallelism. All of these capabilities collectively encouraged more users to adopt PostgreSQL, and it was around this time that the community saw PostgreSQL start to become a mainstream database.
Gaining momentum, PostgreSQL continued its evolution from version 10 and has now reached the maturity to be recognized as an enterprise-class database worldwide. It is widely known that major companies such as AWS, Microsoft, and Google use PostgreSQL and also provide PostgreSQL as the database layer for their solutions and services. Declarative partitioning and logical replication were two big features in the version 10 release. SQL stored procedures in version 11 helped facilitate migration from proprietary databases such as Oracle. I believe it is fair to say that one of the key features of version 12 is the introduction of table access methods, which allow pluggable storage engines to be written for specific storage needs in PostgreSQL. Note that partitioning enhancements were implemented across these three releases, with version 11 introducing partitioning by hash key and version 12 bringing performance improvements.
From there, continuous feedback from users and passionate engineers in the community has successfully pushed PostgreSQL to higher levels of usability and performance, achieving better support for large data volumes. Version 13 was released with parallel vacuum, de-duplication in B-tree indexes, and incremental sorting. Read performance was improved in version 14 with its snapshot scalability at very high client counts, along with the ability to stream in-progress transactions to the subscriber, reducing the replication lag of large transactions. Other improvements included reduced bloat from B-tree index updates and parallel foreign scans in postgres_fdw.
In version 15, the MERGE command was added, various improvements to logical replication were applied, and statistics were moved from the statistics collector to shared memory, which forms the basis for adding more statistics in the coming years. Row filtering and column filtering were added to logical replication, and server-side compression was added for base backups.
In version 16, workload distribution between primary and standby was enabled by allowing logical replication from a physical standby. Parallel apply was added to reduce apply lag for large transactions. Support for the SQL/JSON standard was improved, and faster relation extension and connection load balancing in libpq were also developed.
All of the milestone achievements in feature enhancements led to this year’s release of PostgreSQL 17, the highlights of which are:
- Failover slots, which allow logical replication to continue after a failover
- Incremental backups for large backup use cases
- The 1GB memory limit for vacuum cleanup has been removed
- JSON_TABLE feature for better compliance with the SQL/JSON standard
PostgreSQL 17 enhancements and new features
Now let's take a look at some details of the features of version 17.
Improvements to storage
- Incremental backups
- Reduced backup data size - only the changed blocks are copied
- Combine a base backup and incremental backups to reconstruct a full backup using pg_combinebackup
- The new summarize_wal parameter needs to be enabled so that the WAL summarizer background worker collects WAL summaries
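As a minimal sketch of how this fits together: WAL summarization is enabled on the server, then incremental backups are taken with pg_basebackup's new --incremental option, and a full backup is later reconstructed with pg_combinebackup.
ALTER SYSTEM SET summarize_wal = on;   -- start the WAL summarizer
SELECT pg_reload_conf();               -- pick up the change without a restart
SHOW summarize_wal;                    -- should now report "on"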
- Vacuum improvements
- Faster index cleanup and reduced memory usage
This is achieved by changing the mechanism that removes dead tuples during vacuum: a TID store is now used to collect dead tuples from the heap, and these are then used to clean up the indexes
- Reduced WAL sync and write time
The freezing and pruning steps in the vacuum process have been combined to produce a single WAL record
- Reduced WAL volume
Vacuuming of relations with no indexes is optimized by reducing the steps and marking dead line pointers LP_UNUSED directly
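Related to the first item above (and to the removal of the 1GB limit mentioned in the highlights), here is a minimal sketch, assuming a hypothetical table big_table: vacuum's dead-tuple storage can now make use of a maintenance_work_mem setting above 1GB.
SET maintenance_work_mem = '2GB';   -- in PG 17 dead-tuple storage is no longer capped at 1GB
VACUUM (VERBOSE) big_table;         -- big_table is a hypothetical table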
- Faster reads by using streaming APIs
- Improved performance of subsystems on top of SLRU
SLRU cache sizes are now configurable, and the caches are divided into units named banks, so that the impact of buffer-eviction searches and LWLock contention is limited to a specific bank
- Allow table AMs to skip fetching a block from the heap
- Improved execution time of queries that use the IN/ANY clause with a B-tree index by optimizing array matches in the B-tree index
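A minimal sketch (the orders table is hypothetical); in version 17 such an array of values can typically be satisfied within a single B-tree index scan rather than repeated index descents:
CREATE TABLE orders (order_id bigint, customer_id int);
CREATE INDEX ON orders (customer_id);
EXPLAIN SELECT * FROM orders WHERE customer_id IN (11, 27, 42);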
- Improved performance of heavily contended WAL writes (client count of 256 and above)
Improvements to logical replication
- Failover for logical replication
Even if the primary/publisher node goes down, the subscription can continue by getting data from the physical standby without human intervention
CREATE SUBSCRIPTION sub CONNECTION '$connstr'
PUBLICATION pub WITH (failover = 'true');
- Upgrade of logical replication nodes
Logical replication setups will be retained after major version upgrades where the old version is 17 or later
- Faster logical decoding in cases with many subtransactions
- Allow a hash index to be used for lookups during apply when a primary key or replica identity is not available on the subscriber
- Easier replication setup for subscriptions
The new tool pg_createsubscriber can be used to create a logical replication subscriber from any physical standby, avoiding the initial copy of table data. My colleague Hayato Kuroda has published a blog explaining this feature here.
Regarding the future direction of logical replication to make it enterprise-ready, it is my understanding that there are two main elements to work on: a concrete conflict detection and resolution mechanism, and cluster-wide sequences. Other features to work on may also include DDL replication and node management APIs. This is an exciting space, and I hope that in future releases we will move in this direction.
Improvements to SQL
- Partitioning
- Support identity columns in partitioned tables
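A short sketch (table names are hypothetical): the identity column is defined once on the partitioned parent, and values are generated automatically across all partitions:
CREATE TABLE events (
    id bigint GENERATED ALWAYS AS IDENTITY,
    ts timestamptz NOT NULL
) PARTITION BY RANGE (ts);
CREATE TABLE events_2024 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
INSERT INTO events (ts) VALUES ('2024-06-01');  -- id is generated automatically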
- Allow exclusion constraints on partitioned tables
As long as the exclusion constraints compare partition key columns for equality, other columns can use exclusion constraint-specific comparisons
CREATE TABLE idxpart (a int4range, b int4range, c int4range,
EXCLUDE USING GIST (b with =, c with &&)) PARTITION BY RANGE (a);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: EXCLUDE constraint on table "idxpart" lacks column "a" which is part of the partition key.
CREATE TABLE idxpart (a int4range, b int4range, c int4range,
EXCLUDE USING GIST (a with =, b with =, c with &&)) PARTITION BY RANGE (a, b);
- Use multiple workers to build BRIN indexes
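A quick sketch (the measurements table is hypothetical); the number of workers used for the build is subject to max_parallel_maintenance_workers and the table size:
CREATE TABLE measurements (created_at timestamptz, value numeric);
SET max_parallel_maintenance_workers = 4;   -- allow parallel workers for the index build
CREATE INDEX measurements_brin ON measurements USING brin (created_at);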
- Queries that generate initPlans can use parallel workers to execute initPlan
EXPLAIN (COSTS OFF) SELECT c1 FROM t1 WHERE c1 = (SELECT 1);
QUERY PLAN
------------------------------------------
Gather
Workers Planned: 2
InitPlan 1
-> Result
-> Parallel Seq Scan on t1
Filter: (c1 = (InitPlan 1).col1)
- Eliminated IS NOT NULL query restrictions on NOT NULL columns
CREATE TABLE pred_tab (a int NOT NULL, b int, c int NOT NULL);
EXPLAIN (COSTS OFF) SELECT * FROM pred_tab t WHERE t.a IS NOT NULL;
QUERY PLAN
------------------------
Seq Scan on pred_tab t
- Eliminated scans on NOT NULL columns if IS NULL is specified
EXPLAIN (COSTS OFF) SELECT * FROM pred_tab t WHERE t.a IS NULL;
QUERY PLAN
--------------------------
Result
One-Time Filter: false
- Allow correlated IN subqueries to be transformed into joins
EXPLAIN (costs off) SELECT * from tenk1 A WHERE hundred in
(select hundred from tenk2 B where B.odd = A.odd);
QUERY PLAN
------------------------------------------------------------
Hash Join
Hash Cond: ((a.odd = b.odd) AND (a.hundred = b.hundred))
-> Seq Scan on tenk1 a
-> Hash
-> HashAggregate
Group Key: b.odd, b.hundred
-> Seq Scan on tenk2 b
- Improved CTE plans by considering the statistics and sort order of the CTE's output columns
- Allow pushdown of EXISTS and IN subqueries to the postgres_fdw foreign server
EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.c1 FROM ft1 t1 WHERE EXISTS (SELECT 1 FROM ft2 t2 WHERE t1.c1 = t2.c1)
ORDER BY t1.c1 OFFSET 100 LIMIT 10;
Foreign Scan
Output: t1.c1
Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
Remote SQL: SELECT r1."C 1" FROM "S 1"."T 1" r1 WHERE EXISTS (SELECT NULL FROM "S 1"."T 1" r2
WHERE ((r2."C 1" = r1."C 1"))) ORDER BY r1."C 1" ASC NULLS LAST LIMIT 10::bigint OFFSET 100::bigint
- Allow joins with non-join qualifications to be pushed down to foreign servers and custom scans
- MERGE command feature
- MERGE command now supports RETURNING clause
MERGE INTO products p USING stock s ON p.product_id = s.product_id
WHEN MATCHED AND s.quantity > 0 THEN UPDATE SET in_stock = true, quantity = s.quantity
WHEN NOT MATCHED THEN INSERT (product_id, in_stock, quantity) VALUES (s.product_id, true, s.quantity)
RETURNING merge_action(), p.*;
merge_action | product_id | in_stock | quantity
--------------+------------+----------+----------
UPDATE | 1001 | t | 50
INSERT | 1003 | t | 10
- MERGE command supports WHEN NOT MATCHED BY SOURCE
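For example, reusing the products and stock tables from the previous example (a sketch), target rows that no longer have a matching source row can now be handled explicitly:
MERGE INTO products p USING stock s ON p.product_id = s.product_id
WHEN MATCHED THEN UPDATE SET in_stock = true, quantity = s.quantity
WHEN NOT MATCHED BY SOURCE THEN            -- product no longer present in stock
    UPDATE SET in_stock = false, quantity = 0;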
- MERGE command can modify updatable views
- Introduced an event trigger for the login event, allowing actions to be fired right when a user connects
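A minimal sketch (the function and trigger names are made up for illustration):
CREATE FUNCTION log_login() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'user % connected to database %', session_user, current_database();
END;
$$;
CREATE EVENT TRIGGER on_login ON login EXECUTE FUNCTION log_login();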
- Sped up the serial portion of parallel aggregates, allowing the following to scale better in parallel queries: sum(numeric), avg(numeric), var_pop(numeric), variance(numeric), stddev_pop(numeric), stddev_samp(numeric), stddev(numeric), array_agg(anyarray), string_agg(text), string_agg(bytea)
- Introduced a built-in collation provider
Improvements to security/SQL
- Avoid the need to grant superuser privileges by using the pg_maintain role for VACUUM, ANALYZE, CLUSTER, REFRESH MATERIALIZED VIEW, REINDEX, and LOCK TABLE; alternatively, MAINTAIN privilege can be granted to users on a table
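A quick sketch (role and table names are hypothetical):
GRANT pg_maintain TO maintenance_user;           -- maintenance operations on all relations
GRANT MAINTAIN ON TABLE orders TO report_user;   -- maintenance on a single table only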
- Allow TLS connections to be established without an extra network round trip for negotiation
- ALTER SYSTEM improvements
- Allow ALTER SYSTEM to set unrecognized custom server variables
- Add system variable allow_alter_system to disallow ALTER SYSTEM
- Useful in environments where configuration is managed by external tools
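A short sketch (the custom variable name is made up): version 17 accepts custom variables it does not recognize, while setting allow_alter_system = off in postgresql.conf makes any ALTER SYSTEM command fail:
ALTER SYSTEM SET my_extension.trace_level = 'debug';  -- unrecognized custom variable, now accepted
-- with allow_alter_system = off in postgresql.conf, the following would be rejected
ALTER SYSTEM SET work_mem = '64MB';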
Improvements to SQL/JSON
- Introduced the JSON_TABLE() function to convert JSON data into a table representation
CREATE TABLE my_films ( js jsonb );
INSERT INTO my_films VALUES (
'{ "favorites" : [
{ "kind" : "horror", "films" : [
{ "title" : "Psycho",
"director" : "Alfred Hitchcock" } ] }
] }');
SELECT jt.* FROM my_films,
JSON_TABLE (js, '$.favorites[*]'
COLUMNS (id FOR ORDINALITY,
kind text PATH '$.kind',
title text PATH '$.films[*].title',
director text PATH '$.films[*].director')) AS jt;
id | kind | title | director
----+--------+--------+------------------
1 | horror | Psycho | Alfred Hitchcock
- SQL/JSON constructor functions
- JSON() - Converts a given expression specified as text or bytea string (in UTF8 encoding) into a JSON value
JSON('{"a":123, "b":[true,"foo"], "a":"bar"}')
{"a":123, "b":[true,"foo"], "a":"bar"}
- JSON_SCALAR() - Converts a given SQL scalar value into a JSON scalar value
JSON_SCALAR(123.45)
123.45
- JSON_SERIALIZE() - Converts an SQL/JSON expression into a character or binary string
JSON_SERIALIZE('{ "a" : 1 }' RETURNING bytea)
\x7b20226122203a2031207d20
- SQL/JSON query functions
- JSON_EXISTS() - Returns true if the SQL/JSON path_expression applied to the JSON value yields any items
SELECT JSON_EXISTS(jsonb '{"key1": [1,2,3]}', '$.key1[2]');
------------------------
t
- JSON_QUERY() - Returns the result (JSON, array, or string) of applying the SQL/JSON path_expression to the JSON value
SELECT JSON_QUERY(jsonb '{"a": "[1, 2]"}', '$.a');
------------------------
[1, 2]
- JSON_VALUE() - Returns the result (SQL/JSON scalar) of applying the SQL/JSON path_expression to the JSON value
SELECT JSON_VALUE(jsonb '[1,2]', '$[1]');
------------------------
2
Improvements to monitoring
- New view pg_wait_events
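The view documents every wait event and its description; here is a sketch of a query joining it with pg_stat_activity, which could produce a record like the one below:
SELECT a.pid, a.state, a.wait_event_type, a.wait_event, w.description
FROM pg_stat_activity a
JOIN pg_wait_events w
  ON w.type = a.wait_event_type AND w.name = a.wait_event;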
-[ RECORD 1 ]---+---------------------------------------------
pid             | 21090
state           |
wait_event_type | Activity
wait_event      | CheckpointerMain
description     | Waiting in main loop of checkpointer process
- All checkpointer-related stats can now be found in pg_stat_checkpointer
- Index vacuum progress is now reported in pg_stat_progress_vacuum
- Removed the parameter old_snapshot_threshold
Full list
For PostgreSQL enthusiasts and readers interested in the full list of new and enhanced features and other changes, you can view it here.
What's next: PostgreSQL 18 and beyond
Last but not least, I would like to close off this article by sharing some of the active discussions in the community. Please be reminded that none of these features are guaranteed to be included in the next or future releases.
- Transparent column encryption - automatic, transparent encryption and decryption of particular columns in the client
- Asynchronous I/O - index prefetching to improve index scan performance, data prefetching to improve overall system performance, and vectored I/O for bulk writes
- Import/export of statistics, to help run queries after an upgrade without first running ANALYZE
- Enhance table AM APIs to suit different storage engines
- amcheck support for GiST and GIN indexes
- Various improvements in Logical Replication - DDL Replication, replication of sequences, conflict detection and resolution, node management APIs, slot invalidation for unused slots, among others
- Executor improvements - special-case executor expression steps for common combinations (making JIT-generated code simpler), JIT compilation per plan node, and SQL-standard Row Pattern Recognition (RPR)
- Improvements in partitioning technology, especially in pruning when a large number of partitions is present
- Improvements in indexing, especially in nbtree
- Optimizer improvements to make various kinds of queries work better
- Introduce compression at the wire protocol level
- Parallelism - parallelize vacuum on tables, parallel CREATE INDEX for GIN indexes, parallelize correlated subqueries, TID range scans
- 64-bit XIDs, which avoid freezing and reduce the need for autovacuum
- WAL size reduction with smaller WAL record headers
- TOAST improvements such as custom formats and compression dictionaries
- Stats - split index and table statistics into different types of stats, more stats
- CI and build system improvements
Final thoughts
As we look ahead to the future of PostgreSQL, it’s clear that version 17 marks a significant milestone in its evolution, with its enhancements in performance, scalability, and security. The community continues to adapt PostgreSQL to the ever-changing demands of modern data management, and the commitment to innovation and excellence ensures that PostgreSQL will remain a cornerstone for developers and enterprises alike.
Whether you’re a long-time user or new to PostgreSQL, the release of version 17 offers exciting opportunities to leverage its capabilities for your projects.
As we embrace these new features, we also look forward to the continued growth and development of PostgreSQL, driven by a passionate and dedicated community. I very much look forward to continuing my participation and contributions to the PostgreSQL community.