The statistics collector is a critical module that collects and reports various statistics about database performance and usage. In this post, I'll go over the changes made in PostgreSQL 15 and how they can help database administrators.
Up to PostgreSQL 14, the statistics collector received statistics updates via a UDP socket and shared the accumulated statistics by writing them out to temporary files at regular intervals. These files can be quite large and are written out frequently, often only fractions of a second apart, which creates a bottleneck when adding further useful statistics.
What is the Stats Collector doing in PostgreSQL?
The statistics collector is a background process that collects information about activity on the local server, like table/index access and vacuum/analyze activity. It also tracks the total number of rows in each table, and it can count calls to user-defined functions and the total time spent in each one. The information gathered by the statistics collector is made available via the various pg_stat_* and pg_statio_* views.
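As a quick illustration (pgbench_accounts is only an example table name, assumed to exist), a few of these per-table counters can be read directly from pg_stat_user_tables:

-- Read some of the per-table counters maintained by the statistics subsystem;
-- pgbench_accounts is just an example table name.
SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_upd, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'pgbench_accounts';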
Below are the parameters relevant for the statistics collector (a quick way to inspect them follows the list):
- track_activities
- track_activity_query_size
- track_counts
- track_functions
- track_io_timing
- track_wal_io_timing
- stats_temp_directory
- stats_fetch_consistency (from PostgreSQL 15 onwards)
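As a minimal sketch, the current values of these parameters can be inspected through pg_settings; depending on the version, either stats_temp_directory (up to PostgreSQL 14) or stats_fetch_consistency (PostgreSQL 15 onwards) will be missing from the result:

-- List the statistics-related parameters and their current values.
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('track_activities', 'track_activity_query_size', 'track_counts',
               'track_functions', 'track_io_timing', 'track_wal_io_timing',
               'stats_temp_directory', 'stats_fetch_consistency')
ORDER BY name;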
If we compare the default background processes of PostgreSQL 14 and PostgreSQL 15, we can easily see that the stats collector process is no longer present in PostgreSQL 15.
On PostgreSQL-15
[root@localhost opt]# ps -ef | grep postgres
postgres 4942 1 0 08:43 ? 00:00:00 /usr/pgsql-15/bin/postgres -D /opt/PGSQL15
postgres 4943 4942 0 08:43 ? 00:00:00 postgres: logger
postgres 4944 4942 0 08:43 ? 00:00:00 postgres: checkpointer
postgres 4945 4942 0 08:43 ? 00:00:00 postgres: background writer
postgres 4947 4942 0 08:43 ? 00:00:00 postgres: walwriter
postgres 4948 4942 0 08:43 ? 00:00:00 postgres: autovacuum launcher
postgres 4949 4942 0 08:43 ? 00:00:00 postgres: logical replication launcher
root 14523 13927 0 21:59 pts/2 00:00:00 grep --color=auto postgres
[root@localhost opt]#
On PostgreSQL-14
[root@localhost opt]# ps -ef | grep postgres
postgres 14641 1 0 22:01 ? 00:00:00 /usr/pgsql-14/bin/postgres -D /opt/PGSQL14
postgres 14642 14641 0 22:01 ? 00:00:00 postgres: logger
postgres 14644 14641 0 22:01 ? 00:00:00 postgres: checkpointer
postgres 14645 14641 0 22:01 ? 00:00:00 postgres: background writer
postgres 14646 14641 0 22:01 ? 00:00:00 postgres: walwriter
postgres 14647 14641 0 22:01 ? 00:00:00 postgres: autovacuum launcher
postgres 14648 14641 0 22:01 ? 00:00:00 postgres: stats collector
postgres 14649 14641 0 22:01 ? 00:00:00 postgres: logical replication launcher
root 14678 13927 0 22:01 pts/2 00:00:00 grep --color=auto postgres
[root@localhost opt]#
Challenges for statistics collection up to PostgreSQL 14
As we all know, PostgreSQL uses a simple "process per user" client/server model, so each session gets its own backend process. Now imagine every one of those backends collecting statistics and having to transmit them: each backend must report the activity it has performed to a single stats collector process, which is not an easy task. As I mentioned above, PostgreSQL used UDP sockets for this kind of communication.
Naturally, this approach had its share of problems. I am sure many of you have faced issues like:
- LOG: using stale statistics instead of current ones because stats collector is not responding
- Autovacuum working inefficiently
- High I/O
If you set an appropriate logging level, you can see messages such as the ones below in the PostgreSQL 14 log:
2022-12-19 23:22:03.338 +08 [14641] LOG: parameter "log_min_messages" changed to "debug3"
2022-12-19 23:22:03.341 +08 [14644] DEBUG: checkpointer updated shared memory configuration values
2022-12-19 23:22:03.343 +08 [14648] DEBUG: received inquiry for database 0
2022-12-19 23:22:03.343 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/global.stat"
2022-12-19 23:22:03.344 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/db_0.stat"
2022-12-19 23:23:03.406 +08 [14648] DEBUG: received inquiry for database 0
2022-12-19 23:23:03.406 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/global.stat"
2022-12-19 23:23:03.406 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/db_0.stat"
2022-12-19 23:23:03.418 +08 [16112] DEBUG: InitPostgres
2022-12-19 23:23:03.419 +08 [16112] DEBUG: autovacuum: processing database "postgres"
2022-12-19 23:23:03.419 +08 [14648] DEBUG: received inquiry for database 14486
2022-12-19 23:23:03.419 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/global.stat"
2022-12-19 23:23:03.419 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/db_14486.stat"
2022-12-19 23:23:03.419 +08 [14648] DEBUG: writing stats file "pg_stat_tmp/db_0.stat"
The messages above show that the stats collector process writes the statistical data to temporary files over and over, which results in a significant amount of I/O because the information has to be gathered and written out frequently. Under high load, this may lead to the message "LOG: using stale statistics instead of current ones because stats collector is not responding". To mitigate this, it is commonly recommended to place pg_stat_tmp on a RAM-based file system by pointing the stats_temp_directory parameter at a path on such a file system. By default, stats_temp_directory is the pg_stat_tmp directory inside the data directory.
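As a minimal sketch (assuming a tmpfs is already mounted at /var/run/pg_stat_tmp, which is a hypothetical path, and is writable by the postgres OS user), the parameter can be changed with a simple configuration reload on PostgreSQL 14 and older:

-- Point the statistics temp files at a RAM-backed location (PostgreSQL 14 and older only).
-- /var/run/pg_stat_tmp is an assumed tmpfs mount point; adjust it for your system.
ALTER SYSTEM SET stats_temp_directory = '/var/run/pg_stat_tmp';
SELECT pg_reload_conf();  -- stats_temp_directory only needs a reload, not a restart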
Statistics collection improvement in PostgreSQL 15
Let's see what changes the community has made in PostgreSQL 15.
As previously stated, up to PostgreSQL 14 the stats collector wrote the statistics to the file system, but starting with PostgreSQL 15 the community switched to keeping them in dynamic shared memory. With this approach there is no longer any need to set stats_temp_directory (the parameter has been removed in PostgreSQL 15), although an empty pg_stat_tmp directory is still created to support extensions such as pg_stat_statements.
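You can verify this on a PostgreSQL 15 server; as a small check (pg_ls_dir requires a superuser or a role that has been granted access to it), the directory is there but normally stays empty:

-- On PostgreSQL 15 the pg_stat_tmp directory still exists but is normally empty.
SELECT * FROM pg_ls_dir('pg_stat_tmp');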
With the new PostgreSQL 15 implementation, all statistics changes are first gathered locally in each process as "pending" entries: they have been collected but not yet written into the shared stats system. Shortly after a commit, or after a timeout, they are flushed to shared memory.
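For testing purposes you do not have to wait for that flush to happen; as a small illustration, PostgreSQL 15 adds a function that asks the current backend to flush its pending statistics explicitly:

-- Request that this backend's pending statistics be flushed to shared memory.
SELECT pg_stat_force_next_flush();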
This raises the question of how the retrieval of these statistics should behave when they are read several times within the same transaction. The answer is the new stats_fetch_consistency parameter, which accepts three possible values: none, cache, and snapshot (a short demonstration follows the list).
- If the value is set to "none", the counters are re-fetched from shared memory on every access.
- When set to "cache", the statistics of an object are cached at first access and reused until the end of the transaction, unless pg_stat_clear_snapshot() is called. This is the default.
- If the value is set to "snapshot", all statistics of the current database are cached at first access and reused until the end of the transaction, unless pg_stat_clear_snapshot() is called.
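As a minimal sketch of the difference (pgbench_accounts is again just an example table name, and the concurrent activity in another session is assumed):

-- With "snapshot", repeated reads within one transaction return consistent values.
BEGIN;
SET LOCAL stats_fetch_consistency = 'snapshot';
SELECT n_tup_ins FROM pg_stat_user_tables WHERE relname = 'pgbench_accounts';
-- ... inserts happen and are flushed in another session ...
SELECT n_tup_ins FROM pg_stat_user_tables WHERE relname = 'pgbench_accounts'; -- same value, served from the snapshot
SELECT pg_stat_clear_snapshot();                                              -- discard the cached snapshot
SELECT n_tup_ins FROM pg_stat_user_tables WHERE relname = 'pgbench_accounts'; -- re-read from shared memory
COMMIT;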
What happens at restart?
Now that you've read this, the obvious question is "what will happen if PostgreSQL restarts?"
In that situation, PostgreSQL writes the statistics to a file named pgstat.stat in the pg_stat directory on disk. The checkpointer process takes care of writing this file right before shutdown, and it is read back in (and then removed) during the next startup of PostgreSQL.
PostgreSQL log entries during the Stop event
2022-12-21 10:39:54.260 +08 [6967] DEBUG: postmaster received signal 2
2022-12-21 10:39:54.260 +08 [6967] LOG: received fast shutdown request
2022-12-21 10:39:54.262 +08 [6967] LOG: aborting any active transactions
2022-12-21 10:39:54.262 +08 [6974] DEBUG: logical replication launcher shutting down
2022-12-21 10:39:54.263 +08 [6973] DEBUG: autovacuum launcher shutting down
2022-12-21 10:39:54.267 +08 [6967] LOG: background worker "logical replication launcher" (PID 6974) exited with exit code 1
2022-12-21 10:39:54.267 +08 [6969] LOG: shutting down
2022-12-21 10:39:54.270 +08 [6969] LOG: checkpoint starting: shutdown immediate
2022-12-21 10:39:54.270 +08 [6969] DEBUG: performing replication slot checkpoint
2022-12-21 10:39:54.276 +08 [6969] DEBUG: checkpoint sync: number=1 file=pg_multixact/offsets/0000 time=0.715 ms
2022-12-21 10:39:54.276 +08 [6969] DEBUG: checkpoint sync: number=2 file=pg_xact/0000 time=0.582 ms
2022-12-21 10:39:54.279 +08 [6969] DEBUG: attempting to remove WAL segments older than log file 000000000000000000000002
2022-12-21 10:39:54.279 +08 [6969] DEBUG: SlruScanDirectory invoking callback on pg_subtrans/0000
2022-12-21 10:39:54.279 +08 [6969] LOG: checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write
=0.001 s, sync=0.002 s, total=0.011 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB
2022-12-21 10:39:54.279 +08 [6969] DEBUG: writing stats file "pg_stat/pgstat.stat"
2022-12-21 10:39:54.281 +08 [6967] DEBUG: cleaning up orphaned dynamic shared memory with ID 3552075372
2022-12-21 10:39:54.281 +08 [6967] DEBUG: cleaning up dynamic shared memory control segment with ID 2802447064
2022-12-21 10:39:54.283 +08 [6967] LOG: database system is shut down
2022-12-21 10:39:54.288 +08 [6968] DEBUG: logger shutting down
PostgreSQL log entries during the Start event
2022-12-21 10:39:54.409 +08 [6994] LOG: starting PostgreSQL 15.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat
4.8.5-44), 64-bit
2022-12-21 10:39:54.410 +08 [6994] LOG: listening on IPv6 address "::1", port 5432
2022-12-21 10:39:54.410 +08 [6994] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-12-21 10:39:54.412 +08 [6994] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-12-21 10:39:54.415 +08 [6994] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-12-21 10:39:54.418 +08 [6996] DEBUG: checkpointer updated shared memory configuration values
2022-12-21 10:39:54.420 +08 [6998] LOG: database system was shut down at 2022-12-21 10:39:54 +08
2022-12-21 10:39:54.420 +08 [6998] DEBUG: checkpoint record is at 0/30E1708
2022-12-21 10:39:54.420 +08 [6998] DEBUG: redo record is at 0/30E1708; shutdown true
2022-12-21 10:39:54.420 +08 [6998] DEBUG: next transaction ID: 4667; next OID: 16414
2022-12-21 10:39:54.420 +08 [6998] DEBUG: next MultiXactId: 1; next MultiXactOffset: 0
2022-12-21 10:39:54.420 +08 [6998] DEBUG: oldest unfrozen transaction ID: 717, in database 1
2022-12-21 10:39:54.420 +08 [6998] DEBUG: oldest MultiXactId: 1, in database 1
2022-12-21 10:39:54.420 +08 [6998] DEBUG: commit timestamp Xid oldest/newest: 0/0
2022-12-21 10:39:54.420 +08 [6998] DEBUG: transaction ID wrap limit is 2147484364, limited by database with OID 1
2022-12-21 10:39:54.420 +08 [6998] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2022-12-21 10:39:54.420 +08 [6998] DEBUG: starting up replication slots
2022-12-21 10:39:54.420 +08 [6998] DEBUG: starting up replication origin progress state
2022-12-21 10:39:54.421 +08 [6998] DEBUG: reading stats file "pg_stat/pgstat.stat"
2022-12-21 10:39:54.421 +08 [6998] DEBUG: removing permanent stats file "pg_stat/pgstat.stat"
2022-12-21 10:39:54.422 +08 [6998] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
2022-12-21 10:39:54.422 +08 [6998] DEBUG: MultiXact member stop limit is now 4294914944 based on MultiXact 1
2022-12-21 10:39:54.425 +08 [6994] DEBUG: starting background worker process "logical replication launcher"
2022-12-21 10:39:54.425 +08 [6994] LOG: database system is ready to accept connections
2022-12-21 10:39:54.426 +08 [7001] DEBUG: logical replication launcher started
2022-12-21 10:39:54.428 +08 [7000] DEBUG: autovacuum launcher started
Before you go
Finally, these new features improve performance by lowering the overhead of collecting statistics. These enhancements help to ensure that PostgreSQL will continue to scale and perform well as database sizes and workloads grow.