
postgres_fdw is one of the foreign data wrappers (FDWs) that is used to access data stored outside PostgreSQL by using regular SQL. When it connects to foreign PostgreSQL servers, it internally uses the libpq client library.
In general, when a client application uses libpq, the connection setting application_name can be specified. It can be used to identify the client so its activity can be monitored using the server log. The connection setting fallback_application_name can be used to specify the name of the default client application, which can be overwritten by the value used in application_name.
Problem prior to PostgreSQL 15
When postgres_fdw connects to a foreign PostgreSQL server, the parameter fallback_application_name is fixed as postgres_fdw. So, if users do not change anything related to it, foreign servers recognize that the client application called postgres_fdw is connecting to them. Sometimes this lead to difficulties in monitoring and analyzing the system in detail.
Let's take an example, assuming that there are two PostgreSQL servers - one (called Analyzer) is used for data analysis, and the other (called Source) stores data. Analyzer can access data stored on Source by using postgres_fdw. Moreover, to monitor the database activity, the DBA modifies the server configuration on Source and the following is output to the log:
- Connection setting application_name
- Execution time of each SQL
One day while DBA checks the server log, they find that some SQLs are very slow (see below). They check the application_name to investigate which client executed them, and they can see that application_name of all log lines is postgres_fdw.
Who sent the slow SQLs? And in what order did each client send them? We did not have a good solution to distinguish them.
Server access from Analyzer to Source, with each log entry showing the SQL execution time
Note: If some Analyzers access the same data source in the system, we can distinguish the connector by setting the application_name when DBA executes CREATE SERVER on each Analyzer. However, if one Analyzer accesses the same data source simultaneously, the destination server cannot identify the connector.
Feature summary
Based on the above, I proposed to the PostgreSQL community to add a new configuration parameter, postgres_fdw.application_name. When postgres_fdw establishes a new connection, it uses the given string as the connection setting application_name. There are two ways this parameter can be set - by modifying postgresql.conf, or by executing the SET statement.
-- Check the definition of the parameter via pg_settings system view
postgres=# SELECT name, short_desc, context, vartype from pg_settings WHERE name = 'postgres_fdw.application_name';
-[ RECORD 1 ]----------------------------------------------------------
name | postgres_fdw.application_name
short_desc | Sets the application name to be used on the remote server.
context | user
vartype | string
-- SET statement can be used to set locally to a session
postgres=# SET postgres_fdw.application_name TO 'test wrapper'
SET
postgres=# SHOW postgres_fdw.application_name;
postgres_fdw.application_name
-------------------------------
test wrapper
(1 row)
Note that a change of this parameter does not affect any existing connections until they are re-established. If you want to use the modified parameter, established connections to foreign PostgreSQL servers must be discarded via postgres_fdw_disconnect(), and then connections must be established again.
The important characteristic of this parameter is that the status information can be embedded in the application_name, similar to the log_line_prefix parameter. Some special sequences are replaced with the status information of the connector when the parameter is used as the application_name. Details of the escape character strings and their conversion is listed below.
Escape character string | Conversion |
%a | application_name on local server |
%c | Session ID on local server |
%C | Cluster name on local server |
%u | User name on local server |
%d | Database name on local server |
%p | Process ID of backend on local server |
%% | Literal % |
Effect
So, what is the advantage of this feature?
In the above example, it was difficult to track the system from the server log because many users accessed one data source from the same PostgreSQL server.
Let's solve the problem by specifying postgres_fdw.application_name using the ALTER SYSTEM statement.
-- parameter setting via ALTER SYSTEM
postgres=# ALTER SYSTEM SET postgres_fdw.application_name TO 'from %C user=%u';
ALTER SYSTEM
-- send SIGHUP signal to reload postgresql.auto.conf
postgres=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
-- Check whether the parameter is changed.
-- Note that the escape sequence is replaced when postgres_fdw connects to foreign servers,
-- so the application_name which will be really used cannot be shown
postgres=# SHOW postgres_fdw.application_name;
postgres_fdw.application_name
-------------------------------
from %C user=%u
(1 row)
Execute the statement above, and restart the system. When DBAs check the server log again, they will get the following output.
Server access from Analyzer to Source. Unlike the previous example, the log shows the result of specifying postgres_fdw.application_name
Compared with the first example, we can see that the string %C is replaced with the cluster name Analyzer, and %u is replaced with the usernames kuroda and takeda. Based on the outputs, an accurate analysis of the workloads can be done by comparing both logs on Analyzer and Source.
If postgres_fdw.application_name is changed via ALTER SYSTEM statement like above, we can use the parameter without modifying client applications. Alternatively, the application_name can be changed based on the sequence of the workload by modifying the application and using the SET statement.
Summary and beyond
In this post, I described the internals of postgres_fdw, its challenge, and a new configuration parameter to solve it.
The importance of solutions that leverage multiple PostgreSQL servers - like data sharding and scale out - will be larger and larger. This parameter will be helpful for easier monitoring, when fdw is used to deliver those capabilities.
Besides this, new features are added and proposed to postgres_fdw day to day. A proposal I am currently working on is to add a new SQL function which can be used to verify the connection status between PostgreSQL servers. I'm planning to write a post again when it is accepted.
Caution
postgres_fdw.application_name can be set to any string lengths, but when it is used as application_name, it will be truncated to NAMEDATALEN (generally 64). In addition, postgres_fdw.application_name can contain non-ASCII characters, but such characters are replaced with question marks (?) when used as application_name.
References in this post:
For further detail regarding this feature, refer to PostgreSQL documentation or commit information posted on GitHub.
- https://www.postgresql.org/docs/current/postgres-fdw.html#id-1.11.7.47.18
- https://github.com/postgres/postgres/commit/449ab6350526e99d33363706b759951ebad7928e
- https://github.com/postgres/postgres/commit/6e0cb3dec10e460288d68a128e3d79d16a230cdb
Fujitsu Enterprise Postgres
provides to harness your data.