Businesses are changing dramatically with the evolution of digital technology, requiring systems to coordinate with various types of data sources and formats to create new value. PostgreSQL allows this via foreign data wrappers, which enable access to foreign data in various formats such as relational databases, NoSQL, and geospatial.

What are foreign data wrappers?

Foreign data wrappers (or FDW) are a type of PostgreSQL extension that provides access to foreign data using SQL statements such as SELECT and UPDATE.

FDWs can be created by using the library published by PostgreSQL and can be customised according to the format of the data to be linked with. Many companies, projects, and individuals have created FDWs in this way. There are nearly 120 types of FDWs available to the public, allowing to access a wide range of data sources.

Foreign data accessible from PostgreSQL

The main categories of foreign data that can be accessed from PostgreSQL include:

  • External PostgreSQL systems
  • Relational databases such as Oracle, SQL Server, MySQL, etc.
  • NoSQL databases such as MongoDB, Redis, Cassandra, etc.
  • Files in various formats such as CSV, plain text, XML, JSON, ZIP, etc.
  • Geospatial information in PostGIS extender
  • LDAP
  • Big Data stores such as Apache Hadoop, Google BigQuery, Elasticsearch, etc.
  • SNS such as Facebook and Twitter, etc.
  • Specific web wrappers for Google, DynamoDB, and Telegram, among others

The access point for foreign data is called a data source.

Foreign data wrapper mechanism and usage

In addition to referencing foreign data, some FDWs benefit from a pushdown mechanism to run SQL to update, filter, or sort data in the data source. This section gives an overview of the mechanism and usage of FDW.

Foreign data wrapper mechanism

This section describes how FDWs connect to foreign data sources, and its mechanism in PostgreSQL.

To access a foreign data source, install the corresponding FDW as a PostgreSQL extension and define the foreign data as a foreign table. Foreign data will then be accessible through the FDW via SQL statements.

Access to foreign data source
FDWs allow access to various foreign data sources

Process in PostgreSQL

When a client issues an SQL statement that includes a foreign table, the process is run within PostgreSQL according to the following workflow.

FDW execution in PostgreSQL
  1. Client runs the SQL statement specifying a foreign table in the FROM clause
  2. The SQL statement is parsed and rewritten if necessary
  3. The planner (optimiser) requests the FDW for an optimal execution plan
  4. The executor sends a request to the FDW
  5. The FDW obtains data from the foreign data source
  6. The FDW converts the result to PostgreSQL internal format and returns to the executor
  7. The result is returned to the client

Using foreign data wrappers

Before using a FDW, you will need to create related objects - this involves the following:

  • installing the FDW corresponding to the data source
  • defining the remote server to connect to
  • defining the user mapping of users to the foreign server
  • creating a foreign table

The following is an example of using the FDW postgres_fdw to link with an external PostgreSQL system. For postgres_fdw, you will need to create the objects in the order shown below on the local side.

Linking to an external PostgreSQL system
Linking to remote table remote_tbl for user user01
Object type Description
EXTENSION Create the FDW module as a PostgreSQL extension.
SERVER Specify the connection information to the remote server and the FDW to use.
USER MAPPING Associate the remote server with a local user who has privileges to the remote table.
Some FDWs allow a remote user to be specified – this allows you to map local users to remote users if necessary.
FOREIGN TABLE Create the foreign table on the PostgreSQL side adhering to the same column definition as the remote table.

The items and formats that can be specified in OPTIONS of CREATE SERVER, CREATE USER MAPPING, and CREATE FOREIGN TABLE vary depending on the FDW used.

Note

In addition to CREATE FOREIGN TABLE, you can also use IMPORT FOREIGN SCHEMA to create foreign tables using the same column definition as the remote table.

This is useful if you want to create multiple foreign tables in the same schema, as foreign tables will be created locally on a per-schema basis according to the remote side.

Our article "How to link to Oracle databases using oracle_fdw" expands on this.

Using pushdown

FDW has a mechanism called pushdown, which allows the remote execution of clauses in client SQL statements, such as WHERE and ORDER BY. For example, pushdown of WHERE has the effect of reducing communication bottlenecks, because the target data is filtered on the remote side and thus reduces the amount of data transferred between local and remote systems. In a process where 5 out of 1,000 rows of foreign data are extracted by specifying the WHERE clause, the data transfer volume is reduced by 99.5%.

Using FDW without pushdown
Using FDW with pushdown

Clauses available for pushdown differ depending on the FDW. Also, some combination of clauses may not be eligible for pushdown. Be sure to verify the specifications by referring to the documentation of the FDW being used.

Key points for using FDW

The points below refer mainly to postgres_fdw, and may differ for other FDWs – if you plan to use a different FDW, please check its documentation accordingly.

Controlling update transactions

These are the points to keep in mind when using update transactions on foreign tables.

  • Timing of COMMIT on the remote side

    Updates made to the foreign table on the local side are reflected on the remote side when a COMMIT is issued on the local side. If the update is successful and the COMMIT is completed on the remote side, it is also committed on the local side. The same is true for ROLLBACK.

  • Transaction isolation level

    Even if the SQL statement issued to the foreign table is a single SQL statement, multiple SQL statements may be issued on the remote side via FDW. In this case, the remote side operates with REPEATABLE READ or SERIALIZABLE to maintain transaction consistency. If the transaction isolation levels do not match between the sides, transaction execution results may differ.

Refreshing statistics

Up-to-date statistics are important for optimal execution planning, and for this reason autovacuum runs ANALYZE on regular tables to refresh the statistics. But for foreign tables, ANALYZE must be run separately. Note that there are FDWs that can refresh the statistics when accessing the foreign table.

Data type conversion when accessing the remote system

FDWs automatically convert data when transferring between the foreign table and its remote counterpart, but in some cases the conversion may not take place correctly if the remote data source is other than PostgreSQL.

For example, in VARCHAR(n), Oracle treats 'n' as the number of bytes, while PostgreSQL treats it as the number of characters. This may cause an error if the capacity is exceeded when data is converted.

Using FDW with partitioning

By combining FDW with table partitioning, foreign data distributed and stored on multiple servers can be aggregated into a partitioned table via foreign tables (partitions). For example, sales information in branches can be aggregated to a partition table on a server at the head office, where data can be efficiently referenced and analysed.

Using FDW with partitioned tables

Use scenario: Providing new services using IoT data

By giving access to data in such a variety of formats, new services can be extended using existing systems. For example, new services can be created by combining user information in existing core databases with IoT data such as location information and sensor information.

img-pi-dgm-fdw-ove-use-scenario-providing-new-services

The URLS below lists FDWs available for PostgreSQL:

More PostgreSQL Insider articles

This is the first part on our discussion of how to access Oracle databases using PostgreSQL's foreign data wrapper oracle_fdw.
Read more
This article concludes our discussion of the use of PostgreSQL's foreign data wrapper oracle_fdw, and how to use it more effectively.
Read more
Explaining partitioning, the method for storing data in divided tables, based on declarative partitioning.
Read more

FUJITSU Enterprise Postgres is designed for developers, data architects, and DBAs seeking the efficiency of a leading open source based database technology with the enhanced security and performance features plus support required in a serious business environment.

Brochure

See how FUJITSU Enterprise Postgres enhances PostgreSQL providing 5 key enterprise capabilities for your mission-critical applications.

1st  page brochure FUJITSU Enterprise Postgeres - Fujitsu's enhanced open source PostgreSQL
VIEW BROCHURE

Read our latest blogs

Read our most recent articles regarding all aspects of PostgreSQL and FUJITSU Enterprise Postgres.