Managing Fujitsu Enterprise Postgres partitions with pg

The pg_partman extension automates and manages table partitioning in Fujitsu Enterprise Postgres so you can handle large datasets efficiently.

Automating partition management in Fujitsu Enterprise Postgres

In Fujitsu Enterprise Postgres, partitioning large tables is a common approach to managing large datasets in databases, as it natively supports table partitioning, allowing users to split large tables into smaller, more manageable pieces, called partitions.

This feature is invaluable for optimizing query performance, as it can reduce the number of rows scanned by focusing only on relevant partitions. However, managing and creating new partitions manually can be time-consuming, especially when working with high-frequency data INSERTs.

To address this issue, you can use pg_partman, a PostgreSQL extension designed to simplify and automate this process.

What is pg_partman?

pg_partman (Partition Manager) is an open-source PostgreSQL extension developed to handle the automation of table partitioning based on time intervals or serial keys. By managing the creation, maintenance, and cleanup of partitions, pg_partman reduces administrative overhead and ensures partitioned tables are optimized for performance.

Key features of pg_parman include:

Automatic creation of new partitions based on specified time intervals (e.g., daily, weekly, monthly) or integer series
Cleanup and dropping of old partitions based on a retention policy
Ability to pre-create partitions to minimize runtime overhead
Support for native PostgreSQL partitioning as well as legacy inheritance-based partitioning

Why use pg_partman?

Using pg_partman offers several advantages:

Ease of management
pg_partman simplifies the partition management process, automatically creating and dropping partitions as needed.
Improved performance
Proper partitioning can significantly boost query performance by minimizing data scanned during SELECT operations.
Reduced downtime
By pre-creating partitions, pg_partman reduces downtime caused by runtime partition creation.
Retention policies
It allows you to automatically delete older partitions, which helps manage storage costs and keep the dataset lean.

Getting started with pg_partman

Installation

To install pg_partman, you'll need to have superuser privileges. Here are the general steps:

Install the pg_partman extension package from git:
https://github.com/pgpartman/pg_partman
Unzip the source code
In the directory where you unzip the pg_partman, run
make install
The background worker must be loaded on database start by adding the library to shared_preload_libraries in postgresql.conf. This will be useful for partition maintenance operations to run automatically.
shared_preload_libraries = 'pg_partman_bgw'
You can set other settings for the background worker (BGW) in postgresql.conf.
As a minimum, you need to set dbname so maintenance can run on the specified database(s). You can add or change these settings anytime and just reload the configuration to apply them.

pg_partman_bgw.interval = 3600
pg_partman_bgw.role = fepuser
pg_partman_bgw.dbname = 'test_partition'

Configuration options for the background worker

Parameter	Description
pg_partman_bgw.interval	Number of seconds between calls to run_maintenance(). Default: 3600 (1 hour). For time-based partitions, intervals can be chosen based on the partition interval at least daily or twice daily. For serial-based partitions, you can choose the partition interval based on the data inserted on a daily basis.
pg_partman_bgw.dbname	(Required) The database(s) that run_maintenance() will run on. If multiple, use a comma-separated list. If not set, BGW will do nothing.
pg_partman_bgw.role	The role that run_maintenance() will run as. Default: postgres. You are allowed to use only a single role name.
pg_partman_bgw.analyze	(Optional) By default, whenever a new child table is created, an analyze is run on the parent table of the partition set to ensure constraint exclusion works. This analyze can be skipped by setting this to false and help increase the speed of moving large amounts of data. If you set this to false, we strongly recommend performing a manual analyze of the partition upon completion, to ensure the statistics are updated properly.

The run_maintenance() function is part of the pg_partman extension, and is used to perform maintenance tasks on partitioned tables, such as creating new partitions, dropping old partitions, and other necessary upkeep to ensure the partitioning scheme remains efficient and up-to-date.

Add the extension to your database:
test_partition=# CREATE SCHEMA partman;
CREATE SCHEMA
test_partition=# CREATE EXTENSION pg_partman SCHEMA partman;
CREATE EXTENSION
test_partition=#

Once installed, pg_partman creates a set of functions and tables in the Fujitsu Enterprise Postgres instance that facilitate partition management.

Basic usage

To demonstrate pg_partman functionality, let’s assume you’re working with a table events that records temperature data with a timestamp.

Create a partitioned table
CREATE SCHEMA data_mart;
CREATE TABLE data_mart.organization ( org_id BIGSERIAL,
org_name TEXT,
CONSTRAINT pk_organization PRIMARY KEY (org_id)
);

CREATE TABLE data_mart.events(
event_id BIGSERIAL,
operation CHAR(1),
value FLOAT(24),
parent_event_id BIGINT,
event_type VARCHAR(25),
org_id BIGSERIAL,
created_at timestamp,
CONSTRAINT pk_data_mart_event PRIMARY KEY (event_id, created_at),
CONSTRAINT ck_valid_operation CHECK (operation = 'C' OR operation = 'D'),
CONSTRAINT fk_orga_membership
FOREIGN KEY(org_id)
REFERENCES data_mart.organization (org_id),
CONSTRAINT fk_parent_event_id
FOREIGN KEY(parent_event_id, created_at)
REFERENCES data_mart.events (event_id,created_at)
) PARTITION BY RANGE (created_at);

CREATE INDEX idx_org_id ON data_mart.events(org_id);
CREATE INDEX idx_event_type ON data_mart.events(event_type);
Configure pg_partman
Configure the table for time-based partitioning. For example, if you want daily partitions:

SELECT partman.create_parent(
p_parent_table := 'data_mart.events',
p_control := 'created_at',
p_interval := '1 day',
p_premake := 10);
create_parent
---------------
t
(1 row)

Here’s a breakdown of the parameters:

p_parent_table: The table to partition.
p_control: The column to use for partitioning.
p_interval: The partition interval. Options include: YEAR,MONTH,DAY,HOUR,MINUTE,SECOND,YEAR TO MONTH,DAY TO HOUR,DAY TO MINUTE,DAY TO SECOND,HOUR TO MINUTE,HOUR TO SECOND,MINUTE TO SECOND.
p_premake: Additional partitions to stay ahead of the current partition.

Various options for partman.create_parent function

Argument	Data type	Description
p_parent_table	text	Name of the parent table for partitioning.
p_control	text	Column used to control the partitioning process.
p_type	text	Type of partitioning to be used.
p_interval	text	Interval for creating new partitions.
p_template_table	text	Template table that defines the structure and settings for new partitions.
p_premake	integer	Number of future partitions to create in advance.
p_start_partition	text	Starting point for creating partitions.
p_automatic_maintenance	text	Enables or disables automatic maintenance tasks for partitioned tables.
p_epoch	text	Epoch for time-based partitioning.
p_default_table	boolean	Default table where data that doesn’t fit into any of the existing partitions will be stored.
p_constraint_cols	text	Additional columns that should have constraints applied to in each partition.
p_jobmon	boolean	Integration with the pg_jobmon extension. When enabled, it allows pg_partman to log its maintenance jobs to pg_jobmon, providing better tracking and monitoring of these jobs.
p_date_trunc_interval	text	Interval for truncating dates when creating partitions.
p_control_not_null	boolean	Ensures the control column used for partitioning is set to NOT NULL.
p_time_encoder	text	Custom function for encoding time-based partition names.
p_time_decoder	text	Custom function for decoding partition names back into their corresponding time intervals.

Inserting data

Once the partitioned table is set up, insert records as you would with any table. pg_partman will automatically route each record to the appropriate partition:

Viewing partitions

You can check the created partitions using a standard query on the pg_partman metadata tables:

Automating partition management

After creating the partitioned table, you’ll want pg_partman to manage partitions automatically. This includes creating new partitions and dropping old ones based on a retention policy.

Schedule partition maintenance
Run the following function to manage new partition creation and old partition removal. You can automate this process with a cron job or a PostgreSQL job scheduler like pg_cron.

test_partition=# SELECT partman.run_maintenance();
run_maintenance
-----------------
(1 row)
test_partition=#

Adding a cron job

Create partitions every day at midnight

0 0 * * * psql -U fepuser -d test_partition -c "SELECT partman.run_maintenance();"
Set retention policy
If you want to keep only the last 30 days of data, you can set a retention policy:

test_partition=# UPDATE partman.part_config SET retention = '30 days' WHERE parent_table = 'data_mart.events';
UPDATE 1
test_partition=#

Run run_maintenance() after updating the retention setting to apply it immediately.

Conclusion

pg_partman is a powerful tool for automating and simplifying table partitioning. By using this extension, you can minimize administrative overhead, optimize table performance, and implement effective data retention policies. Whether working with time-based or serial key partitioning, pg_partman offers the flexibility and automation needed to scale partitioned tables for high-volume applications.

Setting up pg_partman might take some initial effort, but its benefits in terms of performance, scalability, and maintainability make it an invaluable addition to any Fujitsu Enterprise Postgres environment managing large datasets.

If you would like to learn more about pg_partman and its features, you can check their documentation page in GitHub.

Fujitsu Trunkline

< Back to blog home PostgreSQL Zone

Managing Fujitsu Enterprise Postgres partitions with pg_partman
Nishchay Kothari | November 18, 2024

Automating partition management in Fujitsu Enterprise Postgres

What is pg_partman?

Why use pg_partman?