<img height="1" width="1" style="display:none;" alt="" src="https://px.ads.linkedin.com/collect/?pid=2826169&amp;fmt=gif">
Start  trial

    Start trial

      The pg_partman extension automates and manages table partitioning in Fujitsu Enterprise Postgres so you can handle large datasets efficiently.

      Automating partition management in Fujitsu Enterprise Postgres

      In Fujitsu Enterprise Postgres, partitioning large tables is a common approach to managing large datasets in databases, as it natively supports table partitioning, allowing users to split large tables into smaller, more manageable pieces, called partitions.

      This feature is invaluable for optimizing query performance, as it can reduce the number of rows scanned by focusing only on relevant partitions. However, managing and creating new partitions manually can be time-consuming, especially when working with high-frequency data INSERTs.

      To address this issue, you can use pg_partman, a PostgreSQL extension designed to simplify and automate this process.

      What is pg_partman?

      pg_partman (Partition Manager) is an open-source PostgreSQL extension developed to handle the automation of table partitioning based on time intervals or serial keys. By managing the creation, maintenance, and cleanup of partitions, pg_partman reduces administrative overhead and ensures partitioned tables are optimized for performance.

      Key features of pg_parman include:

      • Automatic creation of new partitions based on specified time intervals (e.g., daily, weekly, monthly) or integer series
      • Cleanup and dropping of old partitions based on a retention policy
      • Ability to pre-create partitions to minimize runtime overhead
      • Support for native PostgreSQL partitioning as well as legacy inheritance-based partitioning

      Why use pg_partman?

      Using pg_partman offers several advantages:

      • Ease of management

        pg_partman simplifies the partition management process, automatically creating and dropping partitions as needed.

      • Improved performance

        Proper partitioning can significantly boost query performance by minimizing data scanned during SELECT operations.

      • Reduced downtime

        By pre-creating partitions, pg_partman reduces downtime caused by runtime partition creation.

      • Retention policies

        It allows you to automatically delete older partitions, which helps manage storage costs and keep the dataset lean.

      Getting started with pg_partman

      Installation

      To install pg_partman, you'll need to have superuser privileges. Here are the general steps:

      1. Install the pg_partman extension package from git:

        https://github.com/pgpartman/pg_partman

      2. Unzip the source code
      3. In the directory where you unzip the pg_partman, run

        make install

      4. The background worker must be loaded on database start by adding the library to shared_preload_libraries in postgresql.conf. This will be useful for partition maintenance operations to run automatically.

        shared_preload_libraries = 'pg_partman_bgw'

      5. You can set other settings for the background worker (BGW) in postgresql.conf.

        As a minimum, you need to set dbname so maintenance can run on the specified database(s). You can add or change these settings anytime and just reload the configuration to apply them.

        pg_partman_bgw.interval = 3600
        pg_partman_bgw.role = fepuser
        pg_partman_bgw.dbname = 'test_partition'

        Configuration options for the background worker

      Parameter Description
      pg_partman_bgw.interval Number of seconds between calls to run_maintenance(). Default: 3600 (1 hour).
      • For time-based partitions, intervals can be chosen based on the partition interval at least daily or twice daily.
      • For serial-based partitions, you can choose the partition interval based on the data inserted on a daily basis.
      pg_partman_bgw.dbname (Required) The database(s) that run_maintenance() will run on.
      If multiple, use a comma-separated list. If not set, BGW will do nothing.

      pg_partman_bgw.role

      The role that run_maintenance() will run as. Default: postgres.
      You are allowed to use only a single role name.

      pg_partman_bgw.analyze

      (Optional) By default, whenever a new child table is created, an analyze is run on the parent table of the partition set to ensure constraint exclusion works. This analyze can be skipped by setting this to false and help increase the speed of moving large amounts of data.
      If you set this to false, we strongly recommend performing a manual analyze of the partition upon completion, to ensure the statistics are updated properly.

      The run_maintenance() function is part of the pg_partman extension, and is used to perform maintenance tasks on partitioned tables, such as creating new partitions, dropping old partitions, and other necessary upkeep to ensure the partitioning scheme remains efficient and up-to-date.

      1. Add the extension to your database:

        test_partition=# CREATE SCHEMA partman;
        CREATE SCHEMA
        test_partition=# CREATE EXTENSION pg_partman SCHEMA partman;
        CREATE EXTENSION
        test_partition=#

      Once installed, pg_partman creates a set of functions and tables in the Fujitsu Enterprise Postgres instance that facilitate partition management.

      Basic usage

      To demonstrate pg_partman functionality, let’s assume you’re working with a table events that records temperature data with a timestamp.

      1. Create a partitioned table

        CREATE SCHEMA data_mart;
        CREATE TABLE data_mart.organization ( org_id BIGSERIAL,
                org_name TEXT,
                CONSTRAINT pk_organization PRIMARY KEY (org_id)  
            );

        CREATE TABLE data_mart.events(
                event_id        BIGSERIAL, 
                operation       CHAR(1), 
                value           FLOAT(24), 
                parent_event_id BIGINT, 
                event_type      VARCHAR(25), 
                org_id          BIGSERIAL, 
                created_at      timestamp, 
                CONSTRAINT pk_data_mart_event PRIMARY KEY (event_id, created_at), 
                CONSTRAINT ck_valid_operation CHECK (operation = 'C' OR operation = 'D'), 
                CONSTRAINT fk_orga_membership 
                    FOREIGN KEY(org_id) 
                    REFERENCES data_mart.organization (org_id),
                CONSTRAINT fk_parent_event_id 
                    FOREIGN KEY(parent_event_id, created_at) 
                    REFERENCES data_mart.events (event_id,created_at)
            ) PARTITION BY RANGE (created_at);

        CREATE INDEX idx_org_id ON data_mart.events(org_id);
        CREATE INDEX idx_event_type ON data_mart.events(event_type);

      2. Configure pg_partman

        Configure the table for time-based partitioning. For example, if you want daily partitions:

        SELECT partman.create_parent(
        p_parent_table := 'data_mart.events',
        p_control := 'created_at',
        p_interval := '1 day',
        p_premake := 10);
        create_parent 
        ---------------
         t
        (1 row)

      Here’s a breakdown of the parameters:

      • p_parent_table: The table to partition.
      • p_control: The column to use for partitioning.
      • p_interval: The partition interval. Options include: YEAR,MONTH,DAY,HOUR,MINUTE,SECOND,YEAR TO MONTH,DAY TO HOUR,DAY TO MINUTE,DAY TO SECOND,HOUR TO MINUTE,HOUR TO SECOND,MINUTE TO SECOND.
      • p_premake: Additional partitions to stay ahead of the current partition.

      Various options for partman.create_parent function

      Argument Data type Description

      p_parent_table

      text

      Name of the parent table for partitioning.

      p_control

      text

      Column used to control the partitioning process.

      p_type

      text

      Type of partitioning to be used.

      p_interval

      text

      Interval for creating new partitions. 

      p_template_table

      text

      Template table that defines the structure and settings for new partitions.

      p_premake

      integer

      Number of future partitions to create in advance.

      p_start_partition

      text

      Starting point for creating partitions. 

      p_automatic_maintenance

      text

      Enables or disables automatic maintenance tasks for partitioned tables.

      p_epoch

      text

      Epoch for time-based partitioning.

      p_default_table

      boolean

      Default table where data that doesn’t fit into any of the existing partitions will be stored.

      p_constraint_cols

      text

      Additional columns that should have constraints applied to in each partition.

      p_jobmon

      boolean

      Integration with the pg_jobmon extension. When enabled, it allows pg_partman to log its maintenance jobs to pg_jobmon, providing better tracking and monitoring of these jobs.

      p_date_trunc_interval

      text

      Interval for truncating dates when creating partitions. 

      p_control_not_null

      boolean

      Ensures the control column used for partitioning is set to NOT NULL.

      p_time_encoder

      text

      Custom function for encoding time-based partition names.

      p_time_decoder

      text

      Custom function for decoding partition names back into their corresponding time intervals.

      Inserting data

      Once the partitioned table is set up, insert records as you would with any table. pg_partman will automatically route each record to the appropriate partition:

      Viewing partitions

      You can check the created partitions using a standard query on the pg_partman metadata tables:

      test_partition=# SELECT * FROM partman.part_config WHERE parent_table = 'data_mart.events';
      -[ RECORD 1 ]--------------+----------------------------------
      parent_table               | data_mart.events
      control                    | created_at
      time_encoder               | 
      time_decoder               | 
      partition_interval         | 1 day
      partition_type             | range
      premake                    | 10
      automatic_maintenance      | on
      template_table             | partman.template_data_mart_events
      retention                  | 
      retention_schema           | 
      retention_keep_index       | t
      retention_keep_table       | t
      epoch                      | none
      constraint_cols            | 
      optimize_constraint        | 30
      infinite_time_partitions   | f
      datetime_string            | YYYYMMDD
      jobmon                     | t
      sub_partition_set_full     | f
      undo_in_progress           | f
      inherit_privileges         | f
      constraint_valid           | t
      ignore_default_data        | t
      date_trunc_interval        | 
      maintenance_order          | 
      retention_keep_publication | f
      maintenance_last_run       | 

      test_partition=#

      Automating partition management

      After creating the partitioned table, you’ll want pg_partman to manage partitions automatically. This includes creating new partitions and dropping old ones based on a retention policy.

      1. Schedule partition maintenance

        Run the following function to manage new partition creation and old partition removal. You can automate this process with a cron job or a PostgreSQL job scheduler like pg_cron.

        test_partition=# SELECT partman.run_maintenance();
        run_maintenance
        -----------------
         (1 row)
        test_partition=#

        Adding a cron job

        Create partitions every day at midnight

        0 0 * * * psql -U fepuser -d test_partition -c "SELECT partman.run_maintenance();"

      2. Set retention policy

        If you want to keep only the last 30 days of data, you can set a retention policy:

        test_partition=# UPDATE partman.part_config SET retention = '30 days' WHERE parent_table = 'data_mart.events';
        UPDATE 1
        test_partition=#

      Run run_maintenance() after updating the retention setting to apply it immediately.

      Conclusion

      pg_partman is a powerful tool for automating and simplifying table partitioning. By using this extension, you can minimize administrative overhead, optimize table performance, and implement effective data retention policies. Whether working with time-based or serial key partitioning, pg_partman offers the flexibility and automation needed to scale partitioned tables for high-volume applications.

      Setting up pg_partman might take some initial effort, but its benefits in terms of performance, scalability, and maintainability make it an invaluable addition to any Fujitsu Enterprise Postgres environment managing large datasets.

       

      If you would like to learn more about pg_partman and its features, you can check their documentation page in GitHub.

       

      Topics: PostgreSQL, Fujitsu Enterprise Postgres, Table partitioning, Database management, PostgreSQL extension, Performance optimization, pg_partman

      Receive our blog

      Search by topic

      Posts by Tag

      See all
      Fujitsu Enterprise Postgres
      The hybrid multi-cloud Postgres backed by Fujitsu
      Nishchay Kothari
      Technical Consultant, Fujitsu Enterprise Postgres Center of Excellence
      Nishchay Kothari is an outstanding technical consultant with over 13 years of expertise in relational database management systems (RDBMS). Nishchay has experience with a wide range of database technologies, including PostgreSQL, SQL Server, and Oracle.
      Nishchay has positioned himself as a go-to resource for organizations wanting to optimize their database infrastructure and architectural solutions driven by his passion for addressing complicated technological challenges.

      Receive our blog

      Fill the form to receive notifications of future posts

      Search by topic

      see all >