<img height="1" width="1" style="display:none;" alt="" src="https://px.ads.linkedin.com/collect/?pid=2826169&amp;fmt=gif">
Start  trial

    Start trial

      Learn how to effectively benchmark PostgreSQL using the DBT3 toolkit, simplifying TPC-H benchmarks for complex queries and large datasets.

      The importance of database performance benchmarking

      Database performance benchmarking is essential for understanding the capabilities and limitations of a database system, especially when it comes to complex decision support queries.

      TPC-H is a well-known benchmark that simulates decision-support workloads in a business environment. Running TPC-H on PostgreSQL can help database administrators and developers evaluate how the database handles analytical queries, large datasets, and computationally intensive operations.

      To facilitate TPC-H benchmarking on PostgreSQL, the DBT3 toolkit provides a user-friendly and automated approach. While the official TPC-H toolkit from the Transaction Processing Performance Council (TPC) is authoritative, DBT3 is widely used for practical benchmarking purposes.

      In this article, we’ll explore how to use DBT3 to perform TPC-H benchmarking on PostgreSQL, why it’s beneficial, and why DBT3 is often preferred over the official TPC-H toolkit.

      Why benchmark PostgreSQL with TPC-H?

      TPC-H benchmarks are designed to evaluate performance for:

      • Complex queries such as analytical queries involving large datasets, multiple joins, and aggregate computations
      • Scenarios where users need insights derived from large volumes of transactional data
      • Performance under increasing data volumes (scale factors) from GB to TB levels
      • Fine-tuning hardware, indexing strategies, and query planners

      When using TPC-H benchmarking on PostgreSQL, we can:

      • Assess query planner efficiency
      • Highlight potential bottlenecks in query execution
      • Inform optimizations for analytical workloads

      Why use DBT3 instead of the official TPC-H toolkit?

      The TPC-H toolkit provided by TPC includes the dbgen and qgen utilities to generate data and queries, but it comes with several challenges.

      First, setting up TPC-H involves numerous manual steps, including data generation, schema creation, query execution, and result collection, making the process time-intensive and error-prone. Second, the TPC-H specification mandates strict adherence to rules and auditing for results to be valid. This rigidity makes it cumbersome for casual benchmarking, where strict compliance is not a priority. Finally, publishing official TPC-H results requires comprehensive auditing and certification, which is resource-intensive and expensive.

      DBT3 addresses these challenges effectively. It incorporates parts of the official TPC-H toolkit, specifically dbgen and qgen, to generate compliant data and queries. This ensures that the generated data and queries adhere to TPC-H standards while streamlining the execution process. DBT3 automates schema creation, data loading, query execution, and result collection, significantly reducing the manual effort involved. It simplifies the overall process through scripts and workflows, making it accessible to non-experts.

      Additionally, DBT3 offers flexibility, allowing modifications and experimentation without being bound to strict compliance. As an open source tool, DBT3 is free to use, making it an ideal choice for organizations focusing on internal benchmarking. By combining the rigor of TPC-H’s data and query generation tools with user-friendly automation, DBT3 provides a practical solution for benchmarking without the overhead of full TPC compliance.

      The TPC-H (version3.0.1) can be found at the TPC website — here

      Refer to the specification to obtain details about each of the 21 queries that make up the benchmark.

      How to use DBT3 to perform benchmarking

      The following steps have been tested on Ubuntu Linux with PostgreSQL 16.

      1. Install pre-requisites for DBT3
        The following are required and can be installed using APT.
        • C compiler
        • CMake – Cross-platform build system for managing the build process
        • Make – Used to execute build instructions generated by CMake
        • Patch – Used to modify the dbgen and qgen for specific database systems (PostgreSQL or Oracle)
        • SQLite – Database used to store results
        • bc – Calculator used by DBT3 to compute scaling factors and performance metrics
      2. Download the TPC-H tools file from TPC site

        Fill out the form at the TPC website (here) and a link to download the file will be emailed to the supplied email address.

        Once the zip file is downloaded, unzip it into your working folder.

      3. Install DBT3
        DBT3 can be downloaded from GitHub — https://github.com/osdldbt/dbt3
        The documentation can also be downloaded from GitHub — here
        1. Clone the project to your working folder:

          git clone https://github.com/osdldbt/dbt3.git

        2. Change into the dbt3 folder:

          cd dbt3

        3. Generate a build:

          make -f Makefile.cmake release

        4. Install utilities to /usr/local by changing into the release folder and using cmake:

          cd builds/release
          cmake --install . --prefix /usr/local

      4. Modify/Build the files for PostgreSQL.
        1. Change to the folder that contains the unzipped TPC-H toolkit that you downloaded from the TPC website, and execute the following:

          dbt3-build-dbgen –patch-dir=<dbt3 folder>/patches –query-dir=<dbt3 folder>/queries pgsql 'TPC-H V3.0.1'

        2. Create a new database called DBT3 using the PostgreSQL CREATE DATABASE command.
        3. Execute a test using DBT3 by executing:

          dbt3-run –tpchtools='TPC-H V3.0.1' pgsql ./results

          The results will be put into the ./results folder and the summary.rst folder will contain a query summary.

          CREATE INDEX
          CREATE INDEX
          ANALYZE
          Elapsed time for Load Test : 58 seconds
          Using seed: 108152318
          Wed Jan 8 15:23:19 AEST 2025: Power Test : Starting Refresh Stream 1
          Wed Jan 8 15:23:21 AEST 2025: Power Test : Refresh Stream 1 completed.
          Wed Jan 8 15:23:23 AEST 2025: Power Test : Starting Refresh Stream 2
          Wed Jan 8 15:23:23 AEST 2025: Power Test : Refresh Stream 2 completed.
          Wed Jan 8 15:23:23 AEST 2025: Power Test completed.
          Elapsed time for Power Test : 4 seconds
          Wed Jan 8 15:23:23 AEST 2625: Throughput Stream 1: Starting RF 1...
          Throughput Stream 1: Elapsed time for Refresh Stream 1: 4 seconds
          Wed Jan 8 15:23:27 AEST 2625: Throughput Stream 1: Starting RF 2...
          Throughput Steam 1: Elapsed time for RF 2: 6 seconds
          Wed Jan 8 15:23:27 AEST 2625: Throughput Stream 2: Starting RF 1...
          Throughput Stream 2: Elapsed time for Refresh Stream 1: 3 seconds
          Wed Jan 8 15:23:36 AEST 2625: Throughput Stream 2: Starting RF 2...
          Throughput Steam 2: Elapsed time for RF 2: © seconds
          Elapsed time for performance test: 12 seconds
          Getting system configuration...
          Getting query times...
          Post processing query results...

                 Composite Score: 43671.88
          Load Test Time (hours):      .01
               Power Test Score: 78677.86
           Throughput Test Score: 24241.05

          See query summary in: ./results/summary.rst
          postgres ubuntu-linux-22-04-02-desktop: $

          Power Test
          ----------

          * Seed: 108152318

          +--------------------+------------------------+------------------------+------------------------+
          | Duration (seconds) |    Query Start Time    |     RF1 Start Time     |     RF2 Start Time     |
          |                    +------------------------+------------------------+------------------------+
          |                    |     Query End Time     |      RF1 End Time      |      RF2 End Time      |
          +===============================================================================================+
          |               4.60 | 2025-01-08 15:23:21.92 | 2025-01-08 15:23:19.09 | 2025-01-08 15:23:23.63 |
          |                    +------------------------+------------------------+------------------------+
          |                    | 2025-01-08 15:23:23.62 | 2025-01-08 15:23:21.91 | 2025-01-08 15:23:23.66 |
          +--------------------+------------------------+------------------------+------------------------+

          ======= =========================
          Query   Response Time (seconds)
          ======= =========================
                1                      0.03
                2                      0.03
                3                      0.03
                4                      0.03
                5                      0.03
                6                      0.03
                7                      0.03
                8                      0.03
                9                      0.03
               10                      0.03
               11                      0.03
               12                      0.03
               13                      0.34
               14                      0.03
               15                      0.04
               16                      0.20
               17                      0.03
               18                      0.03
               19                      0.03
               20                      0.03
               21                      0.03
               22                      0.28
               RF1                     2.82
               RF2                     0.03

          Throughput Test
          ---------------

          Stream execution summary:

          +--------------------------------------------------------------------------------------------------+
          |   Stream  | Duration  |    Query  Start Time         RF1 Start Time     |     RF2 Start Time     |
          +-----------+ (seconds) +------------------------+------------------------+------------------------+
          |   Seed    |           |     Query End Time     |      RF1 End Time      |      RF2 End Time      |
          +===========+===========+========================+========================+========================+
          |         1 |      2.52 | 2025-01-08 15:23:24.39 | 2025-01-08 15:23:23.74 | 2025-01-08 15:23:27.30 |
          +-----------+           +------------------------+------------------------+------------------------+
          | 108152319 |           | 2025-01-08 15:23:26.83 | 2025-01-08 15:23:27.29 | 2025-01-08 15:23:27.33 |
          +-----------+-----------+------------------------+------------------------+------------------------+
          |         2 |      2.65 | 2025-01-08 15:23:24.50 | 2025-01-08 15:23:27.37 | 2025-01-08 15:23:30.17 |
          +-----------+           +------------------------+------------------------+------------------------+
          | 108152320 |           | 2025-01-08 15:23:27.09 | 2025-01-08 15:23:30.15 | 2025-01-08 15:23:30.20 |
          +--------------------------------------------------------------------------------------------------+

          Query execution duration (seconds):

          ======== ========  ========  ========  ========  ========  ========  ========
          Stream      Q1        Q2        Q3        Q4        Q5        Q6        Q7
          ======== ========  ========  ========  ========  ========  ========  ========
                 1     0.04      0.03      0.04      0.06      0.04      0.03      0.08
                 2     0.04      0.03      0.03      0.03      0.08      0.03      0.05
               Min     0.04      0.03      0.03      0.03      0.04      0.03      0.05
               Max     0.04      0.03      0.04      0.06      0.08      0.03      0.08
               Avg     0.04      0.03      0.03      0.04      0.06      0.03      0.07
          ======== ========  ========  ========  ========  ========  ========  ========

          ======== ========  ========  ========  ========  ========  ========  ========
          Stream      Q8        Q9       Q10       Q11       Q12       Q13       Q14
          ======== ========  ========  ========  ========  ========  ========  ========
                 1     0.06      0.08      0.05      0.06      0.06      0.33      0.06
                 2     0.08      0.04      0.11      0.03      0.04      0.38      0.04
               Min     0.06      0.04      0.05      0.03      0.04      0.33      0.04
               Max     0.08      0.08      0.11      0.06      0.06      0.38      0.06
               Avg     0.07      0.06      0.08      0.04      0.05      0.36      0.05
          ======== ========  ========  ========  ========  ========  ========  ========

          ======== ========  ========  ========  ========  ========  ========  ========
          Stream     Q15       Q16       Q17       018       Q19       Q20       Q21
          ======== ========  ========  ========  ========  ========  ========  ========
                 1     0.04      0.29      0.03      0.05      0.05      0.03      0.03
                 2     0.05      0.34      0.05      0.04      0.04      0.03      0.03
               Min     0.04      0.29      0.03      0.04      0.04      0.03      0.03
               Max     0.05      0.34      0.05      0.05      0.05      0.03      0.03
               Avg     0.05      0.32      0.04      0.05      0.05      0.03      0.03
          ======== ========  ========  ========  ========  ========  ========  ========

          ======== ========  ========  ========
            Stream   Q22       RF1       RF2
          ======== ========  ========  ========
                 1     0.15      3.55      0.03
                 2     0.15      2.78      0.03
               Min     0.15      2.78      0.03
               Max     0.15      3.55      0.03
               Avg     0.15      3.16      0.03
          ======== ========  ========  ========

       

      Topics: PostgreSQL, PostgreSQL performance, Database optimization, TPC-H benchmark, DBT3 toolkit, Database benchmarking

      Receive our blog

      Search by topic

      Posts by Tag

      See all
      Fujitsu Enterprise Postgres
      The hybrid multi-cloud Postgres backed by Fujitsu
      photo-gary-evans-in-hlight-circle-cyan-to-blue-02
      Gary Evans
      Senior Offerings and Center of Excellence Manager
      Gary Evans heads the Center of Excellence team at Fujitsu Software, providing expert services for customers in relation to PostgreSQL and Fujitsu Enterprise Postgres.
      He previously worked in IBM, Cable and Wireless based in London and the Inland Revenue Department of New Zealand, before joining Fujitsu. With over 15 years’ experience in database technology, Gary appreciates the value of data and how to make it accessible across your organization.
      Gary loves working with organizations to create great outcomes through tailored data services and software.

      Receive our blog

      Fill the form to receive notifications of future posts

      Search by topic

      see all >