
SQL tuning involves analyzing execution plans and tweaking time-consuming SQL statements to optimize system performance. In this article, we walk through a sample tuning session to show you the overall flow of SQL tuning.

      How tables are scanned and joined

In SQL tuning, the SQL execution plan needs to be analyzed. Before we jump into tuning itself, we will briefly go over how tables are scanned and joined, since the analysis of the execution plan is based on this logic.

      Scanning method

      How are tables accessed? The following are the major scanning methods.

• Sequential Scan
  Accesses the table sequentially from start to finish. This is an effective method when a large number of rows need to be retrieved.
• Index Scan
  Accesses the index and the table alternately. This is an effective method when the WHERE clause narrows the result to a small number of rows, or when you want to access specific data.
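As a quick illustration, EXPLAIN shows which scan method the planner chooses. The table t and its indexed column id below are hypothetical; the plan actually chosen depends on table size and statistics.

mydb=# EXPLAIN SELECT * FROM t;                -- retrieves all rows, so a Sequential Scan is typically chosen
mydb=# EXPLAIN SELECT * FROM t WHERE id = 100; -- narrows to a few rows, so an Index Scan is typically chosen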

      Join method

How are two tables joined? The scan results of the two tables are combined into one result. The following are the major join methods. To distinguish the two tables being joined, they are called the outer table and the inner table.

• Nested Loop Join
  This method loops over all inner table rows for each outer table row. Processing is fast when the outer table has a small number of rows and the inner table has an index on the join key.
• Merge Join
  This method sorts the two tables by the join key and then joins them in sequence. It is not efficient if the sort takes a long time; defining a primary key or an index on the join key speeds up processing, because the sorted order can then be obtained without an explicit sort. This is an effective method when joining large tables.
• Hash Join
  This method creates a hash table from the join key of the inner table and joins by matching the rows of the outer table against it. The hash table is created in memory, so once created, the join is processed at high speed. However, if the hash table is larger than the available memory, file access occurs, resulting in slow processing. This method is suitable for joining a relatively small table with a large table.
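The join method the planner selected appears as a node ("Nested Loop", "Merge Join", or "Hash Join") in the EXPLAIN output. For experimentation, PostgreSQL also provides planner parameters that discourage a particular method; the tables a and b below are hypothetical.

mydb=# EXPLAIN SELECT * FROM a JOIN b ON a.key = b.key;  -- the join node shows the selected method
mydb=# SET enable_mergejoin = off;                       -- for testing only: steers the planner away from merge join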

      Detecting slow SQL

      Slow SQLs can be detected via statistics views and server logs.

      Using statistics view

Execute the following SQL against the pg_stat_statements view to output the three longest-running SQL statements, slowest first (for PostgreSQL versions older than 13, replace total_exec_time with total_time).

SELECT query, calls, CAST(total_exec_time AS numeric(10,3)), CAST(total_exec_time/calls AS numeric(10,3)) AS avg_time
FROM pg_stat_statements ORDER BY avg_time DESC, calls LIMIT 3;
      The output will be something like the following:
                                 query                                  | calls | total_exec_time | avg_time
------------------------------------------------------------------------+-------+-----------------+----------
 SELECT sales.name,sum(sales.num),(sum(sales.num)) * price.price        |     5 |       31000.071 | 6200.014
 AS total_price FROM sales,price WHERE sales.name = price.name          |       |                 |
 GROUP BY sales.name,price.price ORDER BY sales.name ASC                |       |                 |
 SELECT sales.name,sum(sales.num),sum(sales.num) * price.price          |     2 |        5886.436 | 2943.218
 AS total_price FROM sales,price WHERE sales.name = price.name          |       |                 |
 AND sales.datetime BETWEEN $1 AND $2 GROUP BY sales.name,price.price   |       |                 |
 SELECT sales.name,sum(sales.num) AS sum FROM sales GROUP BY sales.name |     6 |        6793.545 | 1132.257
(3 rows)

From the pg_stat_statements view, you can see that the SQL that searches the sales and price tables takes a long time.


      In addition to query statistics (pg_stat_statements), planner statistics (pg_statistic) and function statistics (pg_stat_user_functions) can be obtained periodically to use for analysis when performance deterioration occurs.
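As a sketch of such periodic collection, you could snapshot the view at a fixed interval and then reset the counters, so that each sample covers one interval (pg_stat_statements_reset() is provided by the pg_stat_statements extension):

mydb=# SELECT now(), query, calls, total_exec_time FROM pg_stat_statements;  -- take a snapshot
mydb=# SELECT pg_stat_statements_reset();                                    -- start a fresh measurement interval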

      Checking the server log

In the example statistics output above, we identified a SQL statement that takes 6 seconds on average. Next, we will check for SQL statements that take more than 3 seconds using the server log (by setting the log_min_duration_statement parameter in postgresql.conf to 3 seconds).
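For example, the threshold can be set in postgresql.conf as follows (the value is in milliseconds):

log_min_duration_statement = 3000   # log every statement that runs for 3 seconds or longer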

00000: 2019-03-05 11:18:08 JST [11408]: [3-1] user = fsep,db = mydb,remote = ::1(50650) app = psql
      LOG: duration: 6518.096 ms statement: SELECT sales.name, sum(sales.num),(sum(sales.num)) * price. price
      AS total_price FROM sales,price WHERE sales.name = price.name GROUP BY sales.name,price.price
      ORDER BY sales.name ASC;

00000: 2019-03-05 11:21:30 JST [7184]: [7-1] user = ,db = ,remote = app = LOG: checkpoint starting: time
00000: 2019-03-05 11:21:32 JST [7184]: [8-1] user = ,db = ,remote = app = LOG: checkpoint complete: wrote
14 buffers (0.0%); 0 WAL file(s) added, 0 removed, 6 recycled; write=1.556 s, sync=0.014 s, total=1.693 s;
sync files=13, longest=0.001 s, average=0.001 s; distance=199525 kB, estimate=199525 kB

      You can see that the SQL that looks up the sales and price tables takes a long time to execute, as also confirmed in the statistics view.


In the server log, you can check the duration of each SQL statement, as well as the calling user (user), database (db), application (app), and so on. However, parameters including log_min_duration_statement and log_line_prefix must be set in advance in postgresql.conf so that this information is output to the log.


Let's find out why the search on the sales and price tables takes so long.

      Viewing the execution plan

First, execute EXPLAIN to check the execution plan of the target SQL and investigate what is causing the problem. Then, execute it with EXPLAIN ANALYZE to check the actual measured values. Remember, EXPLAIN ANALYZE actually executes the SQL statement. When displaying execution plans for commands such as INSERT and DELETE, use BEGIN and ROLLBACK statements to avoid affecting the data.
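For a data-modifying statement, that pattern looks like the following (the DELETE condition here is hypothetical):

mydb=# BEGIN;
mydb=# EXPLAIN ANALYZE DELETE FROM sales WHERE name = 'apple';  -- actually executes the DELETE
mydb=# ROLLBACK;                                                -- undoes the change made by EXPLAIN ANALYZE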

mydb=# EXPLAIN ANALYZE SELECT sales.name,sum(sales.num),(sum(sales.num)) * price.price AS total FROM sales,price
WHERE sales.name = price.name GROUP BY sales.name,price.price ORDER BY sales.name ASC;
                                                QUERY PLAN
Sort (cost=853626.74..853626.89 rows=60 width=26) (actual time=16518.220..16518.223 rows=10 loops=1)
  Sort Key: sales.name
  Sort Method: quicksort  Memory: 25kB
  -> HashAggregate (cost=853624.21..853624.96 rows=60 width=26) (actual time=16518.195..16518.200 rows=10 loops=1)
       Group Key: sales.name, price.price
       -> Merge Join [1] (cost=728623.09..803624.30 rows=4999991 [4] width=14)
            (actual time=6149.953..13546.656 [2] rows=5000000 [4] loops=1)
            Merge Cond: ((sales.name)::text = (price.name)::text)
            -> Sort (cost=728553.76..741053.74 rows=4999991 width=10)
                 (actual time=6148.266..8810.290 [2] rows=5000000 loops=1)
                 Sort Key: sales.name
                 Sort Method: external merge  Disk: 108648kB [2]
                 -> Seq Scan on sales [3] (cost=0.00..86764.91 rows=4999991 [4] width=10)
                    (actual time=0.028..2540.105 rows=5000000 [4] loops=1)
            -> Sort (cost=65.83..68.33 rows=1000 width=13) (actual time=1.161..1.529 rows=963 loops=1)
                 Sort Key: price.name
                 Sort Method: quicksort  Memory: 74kB [5]
                 -> Seq Scan on price [3] (cost=0.00..16.00 rows=1000 [4] width=13)
                    (actual time=0.020..0.445 rows=1000 [4] loops=1)
Planning time: 0.327 ms
Execution time: 16529.869 ms
(17 rows)

      From the execution plan, we can tell that the final execution time is 16529.869ms, and the following can also be observed.

[1] "Merge Join" is selected as the method for joining the sales and price tables.
[2] The sort method for the sales table is "external sort" (external merge: writes the sort result to an external file), which takes considerable time. As a result, the entire merge join takes significant time.
[3] "Seq Scan" is selected as the scan method for the sales and price tables.
[4] Statistics are up to date, because the estimated and actual row counts are almost the same.
[5] For the price table, "quicksort" (in-memory sort) is selected, so it is processed at high speed in memory (74kB used).

To identify a tuning point, refer to the actual time to see which processing takes a long time. As you can see, there are no particular problems with [4] and [5] above, but the join in [1], [2], and [3] is taking a long time. This means we need to review whether the merge join method is appropriate and whether the process can be improved.


In [2], an external sort is used because the memory needed for the sort exceeds the value of the work_mem parameter in postgresql.conf. In this case, one solution is to set work_mem to a value larger than the disk usage (108648kB) using one of the methods below, changing the sort method to quicksort and speeding up the process.

• Modify postgresql.conf on the server
• Change the value using PGOPTIONS on the client
• Change the value using SET on the client

If you modify postgresql.conf on the server, the change takes effect for the entire database cluster, which can significantly increase memory usage. For this reason, make changes on the client side as much as possible to localize the impact.
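For example, the two client-side methods could look like the following; 128MB is just an illustrative value larger than the 108648kB the sort wrote to disk.

$ PGOPTIONS="-c work_mem=128MB" psql mydb    # applies to this psql session at connection time

mydb=# SET work_mem = '128MB';   -- applies to the current session only
mydb=# RESET work_mem;           -- reverts to the configured default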

      Checking numbers of data records

The following shows the number of data records in the sales and price tables.

mydb=# SELECT COUNT(*) FROM sales;
  count
---------
 5000000
(1 row)

mydb=# SELECT COUNT(*) FROM price;
 count
-------
  1000
(1 row)

The sales table has 5 million records, and the price table has 1000 records. In this case, the tables being joined are relatively large for the database environment, so it can be assumed that merge join is a reasonable join method.

      Checking table definitions

Next, output the table definitions of the sales and price tables to confirm the definition of "name", the join key specified in the WHERE condition.

mydb=# \d sales
                                Table "public.sales"
  Column   |            Type             | Collation | Nullable |              Default
-----------+-----------------------------+-----------+----------+-----------------------------------
 id        | integer                     |           | not null | nextval('sales_id_seq'::regclass)
 name      | character varying(255)      |           | not null |
 datetime  | timestamp without time zone |           | not null |
 num       | integer                     |           | not null |

mydb=# \d price
                                Table "public.price"
  Column   |            Type             | Collation | Nullable |              Default
-----------+-----------------------------+-----------+----------+-----------------------------------
 id        | integer                     |           | not null | nextval('price_id_seq'::regclass)
 name      | character varying(255)      |           | not null |
 price     | integer                     |           | not null |
Indexes:
    "price_pkey" PRIMARY KEY, btree (name)

We can see that in the price table, the primary key column "name" has an index. However, even though the sales table is large, no index is defined on its join key. We can presume that the sort takes a long time because no index can be used for the join key in the merge join process.


Based on the investigation results described above, add an index and check whether the performance improves.

      Adding an index

      Add the index "sales_name_idx" to the column "name" of the sales table.

      mydb=# CREATE INDEX sales_name_idx ON sales(name);
      mydb=# ANALYZE;

mydb=# \d sales
                                Table "public.sales"
  Column   |            Type             | Collation | Nullable |              Default
-----------+-----------------------------+-----------+----------+-----------------------------------
 id        | integer                     |           | not null | nextval('sales_id_seq'::regclass)
 name      | character varying(255)      |           | not null |
 datetime  | timestamp without time zone |           | not null |
 num       | integer                     |           | not null |
Indexes:
    "sales_name_idx" btree (name)

Checking the execution plan

      Output the execution plan again.

mydb=# EXPLAIN ANALYZE SELECT sales.name,sum(sales.num),(sum(sales.num)) * price.price AS total FROM sales,price
WHERE sales.name = price.name GROUP BY sales.name,price.price ORDER BY sales.name ASC;
                                                QUERY PLAN
 Sort (cost=390122.04..390122.19 rows=60 width=26) (actual time=9975.664..9975.668 rows=10 loops=1)
   Sort Key: sales.name
   Sort Method: quicksort  Memory: 25kB
   -> HashAggregate (cost=390119.52..390120.27 rows=60 width=26)
      (actual time=9975.640..9975.646 rows=10 loops=1)
       Group Key: sales.name, price.price
       -> Merge Join (cost=57.67..340119.59 rows=4999993 width=14)
          (actual time=1.504..7972.011 rows=5000000 loops=1)
            Merge Cond: ((sales.name)::text = (price.name)::text)
            -> Index Scan using sales_name_idx on sales (cost=0.43..277540.46 rows=4999993 width=10)
               (actual time=0.066..3083.440 rows=5000000 loops=1)
            -> Index Scan using price_pkey on price (cost=0.28..79.15 rows=1000 width=13)
               (actual time=0.009..0.933 rows=963 loops=1)
 Planning time: 0.929 ms
 Execution time: 9975.771 ms
(11 rows)

The sort executed as a pre-process of the merge join and its associated "Seq Scan" have changed to an "Index Scan" using the index "sales_name_idx". As a result, the time required for the merge join is reduced (without index: 13546.656ms, with index: 7972.011ms), which shortened the overall execution time (without index: 16529.869ms, with index: 9975.771ms).
The execution time reported by EXPLAIN ANALYZE may differ from the SQL execution time (duration) confirmed in "Checking the server log" (it was 6518.096ms). We can check the SQL execution time (duration) in the server log after tuning. Set the log_min_duration_statement parameter in postgresql.conf to 0 (output all statements).

      00000: 2019-03-05 15:29:44 JST [11408]: [43-1] user = fsep, db = mydb, remote = ::1(50650) app = psql
      LOG: duration: 1479.164 ms statement: SELECT sales.name,sum(sales.num),(sum(sales.num)) * price.price
      AS total FROM sales,price WHERE sales.name = price.name GROUP BY sales.name,price.price
      ORDER BY sales.name ASC;

The duration shown here is 1479.164ms, so the server log also confirms that the SQL execution time is now shorter.

In this article, we explained SQL tuning through an actual example. If performance problems occur during SQL execution, analyze the execution plans and tune the statements so that appropriate processing is performed.

      Related PostgreSQL Insider articles

      Tuning - Overview of SQL tuning
      This article provides an overview of SQL tuning in PostgreSQL, explaining what is SQL tuning, how SQL is executed, how to detect slow SQL statements, and methods to investigate the cause.
      What are the partitioning types available in PostgreSQL, and how should I choose one?
      This article explains partitioning, the method for storing data in divided tables, based on declarative partitioning.
      Improving database performance using partitioning — step-by-step
Explaining the mechanisms of partition pruning, partition-wise join, and partition-wise aggregation, which can enhance partitioning performance.
