Fujitsu Logo

By using a BRIN index instead of a BTREE index for certain queries, I can get huge index size savings with very similar performance.

efficienct coding.jpeg

A BRIN index is a Block Range Index. It’s a simple idea that can be applied quickly and easily for some nice and easy speed gains for certain types of queries.

BRIN indexes have been available since PostgreSQL 9.5.

When you set up a BRIN index, PostgreSQL reads your selected column's maximum and minimum values for each 8k page of stored data. PostgreSQL then stores just 3 pieces of information into the BRIN index, the page number, the minimum value and the maximum value for your chosen column.

Once setup, the BRIN values for each page are valid until any row of data stored on that page is updated. At that point the BRIN index for that 8k page is considered invalid and the next time you check for a value with say a SQL filter condition, every row in that page will need to be sequentially scanned to check for a match. If the page has not been updated since creation or it has been reindexed since any row updates, then Postgres can very quickly and easily check to see if a value you are trying to test against is possibly on that page and if it isn’t then the page can immediately be dropped from the result set.

This means that BRIN indexes are really useful for some queries and not as useful for others. A good example of where you could use a BRIN index is for data you would consider immutable, e.g. regularly collected timestamped sensor data is a good example, timestamps will only ever get larger, and once stored, there isn’t really a good reason to be updating any of the row’s values, after all this example ought to be an immutable log of historical sensor values.

This is a very straightforward way to optimise for speed even though the BRIN index only takes up very little space compared to a BTREE index.

BRIN indexes are set up when I create the index. They are updated when I call up the brin_summarize_new_values(regclass) function or when a vacuum happens on a page.

It is beneficial to cluster the column that I am using with the BRIN index. It wouldn’t be much use if each page on disk had one of the smallest data values and one of the largest data values in it as every page would then need to be checked and no pages could be skipped to get more performance. It’s quite likely that the data will already be ordered by the timestamp for my example but it’s worth checking.

For the example that follows; the BRIN index is 72KB in size and the BTREE index is a 676MB index size. Yet the performance is pretty much identical for selects.

Let’s say I have a set of data that has a timestamp and I get lots of data stored on a regular basis. This might be a new row of data from a sensor that logs the temperature of a refrigerator once every second. That would mean a table with an ID value, a sensor ID, a timestamp, and a temperature value with precision to the nearest degree that can then easily be stored as an integer.

I’ll end up with 86,400 rows of data per day. This might not sound like much but it adds up to over 31 million rows of data in a year.

Then let’s also assume that I now have a year’s worth of data stored and I want to search this table for a particular day’s average temperature. I could do a simple sequential scan of all rows in the table looking for every timestamp between a maximum and minimum, however this is not very efficient if I am looking for a result from just 1/365th of the data set.

I could speed this up by adding a simple BTREE index (the default type) to a column on this table, however it will be reasonably large for what I am doing.

This is where the BRIN index is a great idea.

And now for the hands-on part

Test setup:


CREATE TABLE temperature_log (log_id serial, sensor_id int, log_timestamp timestamp without time zone, temperature int);

INSERT INTO temperature_log(sensor_id,log_timestamp,temperature) VALUES (1,generate_series('2016-01-01'::timestamp,'2016-12-31'::timestamp,'1 second'),round(random()*100)::int);
This will create 31536001 rows of sensor test data.


I'll now retrieve the average temperature value for the 4th April 2016 and here are the results:

EXPLAIN ANALYZE SELECT AVG(temperature) FROM temperature_log WHERE log_timestamp>='2016-04-04' AND log_timestamp<'2016-04-05';

QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=674124.07..674124.08 rows=1 width=4) (actual time=3180.288..31 80.289 rows=1 loops=1)

-> Seq Scan on temperature_log (cost=0.00..673907.00 rows=86826 width=4) (a ctual time=629.297..3138.002 rows=86400 loops=1)

Filter: ((log_timestamp >= '2016-04-04 00:00:00'::timestamp without tim e zone) AND (log_timestamp < '2016-04-05 00:00:00'::timestamp without time zone) )

Rows Removed by Filter: 31449601

Planning time: 0.047 ms

Execution time: 3180.448 ms

(6 rows)
I will get the result in 3180 milliseconds with a straightforward sequential scan and filter of the whole table.


When I add a BTREE index and execute the same query as I did earlier for the same day’s average temperature, I will see:

CREATE INDEX idx_temperature_log_log_timestamp ON temperature_log USING btree (log_timestamp);
vacuum analyze;

EXPLAIN ANALYZE SELECT AVG(temperature) FROM temperature_log WHERE log_timestamp>='2016-04-04' AND log_timestamp<'2016-04-05';

QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=3467.15..3467.16 rows=1 width=4) (actual time=103.333..103.333 rows=1 loops=1)

-> Index Scan using idx_temperature_log_log_timestamp on temperature_log (c ost=0.56..3250.08 rows=86826 width=4) (actual time=0.019..55.902 rows=86400 loop s=1)

Index Cond: ((log_timestamp >= '2016-04-04 00:00:00'::timestamp without time zone) AND (log_timestamp < '2016-04-05 00:00:00'::timestamp without time z one))

Planning time: 0.059 ms

Execution time: 103.476 ms

(5 rows)
So this is much quicker, returning the result in 103 milliseconds.


With a BRIN index and the same query as earlier for the same day’s average temperature, I see:

DROP INDEX idx_temperature_log_log_timestamp;

CREATE INDEX idx_temperature_log_log_timestamp ON temperature_log USING BRIN (log_timestamp) WITH (pages_per_range = 128);

vacuum analyse;

EXPLAIN ANALYZE SELECT AVG(temperature) FROM temperature_log WHERE log_timestamp>='2016-04-04' AND log_timestamp<'2016-04-05';
My execution plan looks like:
QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------

Aggregate (cost=161148.90..161148.91 rows=1 width=4) (actual time=96.843..96.8 44 rows=1 loops=1) -> Bitmap Heap Scan on temperature_log (cost=935.12..160929.60 rows=87719 w idth=4) (actual time=1.123..53.489 rows=86400 loops=1)

Recheck Cond: ((log_timestamp >= '2016-04-04 00:00:00'::timestamp witho ut time zone) AND (log_timestamp < '2016-04-05 00:00:00'::timestamp without time zone))

Rows Removed by Index Recheck: 14080

Heap Blocks: lossy=640

-> Bitmap Index Scan on idx_temperature_log_log_timestamp (cost=0.00. .913.19 rows=87719 width=0) (actual time=0.638..0.638 rows=6400 loops=1)

Index Cond: ((log_timestamp >= '2016-04-04 00:00:00'::timestamp w ithout time zone) AND (log_timestamp < '2016-04-05 00:00:00'::timestamp without time zone))

Planning time: 0.064 ms

Execution time: 97.023 ms

(9 rows)
For the BRIN index I get a result in 97 milliseconds. This is slightly quicker than using a BTREE index for this particular test but let’s take a look at the index sizes.


This query will return the index size of our new indexes:

nspname AS schema_name,
relname AS index_name,
round(100 * pg_relation_size(indexrelid) / pg_relation_size(indrelid)) / 100 AS index_ratio,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
pg_size_pretty(pg_relation_size(indrelid)) AS table_size

pg_index I

pg_class C

(C.oid = I.indexrelid)

pg_namespace N

(N.oid = C.relnamespace)

C.relkind = 'i' AND
pg_relation_size(indrelid) > 0 AND

pg_relation_size(indexrelid) DESC, index_ratio DESC;
The BRIN index is 72KB in size.
The BTREE index is 676MB index size.

That's a big difference.

BTREE indexes are much more capable in many ways but a BRIN index is perfect for this use case. So in this example, if I want to save myself around 676MB of disk and memory space but still end up with roughly the same performance, I can use a BRIN index for this table instead of a BTREE.

I will also benefit from the fact that for each insert I am are doing once a second, PostgreSQL will not need to add index values to a BTREE index. This will make my inserts faster too.

Topics: Fujitsu Enterprise Postgres, Enhanced Enterprise Open Source Database, PostgreSQL Development