Bring the strength of Elasticsearch to your next PostgreSQL project – it’s surprisingly easy.

Blog - How to enhance your PostgreSQL search power

For many organisations and developers, Elasticsearch has become a bit of a household name. The clever people at Elastic (www.elastic.co) have taken the famous Lucene search engine and turned it into a world-class set of tools for searching and analysing data. The tools include:

What if you were able to index your PostgreSQL data in Elasticsearch via your usual PostgreSQL database connection?

Well, it turns out you can!

The ZomboDB PostgreSQL extension is a mature and active project - two major requirements for using a PostgreSQL extension in production databases. The project was started in 2013 as a closed source, full-text search extension, before being made open source in 2015 by Eric Ridge, and taking on a new role as Index Access Method for Elasticsearch.

Elasticsearch Loves Data

The more data Elasticsearch indexes, the better it gets. However, to keep search indexes relevant, they must be constantly pruned and updated. Now it would be fair to say that much of the data found in an Elasticsearch index originated from SQL databases, and often, entire Elasticsearch indexes can be rebuilt purely from SQL data. As such, there is a parallel between updates to the permanent storage in the RDBMS, and the need to mirror this update in the search index.

So, while organisations are not typically keen on adding yet another data source to their systems, especially when that data source will be mostly a duplicate of existing data, the ZomboDB extension makes the indexing and maintenance of search data relatively straight forward. We all know Elasticsearch is not a database, and is not safe for storing critical data. That’s PostgreSQL’s job. However, it provides speedy searching across huge amounts of data. Organisations wanting to take advantage of this need to be able to get data into the search indexes, and ensure obsolete data is removed in a timely fashion. Removing the intensive full-text search tasks from PostgreSQL allows the DBMS to concentrate on the mission-critical tasks that organisations depend upon.

Additionally, the ZomboDB extension provides a very easy path to the maintenance of data in Elasticsearch instances, which is good for an organization’s data experts, who are often well versed in SQL technology. When a developer uses ZomboDB, maintaining search indexes, and even rebuilding whole indexes becomes a whole lot easier, because the index update can be part of the same work-flow that updates the permanent data in PostgreSQL.

So what kinds of things can a developer do using the ZomboDB PostgreSQL extension? The answer is, just about everything you can do directly using the Elasticsearch API. ZomboDB provides support for searching:

  • Full boolean operators (WITH, AND, OR, NOT)
  • Words
  • Phrases
  • Fielded searching
  • Fuzzy words and phrases
  • Value ranges
  • Wildcards (left, middle, and right truncation)
  • Term boosting
  • Proximity (of word or phrase or combinations)
  • Scripted searching
  • Elasticsearch "bool" queries
  • Direct access to any Elasticsearch query via JSON
  • Query expansion
  • "more like this"
  • and more.


Getting started with PostgreSQL and the ZomboDB extension is very easy

It's quite easy to get started with PostgreSQL and the ZomboDB extension because the API and SQL language extensions are clearly documented, and the tutorials are able to get a developer up and running in no time.

ZomboDB installs a number of custom domain types, “phrase”, “fulltext”, “fulltext_with_shingles” and “phase_array”, as well as domains for 30 different languages. These domains appear as normal text data types to PostgreSQL, but provide a special meaning to ZomboDB and the application developer.

Twenty-eight embedded functions give the developer access to a full range of Elasticsearch capability, including the ability to create, delete and update:

  • Indexes
  • Field Mappings
  • Aggregations
  • Terms
  • Analysers
  • Filters

The developer can also search elastic index tables, multiple tables, do tallies and scoring, and range aggregations, all via a PostgreSQL DBMS connection, using the familiar SQL syntax.

Behind the scenes, ZomboDB uses a HTTP web interface to communicate with the Elasticsearch server(s), which needs to have the ZomboDB elasticsearch extension installed. There is also a small change to the elasticsearch.yml to cater for ZomboDB’s use of the _bulk end-point.

Features

A full set of features mentioned by ZomboDB include:

Transaction-safe, MVCC-correct full text queries

Managed & queried via standard Postgres SQL

Works with tables of any structure

Automatically creates Elasticsearch Mappings supporting most datatypes, including arrays

Supports full set of Elasticsearch language analysers

Support for custom analyser chains

Custom per-field mappings

JSON / JSONB columns as nested objects for flexible schemaless sub-documents

Works with all Postgres query plans, including sequential scans

Use whatever method you currently use for talking to Postgres (JDBC, DBI, libpq, etc)

Extremely fast indexing

Per-row scoring with term/phrase boosting

 

Record count estimation

 

 

Custom full-text query language supporting nearly all of Elasticsearch's search features, such as:

Boolean operations

Proximity (in and out of order)

Phrases, wildcards, fuzzy terms/phrases

Regular expressions, inline scripts

Range queries

"more like this"

Any Elasticsearch query construct through direct JSON

Query expansion and index linking

Support for siren to resolve index links

Search multiple tables at once

High-performance hit highlighting

Common Elasticsearch's aggregations, including ability to nest

Access to all of Elasticsearch's aggregations via direct JSON

Extensive test suite

 

(source: https://github.com/zombodb/zombodb)

This extension opens new possibilities for application developers without introducing yet another external web interface to the application. Combining the enterprise-ready PostgreSQL with one of the top search engines is a sure way of delivering powerful insights into your data within your own applications.

If you are interested in learning more about Elasticsearch and how it may improve your PostgreSQL database, please contact us directly. Fujitsu provides 24/7 Australian-based PostgreSQL support and services, DBA and developer training, and our own enhanced version of PostgreSQL - FUJITSU Enterprise Postgres.

© Fujitsu 2017

Topics: Elasticsearch

Receive our blog

Receive notification of PostgreSQL-based articles for business and technical audiences.

SEARCH BY TOPIC

see all