

Enhance your RAG pipeline in PostgreSQL by incorporating facets and disciplined prompt design for more precise and relevant document retrieval.
In this blog post, we return to the retrieval pipeline, but this time we’ll make it smarter by including facets in the search. By filtering first on structured metadata before running semantic similarity, we can focus retrieval on the exact subset of documentation that matters. This hybrid approach improves precision, reduces noise, and sets us up to hand cleaner, more relevant context to the LLM.
We’ll also go deeper into prompt design: how to structure the context and the user query so that the model consistently stays grounded in the retrieved information and avoids drifting into hallucination.
Why facets matter in retrieval
Embeddings are powerful, but without guardrails they can cast too wide a net. A query like “How do I configure replication in version 17 on Linux?” doesn’t just need semantic similarity; it also needs to ignore answers that apply to Windows or to older releases. That’s exactly what facets give us.
Because we stored structured metadata alongside each chunk, we can now write hybrid queries in PostgreSQL that filter on facets before applying vector similarity. This narrows the candidate set down to only the chunks that match the context we know is relevant.
Here’s a simplified query that combines both:
SELECT *
FROM fep_manual_chunks
WHERE version = '17'
  AND os = 'linux'
  AND component = 'administration'
ORDER BY embedding <#> ai.embed('text-embedding-3-small', 'configure admin tool')
LIMIT 6;
This query considers only version 17 Linux administration docs, then ranks the results by semantic closeness to the user query. The effect is that we hand the LLM a much cleaner context set, while also improving speed, because PostgreSQL has far fewer rows to scan.
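If you are calling this from an application rather than psql, the same hybrid query can be parameterized on the facets and the user’s question. Below is a minimal sketch, assuming psycopg (v3), a placeholder connection string, and that the chunk text lives in heading and body columns; those names, like the fep_docs database, are assumptions for illustration, and the ai.embed call is simply reused from the query above.

# A minimal sketch of running the hybrid query from Python.
# Assumptions: psycopg (v3) is installed, the database is reachable as "fep_docs",
# and the chunk text lives in "heading" and "body" columns.
import psycopg

def retrieve_chunks(conn, version, os_name, component, question, k=6):
    sql = """
        SELECT heading, body
        FROM fep_manual_chunks
        WHERE version = %s
          AND os = %s
          AND component = %s
        ORDER BY embedding <#> ai.embed('text-embedding-3-small', %s)
        LIMIT %s
    """
    # Facets prune the candidate set first; the vector operator only ranks what survives.
    return conn.execute(sql, (version, os_name, component, question, k)).fetchall()

with psycopg.connect("dbname=fep_docs") as conn:
    rows = retrieve_chunks(conn, "17", "linux", "administration", "configure admin tool")

If the facet columns are selective, a plain multicolumn B-tree index on (version, os, component) keeps that WHERE clause cheap before the vector ranking runs.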
Building context for the LLM
Once we’ve narrowed down to a handful of relevant chunks, the next step is to assemble them into a prompt for the model. This part is deceptively simple but makes a big difference to the quality of the answers.
The key is consistency. The model should always receive the context in a predictable format. One straightforward way is to number each chunk and provide both its heading and body text. That way, the model can ‘see’ the structure of the documentation even before it generates an answer.
For example, suppose the slice of the fep_manual_chunks table filtered to version 17 Linux administration (version = '17', os = 'linux', component = 'administration') contains these chunks:
- I.1 WebAdmin Configurations
- I.2 Installing WebAdmin in a Single-Server Configuration
- I.3 Installing WebAdmin in a Multi-server Configuration
An effective prompt would be:
You are a technical assistant for Fujitsu Enterprise Postgres documentation.
Use only the provided context to answer the question.
If the context does not contain the answer, reply 'The documentation does not specify.'
Context:
[1] I.1 WebAdmin Configurations WebAdmin can be installed in two…
[2] I.2 Installing WebAdmin in a Single-Server Configuration…
[3] I.3 Installing WebAdmin in a Multi-server Configuration…
Question:
How do I configure a multi-server administration tool on version 17 on Linux?
This isn’t flashy prompt engineering; it’s just disciplined formatting. The chunks are clear, scoped, and easy for the model to work with.
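As a concrete sketch, here is one way the numbered context and the prompt above could be assembled in Python. It assumes the rows returned by the hybrid query are (heading, body) pairs, which is a simplification rather than the actual schema.

# A minimal sketch of assembling the numbered context block and the prompt shown above.
# Assumption: each retrieved row is a (heading, body) pair; adapt to your real columns.
def build_prompt(rows, question):
    context = "\n".join(
        f"[{i}] {heading} {body}"
        for i, (heading, body) in enumerate(rows, start=1)
    )
    return (
        "You are a technical assistant for Fujitsu Enterprise Postgres documentation.\n"
        "Use only the provided context to answer the question.\n"
        "If the context does not contain the answer, reply 'The documentation does not specify.'\n"
        "\n"
        f"Context:\n{context}\n"
        "\n"
        f"Question:\n{question}"
    )

prompt = build_prompt(rows, "How do I configure a multi-server administration tool on version 17 on Linux?")

The point of doing this in code rather than by hand is determinism: the same retrieval results always produce the same prompt layout, which makes answer quality far easier to compare while you tune the pipeline.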
Prompt discipline vs. prompt engineering
There’s a bit of hype around ‘prompt engineering’, but in RAG systems, it’s less about clever phrasing and more about discipline. The retrieval side (facets plus semantic search) does the heavy lifting. The prompt’s job is simply to keep the model inside the rails.
That said, you can experiment with variations:
- Role framing: Framing the model as a documentation assistant vs. as a senior DBA helping a junior engineer can change tone and depth.
- Answer format: Free text for conversational responses, bullet lists for quick steps, or JSON when the output needs to be machine-consumable (see the sketch below).
- Fallback handling: Reinforcing that the model should explicitly say “The documentation does not specify” when the answer isn’t present.
These tweaks don’t replace good retrieval, but they do make the system more usable in different contexts.
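As one example of the answer-format variation, the instruction block can be swapped for a machine-readable variant while the retrieval and context stay exactly the same. The JSON shape below is purely illustrative, not something the documentation or the pipeline defines.

# An illustrative instruction block for machine-consumable answers.
# The JSON shape ("answer" plus "sources") is an assumption for this sketch.
JSON_INSTRUCTIONS = (
    "You are a technical assistant for Fujitsu Enterprise Postgres documentation.\n"
    "Use only the provided context to answer the question.\n"
    'Respond only with JSON of the form {"answer": "...", "sources": [1, 2]}, '
    "where sources lists the numbers of the context chunks you relied on.\n"
    "If the context does not contain the answer, respond with "
    '{"answer": "The documentation does not specify.", "sources": []}'
)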
With facets filtering the candidate set, embeddings ranking by semantic closeness, and prompts constraining the model, we now have a RAG pipeline that is far more reliable than the embedding-only version.
The result is not just better answers, but answers you can trust, grounded in the exact slice of documentation that’s relevant to the user’s environment.
Coming up next
In the next post, we’ll push beyond static retrieval and prompts into Agents and agentized RAG. Instead of hard-coding the retrieval flow ourselves, we’ll explore how an LLM itself can decide which queries to run, how to refine them, and how to combine multiple passes of retrieval. This is where the line between retrieval and reasoning starts to blur, and where the system becomes more autonomous in answering complex questions.
Other blog posts in this series
- Why Enterprise AI starts with the database - Not the data swamp
- Understanding similarity vs distance in PostgreSQL vector search
- What is usually embedded in vector search: Sentences, words, or tokens?
- How to store and query embeddings in PostgreSQL without losing your mind
- Embedding content in PostgreSQL using Python: Combining markup chunking and token-aware chunking
- From embeddings to answers: How to use vector embeddings in a RAG pipeline with PostgreSQL and an LLM
- Improving RAG in PostgreSQL: From basic retrieval to smarter context
