
      In a previous blog post, we built a simple retrieval pipeline where we chunked the documentation, embedded the text, and queried PostgreSQL with vector similarity to find relevant passages. Then, in the most recent post, we shifted gears to the ingestion side, extracting facets such as version, operating system, document type, and component, and storing them alongside the embeddings in our database.

      Now it’s time to bring those two threads together.


      Enhance your RAG pipeline in PostgreSQL by incorporating facets and disciplined prompt design for more precise and relevant document retrieval.

      In this blog post, we return to the retrieval pipeline, but this time we’ll make it smarter by including facets in the search. By filtering first on structured metadata before running semantic similarity, we can focus retrieval on the exact subset of documentation that matters. This hybrid approach improves precision, reduces noise, and sets us up to hand cleaner, more relevant context to the LLM.

      We’ll also go deeper into prompt design, how to structure the context and user query so that the model consistently stays grounded in the retrieved information and avoids drifting into hallucination.

      Why facets matter in retrieval

      Embeddings are powerful, but without guardrails they can cast too wide a net. A query like “How do I configure replication in version 17 on Linux?” doesn’t just need semantic similarity; it also needs to exclude answers that apply to Windows or to older releases. That’s exactly what facets give us.

      Because we stored structured metadata alongside each chunk, we can now write hybrid queries in PostgreSQL that filter on facets before applying vector similarity. This narrows the candidate set down to only the chunks that match the context we know is relevant.

      Here’s a simplified query that combines both:

      SELECT doc_id, heading, content
      FROM fep_manual_chunks
      WHERE version = '17'
        AND os = 'linux'
        AND component = 'administration'
      ORDER BY embedding <#> ai.embed('text-embedding-3-small', 'configure admin tool')
      LIMIT 6;

      This query considers only version 17, Linux, administration docs, and then ranks the results semantically by closeness to the user query. The effect is that we hand the LLM a much cleaner context set, while also improving speed, because PostgreSQL has far fewer rows to scan.
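      To get that speed in practice, it helps to index the facet columns. Here is a minimal sketch, assuming the pgvector extension used earlier in this series (the index names are illustrative):

      -- B-tree index so the facet filters prune the candidate set cheaply
      CREATE INDEX idx_chunks_facets
      ON fep_manual_chunks (version, os, component);

      -- Optional: an approximate-nearest-neighbour index for the inner-product
      -- operator (<#>) used above; with a small filtered candidate set, an
      -- exact scan is often fast enough, so measure before adding this
      CREATE INDEX idx_chunks_embedding
      ON fep_manual_chunks USING hnsw (embedding vector_ip_ops);

      Because the facet filters are plain WHERE clauses, PostgreSQL can use the B-tree to shrink the row set before any distance computation happens.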

      Building context for the LLM

      Once we’ve narrowed down to a handful of relevant chunks, the next step is to assemble them into a prompt for the model. This part is deceptively simple but makes a big difference to the quality of the answers.

      The key is consistency. The model should always receive the context in a predictable format. One straightforward way is to number each chunk and provide both its heading and body text. That way, the model can ‘see’ the structure of the documentation even before it generates an answer.
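      As a minimal sketch of that numbering step, staying inside PostgreSQL and reusing the hybrid query from above (the CTE names are illustrative, and the ai.embed() call follows the earlier example):

      WITH top_chunks AS (
        -- Facet filters first, then vector ranking, exactly as before
        SELECT heading, content,
               embedding <#> ai.embed('text-embedding-3-small', 'configure admin tool') AS dist
        FROM fep_manual_chunks
        WHERE version = '17'
          AND os = 'linux'
          AND component = 'administration'
        ORDER BY dist
        LIMIT 6
      ),
      numbered AS (
        SELECT ROW_NUMBER() OVER (ORDER BY dist) AS n, heading, content
        FROM top_chunks
      )
      -- Produce "[n] heading body" blocks separated by blank lines
      SELECT string_agg(format('[%s] %s %s', n, heading, content), E'\n\n' ORDER BY n) AS context
      FROM numbered;

      The resulting context string can be dropped straight into the prompt template below.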

      For example, given a slice of the fep_manual_chunks table filtered to version 17, Linux, administration, an effective prompt would be:

      You are a technical assistant for Fujitsu Enterprise Postgres documentation.

      Use only the provided context to answer the question.

      If the context does not contain the answer, reply 'The documentation does not specify.'

      Context:

      [1] I.1 WebAdmin Configurations WebAdmin can be installed in two…

      [2] I.2 Installing WebAdmin in a Single-Server Configuration…

      [3] I.3 Installing WebAdmin in a Multi-server Configuration…

      Question:

      How do I configure a multi-server administration tool on version 17 on Linux?

      This isn’t flashy prompt engineering; it’s just disciplined formatting. The chunks are clear, scoped, and easy for the model to work with.

      Prompt discipline vs. prompt engineering

      There’s a bit of hype around ‘prompt engineering’, but in RAG systems, it’s less about clever phrasing and more about discipline. The retrieval side (facets plus semantic search) does the heavy lifting. The prompt’s job is simply to keep the model inside the rails.

      That said, you can experiment with variations:

      • Role framing: Framing the model as a documentation assistant vs. as a senior DBA helping a junior engineer can change tone and depth.
      • Answer format: Free text for conversational responses, bullet lists for quick steps, or JSON when the output needs to be machine-consumable (see the example after this list).
      • Fallback handling: Reinforcing that the model should explicitly say “The documentation does not specify” when the answer isn’t present.
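      For instance, a hypothetical JSON-output variant of the system instruction could look like this (the field names are assumptions for illustration, not part of the original prompt):

      You are a technical assistant for Fujitsu Enterprise Postgres documentation.

      Use only the provided context to answer the question.

      Respond with a JSON object containing two fields: "answer" (a string) and "sources" (the context numbers you relied on, e.g. [1, 3]).

      If the context does not contain the answer, set "answer" to 'The documentation does not specify.' and "sources" to [].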

      These tweaks don’t replace good retrieval, but they do make the system more usable in different contexts.

      With facets filtering the candidate set, embeddings ranking by semantic closeness, and prompts constraining the model, we now have a RAG pipeline that is far more reliable than the embedding-only version. 

      The result is not just better answers, but answers you can trust, grounded in the exact slice of documentation that’s relevant to the user’s environment.

      Coming up next

      In the next post, we’ll push beyond static retrieval and prompts into Agents and agentized RAG. Instead of us hard coding the retrieval flow, we’ll explore how an LLM itself can decide which queries to run, how to refine them, and how to combine multiple passes of retrieval. This is where the line between retrieval and reasoning starts to blur, and where the system becomes more autonomous in answering complex questions.



       
