
      As vector search becomes a foundational feature in modern applications—from semantic search and recommendation engines to AI-driven insights—developers are increasingly adopting PostgreSQL with the pgvector extension. However, one concept often creates confusion: the difference between similarity and distance.

      While the two are mathematically related, treating them as interchangeable can lead to inaccurate results and suboptimal queries. In this post, we’ll demystify this crucial distinction using practical SQL examples, so you can design more effective, reliable vector-powered features with PostgreSQL.

      Learn the key differences between similarity and distance in PostgreSQL vector search and how to apply them effectively in your queries.

      Similarity isn’t distance — Understanding the difference

      As more developers integrate vector search into their applications (for semantic search, recommendations, and AI-driven features), one foundational concept often causes confusion:

      Similarity and distance are not the same thing, but they’re directly related.

      If you're using PostgreSQL with the pgvector extension, understanding this distinction is key to writing effective queries and interpreting your results. In this article, we’ll break it down using clear visuals, a real-world SQL example, and a practical mental model for how cosine distance works in vector space.

      Similarity vs distance: Two sides of the same coin

      When we say two things are similar, we intuitively mean they're close in meaning. But in vector search, similarity isn't a vague judgment; it’s a numeric score computed from vector math.

      The trick is that PostgreSQL’s vector search (through pgvector) doesn't return a similarity score directly. Instead, it returns a distance, which runs in the opposite direction:

      • Similarity: Higher value = more alike
      • Distance: Lower value = more alike

      This distinction matters, because when we sort rows in a query using the <=> operator in pgvector, we are sorting by cosine distance — a specific mathematical way to measure how far apart two vectors are based on their angle.
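
      To make the direction of each score concrete, here is a minimal sketch that assumes the pgvector extension is already installed (CREATE EXTENSION vector). The two-dimensional vectors and their labels are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```sql
-- Toy vectors compared against the query vector [1,0].
-- Labels and values are illustrative, not real embeddings.
WITH candidates(label, v) AS (
    VALUES
        ('same direction', '[2,0]'::vector),
        ('45 degrees off', '[1,1]'::vector),
        ('perpendicular',  '[0,1]'::vector),
        ('opposite',       '[-1,0]'::vector)
)
SELECT label,
       v <=> '[1,0]'::vector       AS cosine_distance,   -- lower  = more alike
       1 - (v <=> '[1,0]'::vector) AS cosine_similarity  -- higher = more alike
FROM candidates
ORDER BY v <=> '[1,0]'::vector;  -- ascending distance: best match first
```

      The rows should come back in the order listed, with distances of roughly 0, 0.29, 1, and 2. Cosine distance therefore ranges from 0 to 2, and the length of a vector does not affect it: [2,0] and [1,0] point the same way, so their distance is 0.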

      A visual example: Cosine distance and vector angles

      Let’s say we embed the following sailing-related phrases into vector space:

      • “solid built cruiser”
      • “blue water cruiser”
      • “truly blue water”

      These phrases all mean something similar, but they differ in nuance. When embedded using a language model like OpenAI's text-embedding-3-small, their meanings are transformed into high-dimensional vectors. We can then visualize those vectors as arrows pointing in space.

      In the diagram, each arrow represents a phrase. The smaller the angle between the vectors and the vector for our search phrase (“blue water capable”), the more similar they are in meaning. This angle is where cosine similarity comes in.

      Phrase               Angle  Cosine similarity  Cosine distance
      solid built cruiser  19°    0.945              0.055
      truly blue water     24°    0.913              0.087
      blue water cruiser   32°    0.848              0.152

      Cosine similarity is the cosine of the angle between two vectors. Cosine distance, which is what pgvector's <=> operator returns, is simply:

      cosine_distance = 1 - cosine_similarity

      Smaller angle → higher similarity → lower distance

      Larger angle → lower similarity → higher distance
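
      The values in the table can be approximated with plain PostgreSQL math functions, no extension needed. This sketch treats the whole-degree angles as exact, so the third decimal may differ slightly from the figures above, which were presumably computed from unrounded angles.

```sql
-- Convert an angle in degrees to cosine similarity and cosine distance.
SELECT phrase,
       angle_deg,
       round(cos(radians(angle_deg))::numeric, 3)       AS cosine_similarity,
       round((1 - cos(radians(angle_deg)))::numeric, 3) AS cosine_distance
FROM (VALUES
        ('solid built cruiser', 19),
        ('truly blue water',    24),
        ('blue water cruiser',  32)
     ) AS t(phrase, angle_deg);
```

      The casts to numeric are there because PostgreSQL's two-argument round() is defined for numeric, not for double precision.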

      Real-world vector search in PostgreSQL

      Now let’s look at how this plays out in a real query. Assume you have a table called boat (in a schema named staging) that stores boat details along with vector embeddings of their descriptions.

      name                           description                             embedding
      2001 unbekannt Hercules 38 DS  2014; 1. Refit: Hull and deck paint...  [-0.052859765, 0.040222537, -0.043923237, 0.031689756, -0.084...
      1984 Feltz Skorpion 11 MS      The Feltz Skorpion is a tough, blu...   [-0.0015577168, 0.034499902, -0.0038531697, 0.04258901, 0.012...
      1996 Island Packet 37          The Island Packet 37 offers the sta...  [-0.026485495, 0.011889433, -0.015693817, 0.060963105, 0.015...
      2023 Jeanneau Sun Odyssey 380  The Sun Odyssey 380 won Best Monoh...   [-0.02277462, 0.03360344, -0.024761667, -0.000951637, 0.02930...
      2016 Lagoon 380                Make: Lagoon Model: Lagoon 380 Yea...   [-0.036498778, 0.021291345, -0.013093531, -0.000648224, 0.00...
      2022 Prout 38                  privately used catamaran from 2001...   [-0.021242777, 0.045187734, 0.009782045, 0.025636568, 0.003...
      1982 Chassiron GT 38           A fine example of this much sort a...   [-0.065575816, 0.045078347, -0.005168701, 0.0012237568, -0.01...
      1984 Hallberg-Rassy 38         A cherished yacht, which is offer...    [0.030524747, 0.0372221, 0.018082908, 0.01155064, -0.01354...
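
      For reference, a table like this could be declared roughly as follows. This is a hypothetical schema, not necessarily the one behind the listing above: the column types, the index name, and the optional index itself are assumptions. The 1536 dimensions match the default output size of text-embedding-3-small.

```sql
CREATE EXTENSION IF NOT EXISTS vector;   -- pgvector
CREATE SCHEMA IF NOT EXISTS staging;

-- Assumed layout; adjust types and constraints to your data.
CREATE TABLE IF NOT EXISTS staging.boat (
    name        text,
    model       text,
    description text,
    embedding   vector(1536)   -- one embedding per description
);

-- Optional: approximate nearest-neighbour index for cosine distance (<=>).
CREATE INDEX IF NOT EXISTS boat_embedding_hnsw
    ON staging.boat USING hnsw (embedding vector_cosine_ops);
```

      The operator class has to match the operator used in queries, so vector_cosine_ops pairs with <=>. Such an index typically helps queries of the form ORDER BY distance LIMIT n; a query that only filters on distance may still scan the whole table.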

      You want to find listings that are semantically similar to the phrase “blue water capable”, not just as a literal keyword match, but in meaning.

      Here’s the SQL:

      -- Return boats whose description is within cosine distance 0.5 of the search phrase
      SELECT name, model, description
      FROM staging.boat
      WHERE embedding <=> (SELECT ai.openai_embed('text-embedding-3-small', 'blue water capable')) < 0.5;

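      This query uses openai_embed() to turn your input phrase into a vector, compares it to every row's embedding using cosine distance (<=>), and keeps only the rows whose distance is less than 0.5, that is, rows that are semantically close enough. Note that the filter alone does not order the results; add an ORDER BY on the same distance expression if you want the closest matches first.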
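
      A common variation is to return the distance (and the derived similarity) alongside each row and sort by it. The sketch below is self-contained so that it runs without an embedding API: boat_demo, its three-dimensional vectors, and the literal query vector are all made up, standing in for staging.boat and the openai_embed() call.

```sql
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical stand-in for staging.boat with tiny 3-dimensional "embeddings".
CREATE TEMP TABLE boat_demo (name text, embedding vector(3));
INSERT INTO boat_demo VALUES
    ('near match',      '[0.9,0.1,0]'),
    ('related',         '[0.6,0.6,0]'),
    ('loosely related', '[0.5,0.7,0]'),
    ('unrelated',       '[0,1,0]');

-- q.v plays the role of the embedded search phrase.
WITH q AS (SELECT '[1,0,0]'::vector AS v)
SELECT b.name,
       b.embedding <=> q.v       AS cosine_distance,
       1 - (b.embedding <=> q.v) AS cosine_similarity
FROM boat_demo AS b
CROSS JOIN q
WHERE (b.embedding <=> q.v) < 0.5   -- same threshold as the query above
ORDER BY cosine_distance;           -- closest match first
```

      With these values the first three rows pass the filter, with distances of roughly 0.01, 0.29, and 0.42, and the 'unrelated' row (distance 1) is dropped.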

      The results include boats described as:

      • “truly blue water”
      • “blue water cruise”

      The results also include listings that don’t literally say “blue water capable” but clearly describe boats suitable for long-distance offshore cruising.

      That’s the power of semantic similarity: you retrieve relevant content even when the wording doesn’t match exactly.

      How to interpret cosine distance scores

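      Here’s a practical guide for interpreting the numbers you get from <=>. Treat the ranges as rules of thumb, since useful cut-offs vary with the embedding model and the data: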

      Cosine distance  Interpretation
      0.0 – 0.1        Nearly identical meaning
      0.1 – 0.3        Strong similarity
      0.3 – 0.5        Loose/fuzzy match
      > 0.5            Weak or no similarity

      In this case, we used < 0.5 as a filter, which gives us semantically related entries without being overly strict.

      Want tighter results? Try < 0.2.

      Need broader discovery? Use < 0.7.
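
      If you want these bands to appear in query output, they can be written as a CASE expression. This is one possible encoding: the ranges in the guide share their end points, so the sketch arbitrarily assigns 0.1 and 0.3 to the looser band, and the sample distances are illustrative.

```sql
-- Label a cosine distance with the interpretation bands from the guide.
SELECT d AS cosine_distance,
       CASE
           WHEN d < 0.1  THEN 'nearly identical meaning'
           WHEN d < 0.3  THEN 'strong similarity'
           WHEN d <= 0.5 THEN 'loose/fuzzy match'
           ELSE               'weak or no similarity'
       END AS interpretation
FROM (VALUES (0.055), (0.152), (0.42), (0.8)) AS t(d)
ORDER BY d;
```

      In a real query, d would be replaced by the distance expression, for example embedding <=> the query vector.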

      Getting vector search right: Why similarity vs. distance matters

      Understanding the relationship between similarity and distance is essential for building effective vector search functionality in PostgreSQL with pgvector. Whether you're fine-tuning semantic search results, powering recommendation systems, or integrating AI capabilities, knowing how your queries interpret vector proximity will directly impact the relevance and accuracy of your results.

      In practice this comes down to a few habits: remember that <=> returns a distance, sort ascending so the best matches come first, filter with an upper bound rather than a lower one, and compute 1 - distance when you need a similarity score to show to users.

      Topics: PostgreSQL, PostgreSQL AI, pgvector, Vector search, Semantic Search

      Gary Evans
      Senior Offerings and Center of Excellence Manager
      Gary Evans heads the Center of Excellence team at Fujitsu Software, providing expert services for customers in relation to PostgreSQL and Fujitsu Enterprise Postgres.
      He previously worked at IBM, at Cable and Wireless in London, and at the Inland Revenue Department of New Zealand before joining Fujitsu. With over 15 years’ experience in database technology, Gary appreciates the value of data and how to make it accessible across your organization.
      Gary loves working with organizations to create great outcomes through tailored data services and software.