Similarity isn’t distance — Understanding the difference
Learn the key differences between similarity and distance in PostgreSQL vector search and how to apply them effectively in your queries.
As more developers integrate vector search into their applications for semantic search, recommendations, and AI-driven features, one foundational concept often causes confusion:
Similarity and distance are not the same thing, but they are directly related.
Treating them as interchangeable can lead to inaccurate results and suboptimal queries. If you're using PostgreSQL with the pgvector extension, understanding this distinction is key to writing effective queries and interpreting your results. In this article, we’ll break it down using clear visuals, a real-world SQL example, and a practical mental model for how cosine distance works in vector space, so you can design more effective, reliable vector-powered features.
Similarity vs distance: Two sides of the same coin

When we say two things are similar, we intuitively mean they're close in meaning. But in vector search, similarity isn't a vague judgment; it’s computed as a numeric score based on vector math.
The trick is that PostgreSQL’s vector search doesn't return a similarity score directly. Instead, it returns a distance, which works the opposite way:
- Similarity: Higher value = more alike
- Distance: Lower value = more alike
This distinction matters because when we sort rows in a query using the <=> operator in pgvector, we are sorting by cosine distance — a specific mathematical way to measure how far apart two vectors are based on the angle between them.
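To make that concrete, here is a minimal sketch of a nearest-neighbor query. The items table and its 3-dimensional embedding column are hypothetical, purely for illustration; the real example later in this post uses actual OpenAI embeddings:

-- Hypothetical table for illustration: items(id int, embedding vector(3)).
-- <=> returns cosine distance, so ascending order puts the
-- most similar rows (smallest distance, smallest angle) first.
SELECT id
FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;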
A visual example: Cosine distance and vector angles
Let’s say we embed the following sailing-related phrases into vector space:
- “solid built cruiser”
- “blue water cruiser”
- “truly blue water”
These phrases all mean something similar, but they differ in nuance. When embedded using a language model like OpenAI's text-embedding-3-small, their meanings are transformed into high-dimensional vectors. We can then visualize those vectors as arrows pointing in space.
In the diagram, each arrow represents a phrase. The smaller the angle between a phrase’s vector and the vector for our search phrase (“blue water capable”), the more similar the two are in meaning. This angle is where cosine similarity comes in.
Phrase | Angle | Cosine similarity | Cosine distance |
solid built cruiser | 19° | 0.945 | 0.055 |
truly blue water | 24° | 0.913 | 0.087 |
blue water cruiser | 32° | 0.848 | 0.152 |
The cosine similarity is calculated as the cosine of the angle between the vectors. Cosine distance, the value pgvector actually uses, is just:
cosine_distance = 1 - cosine_similarity
Smaller angle → higher similarity → lower distance
Larger angle → lower similarity → higher distance
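In SQL, that conversion is a one-liner: subtract the distance from 1. A minimal sketch, again using the hypothetical items table from above:

-- 1 - cosine distance recovers cosine similarity.
SELECT id,
       1 - (embedding <=> '[0.1, 0.2, 0.3]') AS cosine_similarity
FROM items
ORDER BY cosine_similarity DESC  -- highest similarity first
LIMIT 5;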
Real-world vector search in PostgreSQL
Now let’s look at how this plays out in a real query. Assume you have a table called boat that stores boat details along with vector embeddings of their descriptions.
name | description | embedding |
2001 unbekannt Hercules 38 DS | 2014; 1. Refit: Hull and deck paint... | [-0.052859765, 0.040222537, -0.043923237, 0.031689756, -0.084... |
1984 Feltz Skorpion 11 MS | The Feltz Skorpion is a tough, blu... | [-0.0015577168, 0.034499902, -0.0038531697, 0.04258901, 0.012... |
1996 Island Packet 37 | The Island Packet 37 offers the sta... | [-0.026485495, 0.011889433, -0.015693817, 0.060963105, 0.015... |
2023 Jeanneau Sun Odyssey 380 | The Sun Odyssey 380 won Best Monoh... | [-0.02277462, 0.03360344, -0.024761667, -0.000951637, 0.02930... |
2016 Lagoon 380 | Make: Lagoon Model: Lagoon 380 Yea... | [-0.036498778, 0.021291345, -0.013093531, -0.000648224, 0.00... |
2022 Prout 38 | privately used catamaran from 2001... | [-0.021242777, 0.045187734, 0.009782045, 0.025636568, 0.003... |
1982 Chassiron GT 38 | A fine example of this much sort a... | [-0.065575816, 0.045078347, -0.005168701, 0.0012237568, -0.01... |
1984 Hallberg-Rassy 38 | A cherished yacht, which is offer... | [0.030524747, 0.0372221, 0.018082908, 0.01155064, -0.01354... |
You want to find listings that are semantically similar to the phrase “blue water capable”: not a literal keyword match, but a match in meaning.
Here’s the SQL:
SELECT name, description
FROM staging.boat
WHERE embedding <=> (SELECT ai.openai_embed('text-embedding-3-small', 'blue water capable')) < 0.5;
This query uses ai.openai_embed() to turn your input phrase into a vector, compares it to every row using cosine distance (<=>), and then filters for rows where the distance is less than 0.5, meaning they are semantically close enough.
The results include boats described as:
- “truly blue water”
- “blue water cruiser”
and listings that don’t literally say “blue water capable,” but clearly describe boats suitable for long-distance offshore cruising.
That’s the power of semantic similarity: you retrieve relevant content even when the wording doesn’t match exactly.
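If you want to see how close each match actually is, you can return the distance alongside each row and sort by it. Here's a variation of the query above; the scores it returns lead directly into the next section:

SELECT name,
       description,
       embedding <=> (SELECT ai.openai_embed('text-embedding-3-small', 'blue water capable')) AS cosine_distance
FROM staging.boat
ORDER BY cosine_distance  -- smallest distance = best match, shown first
LIMIT 10;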
How to interpret cosine distance scores
Here’s a practical guide for interpreting the numbers you get from <=>:
Cosine distance | Interpretation |
0.0 – 0.1 | Nearly identical meaning |
0.1 – 0.3 | Strong similarity |
0.3 – 0.5 | Loose/fuzzy match |
> 0.5 | Weak or no similarity |
In this case, we used < 0.5 as a filter, which gives us semantically related entries without being overly strict.
Want tighter results? Try < 0.2.
Need broader discovery? Use < 0.7.
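When the query embedding appears in both the filter and the ordering, a CTE keeps things readable. Here's a sketch of a tighter search using the < 0.2 threshold; only that constant needs to change for broader discovery:

WITH query AS (
  SELECT ai.openai_embed('text-embedding-3-small', 'blue water capable') AS q
)
SELECT b.name, b.description
FROM staging.boat b, query
WHERE b.embedding <=> query.q < 0.2  -- tighter threshold: strong matches only
ORDER BY b.embedding <=> query.q;    -- best matches first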
Getting vector search right: Why similarity vs. distance matters
Understanding the relationship between similarity and distance is essential for building effective vector search functionality in PostgreSQL with pgvector. Whether you're fine-tuning semantic search results, powering recommendation systems, or integrating AI capabilities, knowing how your queries interpret vector proximity will directly impact the relevance and accuracy of your results.
By distinguishing between distance-based and similarity-based approaches, you can unlock the full potential of vector search in your applications with confidence and precision.