Understanding similarity vs distance in PostgreSQL vector search

While similarity and distance are mathematically related, treating them as interchangeable can lead to inaccurate results and suboptimal queries. In this post, we’ll demystify this crucial distinction using practical SQL examples, so you can design more effective, reliable vector-powered features with PostgreSQL.


Similarity isn’t distance — Understanding the difference

As more developers integrate vector search into their applications, for semantic search, recommendations, and AI-driven features, one foundational concept often causes confusion:

Similarity and distance are not the same thing, but they’re directly related.

If you're using PostgreSQL with the pgvector extension, understanding this distinction is key to writing effective queries and interpreting your results. In this article, we’ll break it down using clear visuals, a real-world SQL example, and a practical mental model for how cosine distance works in vector space.

Similarity vs distance: Two sides of the same coin

When we say two things are similar, we intuitively mean they’re close in meaning. But in vector search, similarity isn’t a vague judgment. It’s computed as a numeric score based on vector math.

The trick is that PostgreSQL’s vector search doesn’t return a similarity score directly. Instead, it returns a distance, which runs in the opposite direction:

Similarity: Higher value = more alike
Distance: Lower value = more alike

This distinction matters because when we sort rows in a query using pgvector’s <=> operator, we are sorting by cosine distance, a measure of how far apart two vectors are based on the angle between them. The closest matches therefore come first when sorting in ascending order.
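To make the opposite directions concrete, here is a minimal Python sketch (not taken from the original post) that ranks three phrases both ways. The similarity scores are the ones this article lists for the sailing phrases, and the variable names are illustrative only.

```python
# Cosine similarity of each phrase to the search phrase (values from this article).
similarity = {
    "blue water cruiser": 0.848,
    "solid built cruiser": 0.945,
    "truly blue water": 0.913,
}

# Cosine distance is the complement of cosine similarity.
distance = {phrase: 1 - score for phrase, score in similarity.items()}

# Best match first: highest similarity, or equivalently lowest distance.
by_similarity = sorted(similarity, key=similarity.get, reverse=True)
by_distance = sorted(distance, key=distance.get)

print(by_similarity == by_distance)
print(by_distance)
```

Both orderings put the same phrase first. Only the sort direction changes, which is why a ranking by distance should be ascending.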

A visual example: Cosine distance and vector angles

Let’s say we embed the following sailing-related phrases into vector space:

“solid built cruiser”
“blue water cruiser”
“truly blue water”

These phrases all mean something similar, but they differ in nuance. When embedded using a language model like OpenAI's text-embedding-3-small, their meanings are transformed into high-dimensional vectors. We can then visualize those vectors as arrows pointing in space.

In the diagram, each arrow represents a phrase. The smaller the angle between a phrase’s vector and the vector for our search phrase (“blue water capable”), the more similar the two are in meaning. This angle is where cosine similarity comes in.

Phrase	Angle	Cosine similarity	Cosine distance
solid built cruiser	19°	0.945	0.055
truly blue water	24°	0.913	0.087
blue water cruiser	32°	0.848	0.152

Cosine similarity is the cosine of the angle between two vectors. Cosine distance, which is what pgvector’s <=> operator returns, is simply:

cosine_distance = 1 - cosine_similarity

Smaller angle → higher similarity → lower distance

Larger angle → lower similarity → higher distance
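The relationship between angle, similarity, and distance can be checked with a few lines of Python. This is a sketch under stated assumptions rather than how pgvector computes anything internally: unit_vector is a hypothetical helper that builds a two-dimensional vector at a given angle, whereas real embeddings have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """Cosine distance as defined above: 1 minus the cosine similarity."""
    return 1.0 - cosine_similarity(a, b)

def unit_vector(angle_degrees):
    """Hypothetical helper: a 2-D unit vector at the given angle from the x-axis."""
    radians = math.radians(angle_degrees)
    return [math.cos(radians), math.sin(radians)]

# Stand-in for the search phrase, pointing along the x-axis.
query = unit_vector(0)

# Angles from the table above.
for angle in (19, 24, 32):
    v = unit_vector(angle)
    print(angle, round(cosine_similarity(query, v), 3), round(cosine_distance(query, v), 3))
```

The printed values may differ from the table in the third decimal place, presumably because the table’s angles are rounded to whole degrees. Note also that vector length does not matter: only the direction of the two vectors affects the result.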

Real-world vector search in PostgreSQL

Now let’s look at how this plays out in a real query. Assume you have a table called boat, in a schema called staging, that stores boat details along with vector embeddings of their descriptions.

name	description	embedding
2001 unbekannt Hercules 38 DS	2014; 1. Refit: Hull and deck paint...	[-0.052859765, 0.040222537, -0.043923237, 0.031689756, -0.084...
1984 Feltz Skorpion 11 MS	The Feltz Skorpion is a tough, blu...	[-0.0015577168, 0.034499902, -0.0038531697, 0.04258901, 0.012...
1996 Island Packet 37	The Island Packet 37 offers the sta...	[-0.026485495, 0.011889433, -0.015693817, 0.060963105, 0.015...
2023 Jeanneau Sun Odyssey 380	The Sun Odyssey 380 won Best Monoh...	[-0.02277462, 0.03360344, -0.024761667, -0.000951637, 0.02930...
2016 Lagoon 380	Make: Lagoon Model: Lagoon 380 Yea...	[-0.036498778, 0.021291345, -0.013093531, -0.000648224, 0.00...
2022 Prout 38	privately used catamaran from 2001...	[-0.021242777, 0.045187734, 0.009782045, 0.025636568, 0.003...
1982 Chassiron GT 38	A fine example of this much sort a...	[-0.065575816, 0.045078347, -0.005168701, 0.0012237568, -0.01...
1984 Hallberg-Rassy 38	A cherished yacht, which is offer...	[0.030524747, 0.0372221, 0.018082908, 0.01155064, -0.01354...

You want to find listings that are semantically similar to the phrase “blue water capable”, not just as a literal keyword match, but in meaning.

Here’s the SQL:

-- Keep rows whose cosine distance to the embedded search phrase is below 0.5
SELECT name, model, description
FROM staging.boat
WHERE embedding <=> (SELECT ai.openai_embed('text-embedding-3-small', 'blue water capable')) < 0.5;

This query uses openai_embed() to turn your input phrase into a vector, compares that vector to the embedding in every row using cosine distance (<=>), and keeps only the rows where the distance is less than 0.5, that is, rows that are semantically close enough.
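The filtering step can be imitated in plain Python to see what the WHERE clause is doing. The sketch below is an in-memory illustration only: the rows, the listing names, and the three-dimensional vectors are hypothetical stand-ins, and filter_by_distance is a made-up helper, not part of pgvector.

```python
import math

def cosine_distance(a, b):
    """1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical stand-ins for rows of the boat table: (name, embedding).
rows = [
    ("listing A", [0.9, 0.1, 0.0]),
    ("listing B", [0.5, 0.5, 0.0]),
    ("listing C", [0.0, 1.0, 0.0]),
    ("listing D", [0.2, 0.9, 0.1]),
]

# Hypothetical stand-in for the embedded search phrase.
query_embedding = [1.0, 0.0, 0.0]

def filter_by_distance(rows, query_embedding, max_distance=0.5):
    """Keep the names of rows whose cosine distance to the query is below max_distance."""
    return [
        name
        for name, embedding in rows
        if cosine_distance(embedding, query_embedding) < max_distance
    ]

matches = filter_by_distance(rows, query_embedding)
print(matches)
```

Like the SQL query, this keeps the matching rows in their original order. It filters by distance but does not rank by it.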

The results include boats described as:

“truly blue water”
“blue water cruiser”

and listings that don’t literally say “blue water capable,” but clearly describe boats suitable for long-distance offshore cruising.

That’s the power of semantic similarity: you retrieve relevant content even when the wording doesn’t match exactly.

How to interpret cosine distance scores

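Here’s a rule-of-thumb guide for interpreting the numbers you get from <=>, which can range from 0 (vectors pointing the same way) to 2 (vectors pointing in opposite directions):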

Cosine distance	Interpretation
0.0 – 0.1	Nearly identical meaning
0.1 – 0.3	Strong similarity
0.3 – 0.5	Loose/fuzzy match
> 0.5	Weak or no similarity

In this case, we used < 0.5 as a filter, which gives us semantically related entries without being overly strict.

Want tighter results? Try < 0.2.

Need broader discovery? Use < 0.7.
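The bands and thresholds above can be expressed as a small Python sketch. Two assumptions are worth flagging: the table’s ranges share their endpoints, so assigning each boundary value to the tighter band is a choice made here, and the last two sample distances are invented for illustration (the first three come from the phrase table earlier in this article).

```python
def interpret(distance):
    """Map a cosine distance to the rule-of-thumb bands above.

    Boundary values (0.1, 0.3, 0.5) go to the tighter band; that is an assumption.
    """
    if distance <= 0.1:
        return "Nearly identical meaning"
    if distance <= 0.3:
        return "Strong similarity"
    if distance <= 0.5:
        return "Loose/fuzzy match"
    return "Weak or no similarity"

def within(distances, threshold):
    """Keep the distances strictly below the threshold, as the < filter in the query does."""
    return [d for d in distances if d < threshold]

# First three values are from this article; the last two are hypothetical.
distances = [0.055, 0.087, 0.152, 0.42, 0.63]

for d in distances:
    print(d, interpret(d))

# How many results survive a tight, default, and broad threshold.
print(len(within(distances, 0.2)), len(within(distances, 0.5)), len(within(distances, 0.7)))
```

Raising the threshold only ever adds results, so it is usually easiest to start strict and loosen the filter until the extra matches stop being useful.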

Getting vector search right: Why similarity vs. distance matters

Understanding the relationship between similarity and distance is essential for building effective vector search functionality in PostgreSQL with pgvector. Whether you're fine-tuning semantic search results, powering recommendation systems, or integrating AI capabilities, knowing how your queries interpret vector proximity will directly impact the relevance and accuracy of your results.

By keeping in mind that pgvector reports distance rather than similarity, so that lower values mean closer matches, you can choose thresholds and interpret your results with confidence.

Fujitsu PostgreSQL blog

Gary Evans | June 18, 2025
