Vector search gets expensive fast.
Without an index, every query has to compare your search embedding against every row in the table. That works for demos… but not at scale.
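To make the cost concrete, here is a minimal sketch of what a brute-force scan does. It is plain Python for illustration only, not YugabyteDB or pgvector code; the `rows` data and the function names are hypothetical. Every query computes one distance per stored row, so the work grows linearly with table size.

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def brute_force_search(rows, query, k):
    """Score the query against EVERY row, then keep the k closest."""
    scored = [(cosine_distance(vec, query), row_id) for row_id, vec in rows]
    scored.sort()
    return [row_id for _, row_id in scored[:k]]

# Hypothetical toy data: (row id, embedding)
rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [0.9, 0.1]),
]

print(brute_force_search(rows, [1.0, 0.0], k=2))  # prints ['a', 'c']
```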
YugabyteDB supports HNSW (Hierarchical Navigable Small World) indexes via pgvector, allowing the database to perform approximate nearest neighbor (ANN) search instead of brute-force scanning.
In YugabyteDB, HNSW search is fast not just because of the algorithm… but also because the index is distributed alongside the data.
⚡ How HNSW Makes Search Fast
HNSW builds a multi-layer graph of your vectors:
- Upper layers = coarse “shortcuts” across the dataset
- Lower layers = detailed local neighborhoods
When you search, the algorithm:
1. Starts at the top layer
2. Quickly jumps toward the closest region
3. Refines the search as it moves downward
This avoids scanning every row and instead navigates directly to likely matches.
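The top-down descent can be sketched with a toy example. This is a deliberately simplified illustration, not the ybhnsw implementation: the graph is hand-built over ten points on a line, the names `points` and `layers` are hypothetical, and each layer uses a plain greedy walk, whereas real HNSW keeps a list of candidates at the bottom layer.

```python
import math

def greedy_layer_search(points, graph, entry, query):
    """Within one layer, keep hopping to the neighbor closest to the query."""
    current = entry
    while True:
        best = min(graph[current],
                   key=lambda n: math.dist(points[n], query),
                   default=current)
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current  # no neighbor is closer: local best for this layer
        current = best

def layered_search(points, layers, entry, query):
    """Start at the top layer; each layer's result seeds the layer below."""
    current = entry
    for graph in layers:  # ordered top (sparse) to bottom (dense)
        current = greedy_layer_search(points, graph, current, query)
    return current

# Hypothetical data: ten 1-D points, 0.0 through 9.0
points = {i: (float(i),) for i in range(10)}
layers = [
    {0: [4], 4: [0, 8], 8: [4]},                               # coarse shortcuts
    {0: [2], 2: [0, 4], 4: [2, 6], 6: [4, 8], 8: [6]},         # medium hops
    {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)},  # local
]

print(layered_search(points, layers, entry=0, query=(6.8,)))  # prints 7
```

Starting from node 0, the walk visits 0 → 4 → 8 on the top layer, drops to 6 on the middle layer, and lands on 7 at the bottom, without ever scoring most of the other points.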
🧠 The Hidden Advantage in YugabyteDB
HNSW is already fast. But YugabyteDB adds a huge architectural advantage:
In YugabyteDB, the HNSW index is not external and not centralized.
It is distributed and colocated with your table data at the tablet level.
This means:
- Vector search runs inside the distributed database
- Each tablet handles its own portion of the vector index
- No extra system or service is needed
- Lookups stay data-local
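As a conceptual sketch of "each tablet handles its own portion", the snippet below has every tablet return its local top-k and then merges those short lists. The merge step, the tablet names, and the data are assumptions made for illustration; this is not YugabyteDB's actual query execution code, and an exact scan stands in for the per-tablet HNSW search.

```python
import heapq

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def local_top_k(tablet_rows, query, k):
    """Stand-in for a per-tablet index search: looks only at this tablet's rows."""
    scored = ((squared_distance(vec, query), row_id) for row_id, vec in tablet_rows)
    return heapq.nsmallest(k, scored)

def distributed_top_k(tablets, query, k):
    """Ask every tablet for its local top-k, then merge the small partial results."""
    partials = [local_top_k(rows, query, k) for rows in tablets.values()]
    merged = heapq.nsmallest(k, (hit for part in partials for hit in part))
    return [row_id for _, row_id in merged]

# Hypothetical layout: two tablets, each holding its own rows and vectors
tablets = {
    "tablet-1": [("a", (0.0, 0.0)), ("b", (5.0, 5.0))],
    "tablet-2": [("c", (1.0, 1.0)), ("d", (9.0, 9.0))],
}

print(distributed_top_k(tablets, (0.9, 0.9), k=2))  # prints ['c', 'a']
```

Only k candidates per tablet cross the network in this model, which is the practical meaning of keeping lookups data-local.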
🔍 Proof: Table and HNSW Index Share the Same Tablets
Let’s look at a real example.
Table Definition
yugabyte=> \d products
                              Table "public.products"
        Column         |     Type      | Collation | Nullable |      Default
-----------------------+---------------+-----------+----------+-------------------
 id                    | uuid          |           | not null | gen_random_uuid()
 name                  | text          |           | not null |
 description           | text          |           |          |
 price                 | numeric(10,2) |           |          |
 category              | text          |           |          |
 description_embedding | vector(3072)  |           |          |
Indexes:
    "products_pkey" PRIMARY KEY, lsm (id HASH)
    "products_description_embedding_idx" ybhnsw (description_embedding vector_cosine_ops)
Tablet Placement: Base Table
yugabyte=> SELECT tablet_id, replica_role, cloud, region, zone FROM yb_tablet_placement_details('public','products');
            tablet_id             | replica_role | cloud |   region    |     zone
----------------------------------+--------------+-------+-------------+---------------
 dad378edea8642a2ba17d0c505a11db3 | LEADER       | gcp   | us-east4    | us-east4-a
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-west2    | us-west2-a
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-central1 | us-central1-b
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-east1    | us-east1-c
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-west4    | us-west4-a
(5 rows)
Tablet Placement: HNSW Index
yugabyte=> SELECT tablet_id, replica_role, cloud, region, zone FROM yb_tablet_placement_details('public','products_description_embedding_idx');
            tablet_id             | replica_role | cloud |   region    |     zone
----------------------------------+--------------+-------+-------------+---------------
 dad378edea8642a2ba17d0c505a11db3 | LEADER       | gcp   | us-east4    | us-east4-a
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-west2    | us-west2-a
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-central1 | us-central1-b
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-east1    | us-east1-c
 dad378edea8642a2ba17d0c505a11db3 | FOLLOWER     | gcp   | us-west4    | us-west4-a
(5 rows)
The yb_tablet_placement_details() function used here is covered in detail in this YugabyteDB Tip:
Exploring Tablet Distribution and Leader Placement in YugabyteDB
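For tables with many tablets, eyeballing two result sets gets tedious. The sketch below compares them mechanically. It assumes you have pasted the data rows of each query's output into a string; the helper name `parse_rows` is hypothetical, and the rows are copied from the two outputs above.

```python
def parse_rows(psql_output):
    """Turn pipe-separated psql data rows into a set of tuples."""
    rows = set()
    for line in psql_output.strip().splitlines():
        rows.add(tuple(field.strip() for field in line.split("|")))
    return rows

# Data rows from the base table query
table_output = """
dad378edea8642a2ba17d0c505a11db3 | LEADER | gcp | us-east4 | us-east4-a
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-west2 | us-west2-a
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-central1 | us-central1-b
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-east1 | us-east1-c
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-west4 | us-west4-a
"""

# Data rows from the HNSW index query
index_output = """
dad378edea8642a2ba17d0c505a11db3 | LEADER | gcp | us-east4 | us-east4-a
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-west2 | us-west2-a
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-central1 | us-central1-b
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-east1 | us-east1-c
dad378edea8642a2ba17d0c505a11db3 | FOLLOWER | gcp | us-west4 | us-west4-a
"""

table_placement = parse_rows(table_output)
index_placement = parse_rows(index_output)

# Same tablet IDs, same replica roles, same zones
print(table_placement == index_placement)  # prints True
```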
🚀 Why This Is So Fast
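Here’s what’s actually happening under the hood: each tablet searches its own portion of the HNSW index right next to the rows it stores, so lookups stay data-local and no external vector service is involved.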
🧩 Putting It Together
- HNSW reduces the search space
- YugabyteDB distributes the work across tablets
- The example above shows that the table and its HNSW index share the same tablet, with identical replica placement
👉 The result: fast, scalable vector search that stays data-local
Have Fun!
