🔍 The Problem: Search Is Still… Dumb
Most application search still relies on:
- ❌ `LIKE '%term%'` exact keyword matching
- ❌ brittle full-text search
Which means:
- If users don't type the exact right words… they get nothing.
At the same time, modern "AI search" architectures often swing too far the other way:
- ❌ OLTP database
- ❌ Separate vector database
- ❌ Sync pipelines
- ❌ Eventual consistency
👉 More power… but a lot more complexity.
💡 A Simpler Approach
What if you could:
- ✅ Store relational data and embeddings together
- ✅ Run SQL + semantic search in one query
- ✅ Avoid sync pipelines entirely
With YugabyteDB, you can.
YugabyteDB lets you store vector embeddings alongside relational data and query both using standard SQL… no separate vector database required.
⚙️ The Architecture (Simple + Powerful)
This demo uses:
- ✅ YugabyteDB – distributed SQL + vector storage
- ✅ Google AI Studio – embedding generation
- ✅ pgvector (`vector`) – stores embeddings
- ✅ `ybhnsw` index – fast similarity search
👉 Result: Intent-based search powered entirely through SQL
🧱 Step 1: Create the Extension + Table + Index
We store both structured data and embeddings in the same row.
```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    description TEXT,
    price NUMERIC(10,2),
    category TEXT,
    description_embedding vector(3072)
);

CREATE INDEX ON products
USING ybhnsw (description_embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);
```
What this does:
- ✅ Stores each product's meaning as a vector
- ✅ Uses a 3072-dimension embedding (from the model)
- ✅ Builds a distributed HNSW index for fast lookup
👉 The important part: the embedding lives with the row it represents.
The `CREATE EXTENSION vector` command installs pgvector, adding a native `vector` data type to the YSQL layer.
What this enables:
- ✅ Store embeddings in a dedicated column type (e.g., `vector(3072)`)
- ✅ Compare vectors using operators like `<=>` (cosine distance)
- ✅ Index vectors using `ybhnsw` for fast nearest-neighbor search
How YugabyteDB handles it:
- ✅ Vectors live alongside relational data in the same table
- ✅ Similarity search runs using standard SQL
- ✅ The `ybhnsw` index is distributed across nodes for scale
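Under the hood, `<=>` is just `1 - cosine similarity`. Here is a minimal pure-Python sketch of the math for illustration (pgvector computes this natively inside the database):

```python
import math

def cosine_distance(a, b):
    """Mimic pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1, 0], [2, 0]))  # 0.0 -> same direction, same meaning
print(cosine_distance([1, 0], [0, 1]))  # 1.0 -> orthogonal, unrelated meaning
```

Magnitude does not matter, only direction: that is why cosine distance works well for comparing embeddings of texts with different lengths.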
🧠 Step 2: Generate Embeddings
This is where the "AI" comes in.
To generate embeddings:
1. Go to Google AI Studio
2. Click Get API Key → Create key
3. Copy the key (starts with `AIza...`)
In this demo, we use it to:
- ✅ Convert text (product descriptions and user queries) into vector embeddings
- ✅ Ensure both data and queries are represented in the same semantic space
- ✅ Enable similarity search inside YugabyteDB using those vectors
👉 The free tier is sufficient for this demo.
💻 Step 3: Configure Your Environment
Create a .env file:
```
AI_STUDIO_KEY=your_api_key
YB_HOST=your_cluster_host
YB_PORT=5433
YB_USER=admin
YB_PASSWORD=your_password
YB_DB=yugabyte
```
📦 Step 4: Generate Data + Embeddings
Install dependencies
requirements.txt:
```
google-genai==0.3.0
psycopg2-binary==2.9.9
python-dotenv==1.0.1
```
Run:
```
pip install -r requirements.txt
```
Seed data:
seed_data.py:
```python
import os
import random

import psycopg2
from google import genai
from dotenv import load_dotenv

load_dotenv()

if not os.getenv("YB_HOST"):
    print("❌ ERROR: Could not find YB_HOST. Please check your .env file!")
    exit(1)

client = genai.Client(api_key=os.getenv("AI_STUDIO_KEY"))

YB_CONFIG = {
    "host": os.getenv("YB_HOST"),
    "port": os.getenv("YB_PORT", "5433"),
    "user": os.getenv("YB_USER", "yugabyte"),
    "password": os.getenv("YB_PASSWORD"),
    "database": os.getenv("YB_DB", "yugabyte")
}

categories = ["Electronics", "Outdoor", "Home Office", "Kitchen", "Fitness"]
adjectives = ["Professional", "Ultra-durable", "Eco-friendly", "Compact", "Luxury"]

products_base = [
    ("Noise-cancelling Headphones", "Perfect for focusing in loud offices or airplanes. Advanced active noise cancellation and spatial audio for a focused work environment."),
    ("Waterproof Tent", "Reliable shelter for heavy rain and stormy camping trips. A rugged, weatherproof shelter designed for high-altitude trekking and extreme cold."),
    ("Ergonomic Chair", "Supports your back during long hours of remote work. Adjustable lumbar support and breathable mesh designed for long-duration remote work."),
    ("Cast Iron Skillet", "Retains heat perfectly for gourmet cooking at home. A heavy-duty skillet that offers superior heat retention for professional searing."),
    ("Yoga Mat", "Extra thick cushioning for joint support during heavy exercise. Non-slip texture and high-density cushioning to protect joints during intense floor workouts.")
]

def generate_bulk_data(count=50):
    data = []
    for i in range(count):
        base_name, base_desc = random.choice(products_base)
        adj = random.choice(adjectives)
        cat = random.choice(categories)
        name = f"{adj} {base_name} Gen-{i}"
        price = round(random.uniform(15.0, 250.0), 2)
        full_desc = f"{base_desc} This {cat} item is {adj.lower()} and highly rated."
        data.append({"name": name, "desc": full_desc, "price": price, "cat": cat})
    return data

def seed_database():
    products = generate_bulk_data(50)
    conn = None
    cur = None
    try:
        conn = psycopg2.connect(**YB_CONFIG, sslmode='disable')
        cur = conn.cursor()
        print(f"Connected. Starting ingestion of {len(products)} items...")
        for p in products:
            # Let the model output its native 3,072 dimensions
            res = client.models.embed_content(
                model="gemini-embedding-001",
                contents=p['desc']
            )
            vector = res.embeddings[0].values
            # Pass the embedding as a pgvector text literal ('[...]') and cast it,
            # matching how the search script binds vectors later on
            cur.execute("""
                INSERT INTO products (id, name, description, price, category, description_embedding)
                VALUES (gen_random_uuid(), %s, %s, %s, %s, %s::vector)
            """, (p['name'], p['desc'], p['price'], p['cat'], str(vector)))
            print(f"Inserted: {p['name']}")
        conn.commit()
        print("\n--- Seeding Complete! ---")
    except Exception as e:
        print(f"\n❌ Database Error: {e}")
    finally:
        if cur: cur.close()
        if conn: conn.close()

if __name__ == "__main__":
    seed_database()
```
What this script does
- ✅ Generates ~50 diverse products
- ✅ Builds descriptive text per product
- ✅ Calls Google AI Studio for embeddings
- ✅ Inserts both text + vector into YugabyteDB
To run:
```
python seed_data.py
```
👉 You now have a semantic-ready dataset.
Each product description is converted into a dense vector using a transformer-based model. Similar meanings are placed close together in vector space.
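As a toy illustration of "close together in vector space", here are hypothetical 3-dimensional embeddings (the names and values are invented; real models like the one above use 3072 dimensions):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, as pgvector's <=> computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Invented 3-D vectors standing in for real embeddings
embeddings = {
    "waterproof tent":          [0.9, 0.1, 0.0],
    "rain shelter for camping": [0.8, 0.2, 0.1],
    "cast iron skillet":        [0.0, 0.1, 0.9],
}

query = embeddings["rain shelter for camping"]
for name, vec in sorted(embeddings.items(), key=lambda kv: cosine_distance(kv[1], query)):
    print(f"{name}: {cosine_distance(vec, query):.3f}")
# The tent lands far closer to the rain-shelter query than the skillet does.
```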
🧠 Step 5: Understanding the Query Vector (Conceptual)
Before we run our search script, it is important to understand what happens under the hood when a user types a search query.
We don’t search using the raw text. Instead, we have to convert the user’s search intent into a vector using the exact same model we used to seed the database.
Inside our application code, it looks like this:
```python
# Note: you do not need to run this snippet standalone.
# The same logic appears in the final search script in the next step.
res = client.models.embed_content(
    model="gemini-embedding-001",  # must match the model used to seed the data
    contents="something for a rainy day at the beach"
)
query_vector = res.embeddings[0].values
```
What this means:
- ✅ The text query is converted into a vector residing in the exact same semantic space as your product data.
- ✅ The database matching will now be based on mathematical distance (intent), rather than text matching.
🔍 Step 6: Run the Semantic Search
```sql
SELECT name, price, category
FROM products
ORDER BY description_embedding <=> '[query_vector]'
LIMIT 5;
```
What is happening here?
- ✅ `<=>` computes cosine distance
- ✅ Lower distance = more similar meaning
- ✅ Results are ranked by semantic similarity
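Conceptually, that query is a distance sort over rows. Here is a toy brute-force sketch of the same ranking (invented product names, 2-D vectors instead of 3072-D, and no index — the next sections explain how the real lookup avoids scanning every row):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, mirroring pgvector's <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

# Toy stand-ins for rows of the products table: (name, embedding)
rows = [
    ("Tent",       [0.9, 0.1]),
    ("Headphones", [0.1, 0.9]),
    ("Skillet",    [0.5, 0.5]),
]
query_vector = [0.8, 0.2]

# Brute-force equivalent of: ORDER BY description_embedding <=> query LIMIT 2
top = sorted(rows, key=lambda r: cosine_distance(r[1], query_vector))[:2]
print([name for name, _ in top])  # ['Tent', 'Skillet']
```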
🤯 Dumb vs Smart Search
In the script below, we use a simple ILIKE query to represent traditional keyword search. This makes it very easy to demonstrate the limitations of exact-string matching.
However, database purists will rightly point out that a production-grade application wouldn't rely on ILIKE for search. Standard PostgreSQL (and YugabyteDB) offers robust Full-Text Search (FTS) using tsvector and tsquery.
FTS is significantly smarter than ILIKE… it handles tokenization, stop-words, and stemming (knowing that "rain", "rains", and "raining" share the same root).
But here's the catch: even highly optimized Full-Text Search is fundamentally a lexical search. It still depends on the presence of actual words (or their root forms).
Vector search steps in to handle semantic search… the underlying meaning or intent. For the sake of keeping this demo simple and highlighting that contrast, we're sticking with ILIKE.
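To make the lexical-vs-semantic distinction concrete, here is a toy sketch contrasting ILIKE-style substring matching with a crudely stemmed match. The stemmer is deliberately simplistic and invented for this example (real FTS uses Snowball stemmers); the point is that stemming widens matching but still requires shared words:

```python
def ilike_match(description, term):
    """Rough equivalent of: description ILIKE '%term%'."""
    return term.lower() in description.lower()

def crude_stem(word):
    # Toy stemmer for illustration only
    for suffix in ("ing", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word

def fts_match(description, term):
    """Toy FTS: match if the query's stem appears among the word stems."""
    stems = {crude_stem(w) for w in description.lower().split()}
    return crude_stem(term.lower()) in stems

desc = "Reliable shelter for heavy rain and stormy camping trips."
print(ilike_match(desc, "raining"))  # False: substring not present
print(fts_match(desc, "raining"))    # True: stems to "rain"
print(fts_match(desc, "wet"))        # False: no shared word at all, lexical search gives up
```

Only vector search can bridge that last case, because "wet" and "rain" are related by meaning, not spelling.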
This script compares “dumb” syntax matching with “smart” semantic matching.
search_demo.py:
```python
import os

import psycopg2
from google import genai
from dotenv import load_dotenv

load_dotenv()

if not os.getenv("YB_HOST"):
    print("❌ ERROR: Could not find YB_HOST. Please check your .env file!")
    exit(1)

client = genai.Client(api_key=os.getenv("AI_STUDIO_KEY"))

YB_CONFIG = {
    "host": os.getenv("YB_HOST"),
    "port": os.getenv("YB_PORT", "5433"),
    "user": os.getenv("YB_USER", "yugabyte"),
    "password": os.getenv("YB_PASSWORD"),
    "database": os.getenv("YB_DB", "yugabyte")
}

def run_keyword_search(cur, query_text):
    sql = "SELECT name, price FROM products WHERE description ILIKE %s LIMIT 3;"
    cur.execute(sql, (f"%{query_text}%",))
    return cur.fetchall()

def run_ai_search(cur, query_text):
    # Native 3,072-dimension output
    res = client.models.embed_content(
        model="gemini-embedding-001",
        contents=query_text
    )
    vector = res.embeddings[0].values
    sql = """
        SELECT name, price, (description_embedding <=> %s::vector) AS distance
        FROM products
        ORDER BY distance ASC
        LIMIT 3;
    """
    cur.execute(sql, (str(vector),))
    return cur.fetchall()

def main():
    conn = None
    cur = None
    try:
        conn = psycopg2.connect(**YB_CONFIG, sslmode='disable')
        cur = conn.cursor()
        while True:
            print("\n" + "=" * 50)
            user_input = input("Enter search query (or 'exit'): ")
            if user_input.lower() == 'exit':
                break

            print("\n--- [1] KEYWORD SEARCH (Standard SQL) ---")
            kw_results = run_keyword_search(cur, user_input)
            if not kw_results:
                print("❌ No results found for these exact keywords.")
            for row in kw_results:
                print(f"Result: {row[0]} (${row[1]})")

            print("\n--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---")
            ai_results = run_ai_search(cur, user_input)
            for row in ai_results:
                print(f"Result: {row[0]} (${row[1]}) [Score: {row[2]:.4f}]")
    except Exception as e:
        print(f"Error: {e}")
    finally:
        if cur: cur.close()
        if conn: conn.close()

if __name__ == "__main__":
    main()
```
Run:
```
python search_demo.py
```
This compares:
- ✅ Keyword search (`ILIKE`)
- ✅ Vector search (`<=>`)
Here's what happens when users search by intent instead of exact words:
```
Enter search query (or 'exit'): rainy day at the beach

--- [1] KEYWORD SEARCH (Standard SQL) ---
❌ No results found for these exact keywords.

--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---
Result: Compact Waterproof Tent Gen-4 ($183.63) [Score: 0.4423]
Result: Eco-friendly Waterproof Tent Gen-19 ($225.33) [Score: 0.4503]
Result: Luxury Waterproof Tent Gen-20 ($119.80) [Score: 0.4536]

==================================================
Enter search query (or 'exit'): noise suppression flight

--- [1] KEYWORD SEARCH (Standard SQL) ---
❌ No results found for these exact keywords.

--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---
Result: Professional Noise-cancelling Headphones Gen-23 ($162.17) [Score: 0.3874]
Result: Eco-friendly Noise-cancelling Headphones Gen-43 ($212.03) [Score: 0.3928]
Result: Professional Noise-cancelling Headphones Gen-1 ($137.70) [Score: 0.3928]
==================================================
```
Keyword search fails because the exact words aren't present. Vector search succeeds because it compares meaning.
| Query | Keyword Search | Semantic Search |
|---|---|---|
| "rainy day at the beach" | ❌ No results | ✅ Compact Waterproof Tent Gen-4 ($183.63)<br>✅ Eco-friendly Waterproof Tent Gen-19 ($225.33)<br>✅ Luxury Waterproof Tent Gen-20 ($119.80) |
| "noise suppression flight" | ❌ No results | ✅ Professional Noise-cancelling Headphones Gen-3 ($213.34)<br>✅ Eco-friendly Noise-cancelling Headphones Gen-43 ($212.03)<br>✅ Professional Noise-cancelling Headphones Gen-1 ($137.70) |
⚖️ When Keyword Search Does Work
So far, we've shown cases where keyword search fails and semantic search wins.
But let's look at the opposite scenario.
What happens when the exact word exists?
If a user types:
- “rain”
We know our dataset contains descriptions like:
- “Reliable shelter for heavy rain and stormy camping trips…”
So now standard SQL should work perfectly.
Sample demo output:
```
Enter search query (or 'exit'): camping

--- [1] KEYWORD SEARCH (Standard SQL) ---
Result: Professional Waterproof Tent Gen-40 ($183.83)
Result: Compact Waterproof Tent Gen-4 ($183.63)
Result: Eco-friendly Waterproof Tent Gen-35 ($52.00)

--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---
Result: Compact Waterproof Tent Gen-4 ($183.63) [Score: 0.3790]
Result: Compact Waterproof Tent Gen-25 ($19.81) [Score: 0.3844]
Result: Eco-friendly Waterproof Tent Gen-19 ($225.33) [Score: 0.3878]
```
🤔 What's Happening Here?
- ✅ Keyword Search (`ILIKE`)
  - Matches the literal text "camping" because it physically exists in the product description ("stormy camping trips").
  - Because our standard SQL query doesn't have an `ORDER BY` clause, the database simply returns the first three matching rows it happens to find on disk.
- ✅ Semantic Search (Vector Distance)
  - Also returns the tents, but not just because the word is there. It understands that "camping" is the core semantic use-case for a tent.
  - It retrieves the results and strictly ranks them using mathematical meaning similarity.
👉 Notice the low distance scores (~0.38)
While it’s not a near-zero perfect match (since a single word “camping” is much shorter than a full product description), a score of 0.38 indicates a highly relevant semantic relationship in the vector space.
Because our script generated 50 random products, there are many "Waterproof Tents" in the database. Keyword search simply grabbed the first few results containing the literal word.
AI search, however, evaluated the meaning behind the query and returned the closest semantic matches.
Both methods successfully found the right category of products… but they did it using entirely different mechanisms.
AI search does not replace SQL… it enhances it. When exact matches exist, SQL works great. When users search by intent or broader concepts, vector search fills the gap.
⚡ How Lookups Stay Fast (HNSW)
At first glance, semantic search sounds expensive:
- "Do we compare my query vector to every row?"
👉 Thankfully… no.
YugabyteDB uses a graph-based index called HNSW (Hierarchical Navigable Small World) to make this fast.
What HNSW Actually Does
Instead of storing vectors in a simple list or tree:
- ✅ Each vector becomes a node in a graph
- ✅ Nodes are connected to their nearest neighbors
- ✅ The graph is organized into multiple layers (hierarchy)
Think of it like:
- ✅ Top layer → long-distance shortcuts (fast navigation)
- ✅ Lower layers → fine-grained local neighbors (precision)
🔍 How a Search Works
When you run:
```sql
ORDER BY description_embedding <=> '[query_vector]'
```
HNSW:
1. Starts at a high-level "entry point"
2. Quickly jumps across the graph (coarse search)
3. Moves down layers toward closer matches
4. Refines results using nearby neighbors
👉 Instead of checking every row, it navigates toward the answer.
⚡ Why This Is Fast
- ✅ Avoids full table scans
- ✅ Uses approximate nearest neighbor (ANN) search
- ✅ Time complexity is sub-linear
- ✅ Scales well as data grows
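The navigation idea can be sketched as a greedy walk over a single-layer neighbor graph. This is a deliberate simplification (real HNSW keeps multiple layers and a beam of candidates), with toy 2-D points:

```python
import math

# Toy 2-D points; in reality these are high-dimensional embeddings
points = {0: (0, 0), 1: (1, 0), 2: (2, 1), 3: (3, 3), 4: (5, 4), 5: (6, 6)}
# Precomputed neighbor lists (HNSW builds these at insert time)
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

def greedy_search(query, entry=0):
    """Walk the graph, always moving to the neighbor closest to the query."""
    current = entry
    while True:
        best = min(graph[current], key=lambda n: math.dist(points[n], query))
        if math.dist(points[best], query) < math.dist(points[current], query):
            current = best      # step closer to the query
        else:
            return current      # local minimum = approximate nearest neighbor

print(greedy_search((5.5, 5.5)))  # 5 (walks 0 -> 1 -> 2 -> 3 -> 4 -> 5)
```

Each step discards most of the dataset, which is why the search cost grows sub-linearly; the trade-off is that a greedy walk can stop at a local minimum, hence "approximate".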
🧩 Why This Matters for YugabyteDB
Because YugabyteDB distributes data:
- ✅ The HNSW index is distributed across nodes
- ✅ Each node contributes to the search
- ✅ Results are merged efficiently
👉 You get fast similarity search + horizontal scale.
HNSW doesn't guarantee the mathematically perfect nearest neighbor… it finds extremely close matches much faster, which is exactly what you want for real-time search.
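The distributed merge can be sketched as scatter-gather: each node returns its locally ranked top-k, and a coordinator merges the sorted lists. The node contents and scores below are invented for illustration; the real query layer does this internally:

```python
import heapq

# Hypothetical per-node results: (distance, product) lists, each already
# sorted locally because every node ran its own HNSW search
node_a = [(0.37, "Tent Gen-4"), (0.45, "Tent Gen-19")]
node_b = [(0.39, "Tent Gen-25"), (0.52, "Chair Gen-7")]
node_c = [(0.41, "Tent Gen-35"), (0.60, "Skillet Gen-2")]

def merge_top_k(shards, k):
    """Merge sorted per-node result lists into a global top-k by distance."""
    return heapq.nsmallest(k, heapq.merge(*shards))

for distance, name in merge_top_k([node_a, node_b, node_c], 3):
    print(f"{name}: {distance}")
```

Because each shard only ships its top-k (not its whole index), the network cost stays small as the cluster grows.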
✍️ Conclusion
Semantic search doesn't require a brand new stack… it just requires a better way to represent and query meaning.
In this demo:
- ✅ Google AI Studio converts text into embeddings
- ✅ YugabyteDB stores those embeddings alongside your data
- ✅ SQL ranks results using vector similarity
- ✅ HNSW ensures it all runs fast… even at scale
The result is a system that handles both:
- ✅ Exact matches → traditional SQL
- ✅ Intent-based queries → vector search
👉 Without introducing a separate vector database or sync pipeline.
At the end of the day, this isn't just about AI search.
It's about simplifying your architecture while making your applications smarter:
- One database. One query layer.
- Exact match and semantic match… working together.
Have Fun!
