AI Search on YugabyteDB with Native pgvector Integration + GCP

🌍 The Problem: Search is Still… Dumb

Most application search still relies on:

  • LIKE '%term%'
  • exact keyword matching
  • brittle full-text search

Which means:

  • If users don’t type the exact right words… they get nothing.

At the same time, modern “AI search” architectures often swing too far the other way:

  • OLTP database 🗄️
  • Separate vector database 🧠
  • Sync pipelines 🔄
  • Eventual consistency 😬

👉 More power… but a lot more complexity.

💡 A Simpler Approach

What if you could:

  • Store relational data and embeddings together
  • Run SQL + semantic search in one query
  • Avoid sync pipelines entirely

With YugabyteDB, you can.

💡 Key Insight:
YugabyteDB lets you store vector embeddings alongside relational data and query both using standard SQL… no separate vector database required.

βš™οΈ The Architecture (Simple + Powerful)

This demo uses:

  • YugabyteDB → distributed SQL + vector storage
  • Google AI Studio → embedding generation
  • pgvector (vector) → stores embeddings
  • ybhnsw index → fast similarity search

👉 Result: Intent-based search powered entirely through SQL

🧱 Step 1: Create the Extension + Table + Index

We store both structured data and embeddings in the same row.

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    description TEXT,
    price NUMERIC(10,2),
    category TEXT,
    description_embedding vector(3072)
);

CREATE INDEX ON products 
USING ybhnsw (description_embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);

What this does
  • Stores each product’s meaning as a vector
  • Uses a 3072-dimension embedding (from the model)
  • Builds a distributed HNSW index for fast lookup

👉 The important part: the embedding lives with the row it represents

🧩 Vector Support in YugabyteDB

The CREATE EXTENSION vector command installs pgvector, adding a native vector data type to the YSQL layer.

What this enables:

  • Store embeddings in a dedicated column type (e.g., vector(3072))
  • Compare vectors using operators like <=> (cosine distance)
  • Index vectors using ybhnsw for fast nearest-neighbor search

How YugabyteDB handles it:

  • Vectors live alongside relational data in the same table
  • Similarity search runs using standard SQL
  • The ybhnsw index is distributed across nodes for scale
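
From application code, a vector can be sent to the database as a bracketed text literal such as '[0.25,-1.0,3.5]' and cast with ::vector in SQL (the search script later in this post relies on Python's str() of a list producing exactly this shape). Here is a minimal sketch of that formatting step; to_pgvector is a hypothetical helper name, not part of the demo scripts:

```python
def to_pgvector(values):
    """Format a sequence of floats as a pgvector text literal, e.g. '[0.25,-1.0,3.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# The literal can then be bound as an ordinary string parameter and cast in SQL:
#   cur.execute("SELECT description_embedding <=> %s::vector FROM products",
#               (to_pgvector(query_vector),))
print(to_pgvector([0.25, -1.0, 3.5]))  # -> [0.25,-1.0,3.5]
```
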

🧠 Step 2: Generate Embeddings

This is where the “AI” comes in.

To generate embeddings:

  1. Go to Google AI Studio
  2. Click Get API Key → Create key
  3. Copy the key (starts with AIza...)

🤖 What is Google AI Studio?
Google AI Studio is a developer platform for working with Google’s AI models, including text generation and embedding models like gemini-embedding-001.

In this demo, we use it to:

  • Convert text (product descriptions and user queries) into vector embeddings
  • Ensure both data and queries are represented in the same semantic space
  • Enable similarity search inside YugabyteDB using those vectors

👉 The free tier is sufficient for this demo

💻 Step 3: Configure Your Environment

Create a .env file:

AI_STUDIO_KEY=your_api_key
YB_HOST=your_cluster_host
YB_PORT=5433
YB_USER=admin
YB_PASSWORD=your_password
YB_DB=yugabyte
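
Since a missing variable is the most common setup failure, it can be worth validating the settings before connecting. A small sketch (missing_settings is a hypothetical helper, not part of the demo scripts):

```python
import os

# Settings that have no sensible default and must be present
REQUIRED = ["AI_STUDIO_KEY", "YB_HOST", "YB_PASSWORD"]

def missing_settings(env=None):
    """Return the names of required settings that are absent or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

# Example: an incomplete configuration
print(missing_settings({"YB_HOST": "127.0.0.1"}))  # -> ['AI_STUDIO_KEY', 'YB_PASSWORD']
```
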

📦 Step 4: Generate Data + Embeddings

Install dependencies

requirements.txt:
google-genai==0.3.0
psycopg2-binary==2.9.9
python-dotenv==1.0.1

Run:

pip install -r requirements.txt

Seed data:

seed_data.py:

import os
import random
import psycopg2
from google import genai
from dotenv import load_dotenv

load_dotenv()

if not os.getenv("YB_HOST"):
    print("❌ ERROR: Could not find YB_HOST. Please check your .env file!")
    exit(1)

client = genai.Client(api_key=os.getenv("AI_STUDIO_KEY"))
YB_CONFIG = {
    "host": os.getenv("YB_HOST"),
    "port": os.getenv("YB_PORT", "5433"),
    "user": os.getenv("YB_USER", "yugabyte"),
    "password": os.getenv("YB_PASSWORD"),
    "database": os.getenv("YB_DB", "yugabyte")
}

categories = ["Electronics", "Outdoor", "Home Office", "Kitchen", "Fitness"]
adjectives = ["Professional", "Ultra-durable", "Eco-friendly", "Compact", "Luxury"]
products_base = [
    ("Noise-cancelling Headphones", "Perfect for focusing in loud offices or airplanes. Advanced active noise cancellation and spatial audio for a focused work environment."),
    ("Waterproof Tent", "Reliable shelter for heavy rain and stormy camping trips. A rugged, weatherproof shelter designed for high-altitude trekking and extreme cold."),
    ("Ergonomic Chair", "Supports your back during long hours of remote work. Adjustable lumbar support and breathable mesh designed for long-duration remote work."),
    ("Cast Iron Skillet", "Retains heat perfectly for gourmet cooking at home. A heavy-duty skillet that offers superior heat retention for professional searing."),
    ("Yoga Mat", "Extra thick cushioning for joint support during heavy exercise. Non-slip texture and high-density cushioning to protect joints during intense floor workouts.")
]

def generate_bulk_data(count=50):
    data = []
    for i in range(count):
        base_name, base_desc = random.choice(products_base)
        adj = random.choice(adjectives)
        cat = random.choice(categories)
        
        name = f"{adj} {base_name} Gen-{i}"
        price = round(random.uniform(15.0, 250.0), 2)
        full_desc = f"{base_desc} This {cat} item is {adj.lower()} and highly rated."
        
        data.append({"name": name, "desc": full_desc, "price": price, "cat": cat})
    return data

def seed_database():
    products = generate_bulk_data(50)
    conn = None 
    cur = None
    
    try:
        conn = psycopg2.connect(**YB_CONFIG, sslmode='disable')
        cur = conn.cursor()
        print(f"Connected. Starting ingestion of {len(products)} items...")

        for p in products:
            # Let the model output its native 3,072 dimensions!
            res = client.models.embed_content(
                model="gemini-embedding-001", 
                contents=p['desc']
            )
            vector = res.embeddings[0].values
            
            cur.execute("""
                INSERT INTO products (id, name, description, price, category, description_embedding)
                VALUES (gen_random_uuid(), %s, %s, %s, %s, %s)
            """, (p['name'], p['desc'], p['price'], p['cat'], vector))
            print(f"Inserted: {p['name']}")
        
        conn.commit()
        print("\n--- Seeding Complete! ---")
    except Exception as e:
        print(f"\n❌ Database Error: {e}")
    finally:
        if cur: cur.close()
        if conn: conn.close()

if __name__ == "__main__":
    seed_database()

What this script does
  • Generates 50 diverse products
  • Builds descriptive text per product
  • Calls Google AI Studio for embeddings
  • Inserts both text + vector into YugabyteDB

To run:

python seed_data.py

👉 You now have a semantic-ready dataset

🧠 Semantic Encoding:
Each product description is converted into a dense vector using a transformer-based model. Similar meanings are placed close together in vector space.
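
To make “close together in vector space” concrete, here is a brute-force nearest-neighbor lookup over hand-made 3-dimensional toy vectors (real embeddings have thousands of dimensions; the values and dimension labels below are invented purely for illustration):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Toy "embeddings": dimensions loosely mean (outdoors, audio, cooking)
catalog = {
    "Waterproof Tent":   [0.9, 0.0, 0.1],
    "Headphones":        [0.1, 0.9, 0.0],
    "Cast Iron Skillet": [0.0, 0.1, 0.9],
}
query = [0.8, 0.1, 0.0]  # roughly: "shelter for a rainy trip"

# Exactly what ORDER BY ... <=> ... LIMIT 1 does, just without an index
best = min(catalog, key=lambda name: cosine_distance(query, catalog[name]))
print(best)  # -> Waterproof Tent
```
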

🧠 Step 5: Understanding the Query Vector (Conceptual)

Before we run our search script, it is important to understand what happens under the hood when a user types a search query.

We don’t search using the raw text. Instead, we have to convert the user’s search intent into a vector using the exact same model we used to seed the database.

Inside our application code, it looks like this:

# Note: You do not need to run this snippet standalone.
# We will include this logic in our final search script in the next step.

res = client.models.embed_content(
    model="gemini-embedding-001",
    contents="something for a rainy day at the beach"
)
)

query_vector = res.embeddings[0].values

What this means:

  • The text query is converted into a vector in the exact same semantic space as your product data.
  • Database matching is now based on mathematical distance (intent) rather than text overlap.

πŸ” Step 6: Run the Semantic Search

SELECT name, price, category
FROM products
ORDER BY description_embedding <=> '[query_vector]'
LIMIT 5;

What is happening here?
  • <=> computes cosine distance
  • Lower distance = more similar meaning
  • Results are ranked by semantic similarity

📐 Why Cosine Similarity
Instead of measuring raw distance between two points, cosine similarity compares the angle between vectors.
This works especially well for embeddings, because we care about direction (meaning)… not magnitude.
If a search query and a product description point in the same direction in vector space, the distance approaches 0, indicating a strong semantic match.
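
The “direction, not magnitude” point is easy to check numerically: scaling a vector does not change its cosine distance to anything else (toy numbers below, not real embedding values):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction as a, twice the magnitude
c = [3.0, -1.0, 0.5]

# Same direction => distance ~ 0, regardless of length
print(abs(cosine_distance(a, b)) < 1e-9)                           # -> True
# Scaling a vector leaves its distance to other vectors unchanged
print(math.isclose(cosine_distance(a, c), cosine_distance(b, c)))  # -> True
```
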

🤯 Dumb vs Smart Search

🤓 A Quick Note on “Dumb” Search

In the script below, we use a simple ILIKE query to represent traditional keyword search. This makes it very easy to demonstrate the limitations of exact-string matching.

However, database purists will rightly point out that a production-grade application wouldn’t rely on ILIKE for search. Standard PostgreSQL (and YugabyteDB) offers robust Full-Text Search (FTS) using tsvector and tsquery.

FTS is significantly smarter than ILIKE… it handles tokenization, stop-words, and stemming (knowing that “rain”, “rains”, and “raining” share the same root).

But here’s the catch: even highly optimized Full-Text Search is fundamentally a lexical search. It still depends on the presence of actual words (or their root forms).

Vector search steps in to handle semantic search… the underlying meaning or intent. For the sake of keeping this demo simple and highlighting that contrast, we’re sticking with ILIKE.

This script compares “dumb” keyword matching with “smart” semantic matching.

search_demo.py:
import os
import psycopg2
from google import genai
from dotenv import load_dotenv

load_dotenv()

if not os.getenv("YB_HOST"):
    print("❌ ERROR: Could not find YB_HOST. Please check your .env file!")
    exit(1)

client = genai.Client(api_key=os.getenv("AI_STUDIO_KEY"))
YB_CONFIG = {
    "host": os.getenv("YB_HOST"),
    "port": os.getenv("YB_PORT", "5433"),
    "user": os.getenv("YB_USER", "yugabyte"),
    "password": os.getenv("YB_PASSWORD"),
    "database": os.getenv("YB_DB", "yugabyte")
}

def run_keyword_search(cur, query_text):
    sql = "SELECT name, price FROM products WHERE description ILIKE %s LIMIT 3;"
    cur.execute(sql, (f"%{query_text}%",))
    return cur.fetchall()

def run_ai_search(cur, query_text):
    # Native 3,072 dimension output
    res = client.models.embed_content(
        model="gemini-embedding-001", 
        contents=query_text
    )
    vector = res.embeddings[0].values
    
    sql = """
        SELECT name, price, (description_embedding <=> %s::vector) as distance 
        FROM products 
        ORDER BY distance ASC 
        LIMIT 3;
    """
    cur.execute(sql, (str(vector),))
    return cur.fetchall()

def main():
    conn = None
    cur = None
    try:
        conn = psycopg2.connect(**YB_CONFIG, sslmode='disable')
        cur = conn.cursor()

        while True:
            print("\n" + "="*50)
            user_input = input("Enter search query (or 'exit'): ")
            if user_input.lower() == 'exit': break

            print(f"\n--- [1] KEYWORD SEARCH (Standard SQL) ---")
            kw_results = run_keyword_search(cur, user_input)
            if not kw_results:
                print("❌ No results found for these exact keywords.")
            for row in kw_results:
                print(f"Result: {row[0]} (${row[1]})")

            print(f"\n--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---")
            ai_results = run_ai_search(cur, user_input)
            for row in ai_results:
                print(f"Result: {row[0]} (${row[1]}) [Score: {row[2]:.4f}]")

    except Exception as e:
        print(f"Error: {e}")
    finally:
        if cur: cur.close()
        if conn: conn.close()

if __name__ == "__main__":
    main()

Run:

python search_demo.py

This compares:

  • Keyword search (ILIKE)
  • Vector search (<=>)

Here’s what happens when users search by intent instead of exact words:

Enter search query (or 'exit'): rainy day at the beach

--- [1] KEYWORD SEARCH (Standard SQL) ---
❌ No results found for these exact keywords.

--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---
Result: Compact Waterproof Tent Gen-4 ($183.63) [Score: 0.4423]
Result: Eco-friendly Waterproof Tent Gen-19 ($225.33) [Score: 0.4503]
Result: Luxury Waterproof Tent Gen-20 ($119.80) [Score: 0.4536]

==================================================
Enter search query (or 'exit'): noise suppression flight

--- [1] KEYWORD SEARCH (Standard SQL) ---
❌ No results found for these exact keywords.

--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---
Result: Professional Noise-cancelling Headphones Gen-23 ($162.17) [Score: 0.3874]
Result: Eco-friendly Noise-cancelling Headphones Gen-43 ($212.03) [Score: 0.3928]
Result: Professional Noise-cancelling Headphones Gen-1 ($137.70) [Score: 0.3928]

==================================================

🚀 The Aha! Moment:
Keyword search fails because the exact words aren’t present. Vector search succeeds because it compares meaning.

Query: “rainy day at the beach”
  • Keyword search: ❌ No results
  • Semantic search:
      ✔ Compact Waterproof Tent Gen-4 ($183.63)
      ✔ Eco-friendly Waterproof Tent Gen-19 ($225.33)
      ✔ Luxury Waterproof Tent Gen-20 ($119.80)

Query: “noise suppression flight”
  • Keyword search: ❌ No results
  • Semantic search:
      ✔ Professional Noise-cancelling Headphones Gen-23 ($162.17)
      ✔ Eco-friendly Noise-cancelling Headphones Gen-43 ($212.03)
      ✔ Professional Noise-cancelling Headphones Gen-1 ($137.70)

βš–οΈ When Keyword Search Does Work

So far, we’ve shown cases where keyword search fails and semantic search wins.

But let’s look at the opposite scenario.

What happens when the exact word exists?

If a user types:

  • “camping”

We know our dataset contains descriptions like:

  • “Reliable shelter for heavy rain and stormy camping trips…”

So now standard SQL should work perfectly.

Sample demo output:

Enter search query (or 'exit'): camping

--- [1] KEYWORD SEARCH (Standard SQL) ---
Result: Professional Waterproof Tent Gen-40 ($183.83)
Result: Compact Waterproof Tent Gen-4 ($183.63)
Result: Eco-friendly Waterproof Tent Gen-35 ($52.00)

--- [2] AI VECTOR SEARCH (YugabyteDB + AI Studio) ---
Result: Compact Waterproof Tent Gen-4 ($183.63) [Score: 0.3790]
Result: Compact Waterproof Tent Gen-25 ($19.81) [Score: 0.3844]
Result: Eco-friendly Waterproof Tent Gen-19 ($225.33) [Score: 0.3878]

🤓 What’s Happening Here?

  • Keyword Search (ILIKE)
      ○ Matches the literal text “camping” because it physically exists in the product description (“stormy camping trips”).
      ○ Because our standard SQL query doesn’t have an ORDER BY clause, the database simply returns the first three matching rows it happens to find on disk.
  • Semantic Search (Vector Distance)
      ○ Also returns the tents, but not just because the word is there. It understands that “camping” is the core semantic use case for a tent.
      ○ It retrieves the results and strictly ranks them by semantic similarity.

👉 Notice the low distance scores (~0.38)

While it’s not a near-zero perfect match (a single word like “camping” carries far less context than a full product description), a score of 0.38 indicates a highly relevant semantic relationship in the vector space.

βš–οΈ Key Insight

Because our script generated 50 random products, there are many “Waterproof Tents” in the database. Keyword search simply grabbed the first few results containing the literal word.

AI search, however, evaluated the meaning behind the query and returned the closest semantic matches.

Both methods successfully found the right category of products… but they did it using entirely different mechanisms.

AI search does not replace SQL… it enhances it. When exact matches exist, SQL works great. When users search by intent or broader concepts, vector search fills the gap.

⚑ How Lookups Stay Fast (HNSW)

At first glance, semantic search sounds expensive:

  • “Do we compare my query vector to every row?”

👉 Thankfully… no.

YugabyteDB uses a graph-based index called HNSW (Hierarchical Navigable Small World) to make this fast.

What HNSW Actually Does

Instead of storing vectors in a simple list or tree:

  • Each vector becomes a node in a graph
  • Nodes are connected to their nearest neighbors
  • The graph is organized into multiple layers (hierarchy)

Think of it like:

  • Top layer → long-distance shortcuts (fast navigation)
  • Lower layers → fine-grained local neighbors (precision)

🔍 How a Search Works

When you run:

ORDER BY description_embedding <=> '[query_vector]'

HNSW:

  1. Starts at a high-level “entry point”
  2. Quickly jumps across the graph (coarse search)
  3. Moves down layers toward closer matches
  4. Refines results using nearby neighbors

👉 Instead of checking every row, it navigates toward the answer

⚡ Why This Is Fast

  • Avoids full table scans
  • Uses approximate nearest neighbor (ANN) search
  • Time complexity is sub-linear
  • Scales well as data grows

⚡ HNSW in Plain English

Imagine trying to find a restaurant in a new city.
You don’t check every street… you jump to a neighborhood, then narrow it down block by block.
HNSW works the same way for vectors: fast jumps first, precise matches last.

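
The restaurant analogy can be sketched in a few lines: a single-layer greedy walk over a hand-built neighbor graph. Real HNSW adds multiple layers and tuning parameters like m and ef_construction; the points and graph below are invented toy data, not anything produced by YugabyteDB:

```python
def greedy_search(graph, points, query, entry):
    """Follow neighbor links toward the query; stop when no neighbor is closer."""
    def dist(i):
        # Squared Euclidean distance from point i to the query
        return sum((a - b) ** 2 for a, b in zip(points[i], query))
    current = entry
    while True:
        nearest_neighbor = min(graph[current], key=dist)
        if dist(nearest_neighbor) < dist(current):
            current = nearest_neighbor  # a neighbor is closer: move there
        else:
            return current              # local minimum: no neighbor improves

# Toy 2-D "embeddings" and a tiny nearest-neighbor graph
points = [(0, 0), (2, 1), (4, 4), (5, 5), (9, 9)]
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

# The walk hops 0 -> 1 -> 2 -> 3 instead of scanning every point
print(greedy_search(graph, points, (5.2, 4.8), entry=0))  # -> 3
```
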
🧩 Why This Matters for YugabyteDB

Because YugabyteDB distributes data:

  • The HNSW index is distributed across nodes
  • Each node contributes to the search
  • Results are merged efficiently

👉 You get fast similarity search + horizontal scale

🔍 Key Insight:
HNSW doesn’t guarantee the mathematically perfect nearest neighbor… it finds extremely close matches much faster, which is exactly what you want for real-time search.

✍️ Conclusion

Semantic search doesn’t require a brand new stack… it just requires a better way to represent and query meaning.

In this demo:

  • Google AI Studio converts text into embeddings
  • YugabyteDB stores those embeddings alongside your data
  • SQL ranks results using vector similarity
  • HNSW ensures it all runs fast… even at scale

The result is a system that handles both:

  • Exact matches → traditional SQL
  • Intent-based queries → vector search

👉 Without introducing a separate vector database or sync pipeline.

💡 Final Thought
The goal isn’t to replace SQL… it’s to extend it.
With YugabyteDB, you can combine structured data and semantic meaning in one system, giving your applications the ability to understand both what users say and what they mean.

At the end of the day, this isn’t just about AI search.

It’s about simplifying your architecture while making your applications smarter:

  • One database. One query layer.
  • Exact match and semantic match… working together.

Have Fun!

While visiting my daughter at her new home in Dallas, I looked out the bedroom window one morning and saw this guy right up close! 🐿️🌰