In Tip #3, we improved our index so filtering happened inside the index scan.
In Tip #4, we saw how IN() predicates create multiple index ranges that YugabyteDB must scan.
However, even with those improvements, our query still scanned 59,689 rows:
Storage Table Rows Scanned: 59689
Why does the database still need to read so many rows?
The answer lies in the ORDER BY.
⚡ Key Idea
For queries with ORDER BY ... LIMIT, the fastest index is often the one that preserves the required ordering so the database can stop scanning early.
🧭
The Index Tuning Journey
This series shows how index tuning usually happens in stages. Each tip removes one layer of wasted work so YugabyteDB can answer the query more efficiently.
All tuning ideas come together into one practical design
The Current Query
EXPLAIN (ANALYZE, DIST)
SELECT *
FROM app.activity_log
WHERE tenant_id =
'11111111-1111-1111-1111-111111111111'
AND category IN ('LOGIN','EXPORT','SYNC')
AND status IN ('PENDING','RUNNING')
ORDER BY created_at
LIMIT 100;
Our current index is:
CREATE INDEX activity_log_tenant_status_category_idx
ON app.activity_log (tenant_id, status, category);
The plan looked like this:
Index Scan using activity_log_tenant_status_category_idx
Storage Table Rows Scanned: 59689
Sort Method: top-N heapsort
Execution Time: ~524 ms
Even though filtering happened inside the index scan, YugabyteDB still had to:
1. read all matching rows
2. sort them by created_at
3. return the first 100 rows
That meant scanning nearly 60k rows.
🚨 The Bottleneck
Because the index was not ordered by created_at, YugabyteDB had to read and sort every qualifying row before returning the first results.
First Attempt: Add created_at to the Index
A natural idea is to extend the index:
CREATE INDEX activity_log_tenant_status_category_created_idx
ON app.activity_log (tenant_id, status, category, created_at);
However, the planner will still chose the original index and still perform a sort.
(tenant_id, status, category, created_at)
Why?
Because the query contains:
status IN (...)
category IN (...)
That creates multiple groups of rows.
Rows are ordered by created_atwithin each group, but not across all groups combined.
So YugabyteDB still cannot return rows globally ordered by created_at without sorting.
The Winning Index
Instead, we create a different index:
CREATE INDEX activity_log_tenant_created_cover_idx
ON app.activity_log (tenant_id, created_at)
INCLUDE (status, category);
This index is ordered as:
(tenant_id, created_at)
That means rows for a tenant are already stored in the correct order.
The New Plan
Now the query produces this plan:
Limit
-> Index Scan using activity_log_tenant_created_cover_idx
Index Cond:
tenant_id = '11111111-1111-1111-1111-111111111111'
Storage Index Filter:
status IN (...)
category IN (...)
Storage Index Rows Scanned: 306
Storage Table Rows Scanned: 100
Execution Time: 3.189 ms
What Changed
Several important things happened.
The Sort Disappeared
The plan no longer contains:
Sort Method: top-N heapsort
Because the index already provides rows in created_at order.
The Database Stops Early
Instead of scanning ~60k rows, YugabyteDB can now:
1. read rows in created_at order
2. filter them using the included columns
3. stop as soon as 100 matches are found
That is why the scan shows:
Storage Index Rows Scanned: 306
Storage Table Rows Scanned: 100
Performance Improved Dramatically
Plan
Rows Scanned
Execution Time
(tenant_id, status, category)
~59,689 table rows ~59,689 index rows
~524 ms
(tenant_id, created_at) INCLUDE (status, category)
100 table rows 306 index rows
~3.2 ms
📌 Key Takeaway
The fastest index for this query was not the one that filtered the most aggressively.
It was the one that preserved the required ordering so YugabyteDB could stop scanning after finding the first 100 rows.
Conclusion
In Tip #3, we eliminated wasted filtering.
In Tip #4, we saw how IN() predicates create multiple index ranges.
In Tip #5, we discovered that for ORDER BY ... LIMIT queries, the best index may prioritize row ordering rather than pushing every predicate into the index key.
By aligning the index with the ordering, YugabyteDB can stop scanning much earlier.
Next Tip
The index we created dramatically reduced the amount of work the database performs.
However, the plan still shows:
Storage Table Rows Scanned: 100
That means YugabyteDB still needs to fetch rows from the base table after reading the index.
This happens because the query selects:
SELECT *
and some columns are not stored in the index.
In the next tip, we’ll explore how covering indexes and index-only scans can eliminate table reads entirely, making queries even faster.
Have Fun!
We reserved the entire bar for $200 for some SKO celebrating!