How ORDER BY LIMIT Can Change the Best Index in YugabyteDB

In Tip #3, we improved our index so filtering happened inside the index scan.

In Tip #4, we saw how IN() predicates create multiple index ranges that YugabyteDB must scan.

However, even with those improvements, our query still scanned 59,689 rows:

				
					Storage Table Rows Scanned: 59689

Why does the database still need to read so many rows?

The answer lies in the ORDER BY.

⚡ Key Idea

For queries with ORDER BY ... LIMIT, the fastest index is often the one that preserves the required ordering so the database can stop scanning early.

🧭 The Index Tuning Journey

This series shows how index tuning usually happens in stages. Each tip removes one layer of wasted work so YugabyteDB can answer the query more efficiently.

Tip	Focus	What Improves
1	Choose the right column order in an index	The index matches the query shape better
2	Why your index scan still reads thousands of rows	Shows why index usage alone is not enough
3	How to eliminate “Rows Removed by Filter”	More filtering happens in the index before table reads
4	How IN() predicates create multiple index ranges	Multiple predicate combinations create distinct index ranges
5	How ORDER BY LIMIT Can Change the Best Index in YugabyteDB	Sort work can be reduced or eliminated
6	How covering indexes eliminate table reads	The query may avoid table access entirely
7	Designing the best index for a query	All tuning ideas come together into one practical design

The Current Query

				
					EXPLAIN (ANALYZE, DIST)
SELECT *
FROM app.activity_log
WHERE tenant_id =
'11111111-1111-1111-1111-111111111111'
AND category IN ('LOGIN','EXPORT','SYNC')
AND status IN ('PENDING','RUNNING')
ORDER BY created_at
LIMIT 100;

Our current index is:

				
					CREATE INDEX activity_log_tenant_status_category_idx
ON app.activity_log (tenant_id, status, category);

The plan looked like this:

				
					Index Scan using activity_log_tenant_status_category_idx

Storage Table Rows Scanned: 59689
Sort Method: top-N heapsort
Execution Time: ~524 ms

Even though filtering happened inside the index scan, YugabyteDB still had to:

1. read all matching rows
2. sort them by created_at
3. return the first 100 rows

That meant scanning nearly 60k rows.

🚨 The Bottleneck

Because the index was not ordered by created_at, YugabyteDB had to read and sort every qualifying row before returning the first results.

First Attempt: Add `created_at` to the Index

A natural idea is to extend the index:

				
					CREATE INDEX activity_log_tenant_status_category_created_idx
ON app.activity_log (tenant_id, status, category, created_at);

However, the planner will still chose the original index and still perform a sort.

				
					(tenant_id, status, category, created_at)

Why?

Because the query contains:

				
					status IN (...)
category IN (...)

That creates multiple groups of rows.

Rows are ordered by created_at within each group, but not across all groups combined.

So YugabyteDB still cannot return rows globally ordered by created_at without sorting.

The Winning Index

Instead, we create a different index:

				
					CREATE INDEX activity_log_tenant_created_cover_idx
ON app.activity_log (tenant_id, created_at)
INCLUDE (status, category);

This index is ordered as:

				
					(tenant_id, created_at)

That means rows for a tenant are already stored in the correct order.

The New Plan

Now the query produces this plan:

				
					Limit
 -> Index Scan using activity_log_tenant_created_cover_idx

Index Cond:
  tenant_id = '11111111-1111-1111-1111-111111111111'

Storage Index Filter:
  status IN (...)
  category IN (...)

Storage Index Rows Scanned: 306
Storage Table Rows Scanned: 100

Execution Time: 3.189 ms

What Changed

Several important things happened.

The Sort Disappeared

The plan no longer contains:

				
					Sort Method: top-N heapsort

Because the index already provides rows in created_at order.

The Database Stops Early

Instead of scanning ~60k rows, YugabyteDB can now:

1. read rows in created_at order
2. filter them using the included columns
3. stop as soon as 100 matches are found

That is why the scan shows:

				
					Storage Index Rows Scanned: 306
Storage Table Rows Scanned: 100

Performance Improved Dramatically

Plan	Rows Scanned	Execution Time
`(tenant_id, status, category)`	~59,689 table rows ~59,689 index rows	~524 ms
`(tenant_id, created_at) INCLUDE (status, category)`	100 table rows 306 index rows	~3.2 ms

📌 Key Takeaway

The fastest index for this query was not the one that filtered the most aggressively. It was the one that preserved the required ordering so YugabyteDB could stop scanning after finding the first 100 rows.

Conclusion

In Tip #3, we eliminated wasted filtering.

In Tip #4, we saw how IN() predicates create multiple index ranges.

In Tip #5, we discovered that for ORDER BY ... LIMIT queries, the best index may prioritize row ordering rather than pushing every predicate into the index key.

By aligning the index with the ordering, YugabyteDB can stop scanning much earlier.

Next Tip

The index we created dramatically reduced the amount of work the database performs.

However, the plan still shows:

				
					Storage Table Rows Scanned: 100

That means YugabyteDB still needs to fetch rows from the base table after reading the index.

This happens because the query selects:

				
					SELECT *

and some columns are not stored in the index.

In the next tip, we’ll explore how covering indexes and index-only scans can eliminate table reads entirely, making queries even faster.

Have Fun!