Intro 🎁
Have you ever written a simple multi-column IN clause, something like WHERE ROW(a, b) IN ((1, 1), (2, 2)), and expected it to blaze through your index, only to find a Seq Scan instead?
You’re not alone!
This behavior comes from how PostgreSQL’s planner rewrites ROW IN expressions into a series of AND/OR comparisons under the hood. YugabyteDB’s index layer can handle these efficiently, but the rewritten form hides that opportunity.
In this YugabyteDB Tip, we’ll:
● Show a quick reproduction of the issue 🧪
- ● Share practical SQL workarounds that restore your expected index-driven performance 🚀
Why this tip?
Multi-column point operations are common in apps and CDC/replay pipelines (think deletes/updates for a set of keys). When written with ROW(col1, col2, ...) IN ((...), (...)), you might expect a clean Index Scan. In practice, you may see a Seq Scan or an Index Scan + Storage Filter on disjunctions. This tip shows drop-in SQL patterns that consistently drive index-based execution.
Repro
We’ll use a table similar in spirit to a typical inventory/quantities table, with a composite PK:
-- Sample table for the demo
CREATE TABLE inventory_item_quantities (
id BIGINT,
shop_id BIGINT,
variant_id BIGINT,
location_id BIGINT,
qty INT,
PRIMARY KEY ((id) HASH, variant_id, location_id, shop_id));
-- Sample data (a few rows for the demo)
INSERT INTO inventory_item_quantities
(id, shop_id, variant_id, location_id, qty)
VALUES
(1, 1, 1, 1, 10),
(2, 1, 2, 2, 20),
(3, 2, 3, 3, 30),
(4, 2, 4, 4, 40);
The intuitive but sometimes slow pattern
-- Expectation: index lookup on composite key
-- Reality today (pre-fix): plan may show Seq Scan or Index + Storage Filter on ORs
DELETE FROM inventory_item_quantities
WHERE (id, shop_id, variant_id, location_id) IN ((1,1,1,1), (2,2,2,2));
Typical plan symptom:
●
Seq ScanwithStorage Filter: ((id=...) AND ... ) OR ((id=...) AND ...)or●
Index Scanconstrained by only a prefix column (shop_id), with the rest as aStorage Filterof disjunctions.
The unexpected SLOW query plan:
yugabyte=# EXPLAIN DELETE FROM inventory_item_quantities WHERE (id, shop_id, variant_id, location_id) IN ((1,1,1,1), (2,2,2,2));
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Delete on inventory_item_quantities (cost=0.00..120.00 rows=0 width=0)
-> Seq Scan on inventory_item_quantities (cost=0.00..120.00 rows=1000 width=32)
Storage Filter: (((id = 1) AND (shop_id = 1) AND (variant_id = 1) AND (location_id = 1)) OR ((id = 2) AND (shop_id = 2) AND (variant_id = 2) AND (location_id = 2)))
(3 rows)
Workaround #1: ROW(...) IN (VALUES ...) (Batched Nested Loop) ✅
Convert the tuple list into a VALUES relation. The planner forms a batched nested loop where each “row of keys” drives an exact index probe on the primary key:
EXPLAIN
DELETE FROM inventory_item_quantities
WHERE ROW(id, shop_id, variant_id, location_id) IN (VALUES (1,1,1,1), (2,1,2,2));
You should see YB Batched Nested Loop Join and an Index Scan using inventory_item_quantities_pkey without the giant OR storage filter.
The reason it works is that the VALUES list is treated like a tiny in-memory table; the planner can batch parameterized index probes from it. For background on VALUES as a “constant table,” see the PostgreSQL docs.
Example:
yugabyte=# EXPLAIN
yugabyte-# DELETE FROM inventory_item_quantities
yugabyte-# WHERE ROW(id, shop_id, variant_id, location_id) IN (VALUES (1,1,1,1), (2,1,2,2));
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Delete on inventory_item_quantities (cost=0.04..4.28 rows=0 width=0)
-> YB Batched Nested Loop Join (cost=0.04..4.28 rows=1 width=72)
Join Filter: ((inventory_item_quantities.id = "*VALUES*".column1) AND (inventory_item_quantities.shop_id = "*VALUES*".column2) AND (inventory_item_quantities.variant_id = "*VALUES*".column3) AND (inventory_item_quantities.location_id = "*VALUES*".column4))
-> Unique (cost=0.04..0.06 rows=2 width=56)
-> Sort (cost=0.04..0.04 rows=2 width=56)
Sort Key: "*VALUES*".column1, "*VALUES*".column2, "*VALUES*".column3, "*VALUES*".column4
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=56)
-> Index Scan using inventory_item_quantities_pkey on inventory_item_quantities (cost=0.00..2.11 rows=1 width=64)
Index Cond: (ROW(id, variant_id, location_id, shop_id) = ANY (ARRAY[ROW("*VALUES*".column1, "*VALUES*".column3, "*VALUES*".column4, "*VALUES*".column2), ROW($2, $1026, $2050, $3074), ROW($3, $1027, $2051, $3075), ..., ROW($1024, $2048, $3072, $4096)]))
(9 rows)
Tip: If you prefer a slightly more explicit form that makes the equality join obvious, use a WITH keys AS (VALUES …) CTE + USING keys:
WITH keys(id, shop_id, variant_id, location_id) AS
(VALUES (1,1,1,1), (2,2,2,2))
DELETE FROM inventory_item_quantities p
USING keys
WHERE (p.id, p.shop_id, p.variant_id, p.location_id) =
(keys.id, keys.shop_id, keys.variant_id, keys.location_id);
Example:
yugabyte=# EXPLAIN
yugabyte-# WITH keys(id, shop_id, variant_id, location_id) AS
yugabyte-# (VALUES (1,1,1,1), (2,2,2,2))
yugabyte-# DELETE FROM inventory_item_quantities p
yugabyte-# USING keys
yugabyte-# WHERE (p.id, p.shop_id, p.variant_id, p.location_id) =
yugabyte-# (keys.id, keys.shop_id, keys.variant_id, keys.location_id);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Delete on inventory_item_quantities p (cost=0.00..4.25 rows=0 width=0)
-> YB Batched Nested Loop Join (cost=0.00..4.25 rows=1 width=72)
Join Filter: ((p.id = "*VALUES*".column1) AND (p.shop_id = "*VALUES*".column2) AND (p.variant_id = "*VALUES*".column3) AND (p.location_id = "*VALUES*".column4))
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=56)
-> Index Scan using inventory_item_quantities_pkey on inventory_item_quantities p (cost=0.00..2.11 rows=1 width=64)
Index Cond: (ROW(id, variant_id, location_id, shop_id) = ANY (ARRAY[ROW("*VALUES*".column1, "*VALUES*".column3, "*VALUES*".column4, "*VALUES*".column2), ROW($2, $1026, $2050, $3074), ROW($3, $1027, $2051, $3075), ..., ROW($1024, $2048, $3072, $4096)]))
(6 rows)
Workaround #2: Consider enabling bitmap scans (when appropriate) 🧩
If your original ROW (...) IN ((...), ...) form must be kept, enabling bitmap scans can make disjunctive predicates faster in some cases because multiple single-column indexes* can be combined at the executor. In YSQL, both enable_bitmapscan and yb_enable_bitmapscan must be on to allow bitmap scans on Yugabyte relations:
Example:
yugabyte=# SHOW enable_bitmapscan;
enable_bitmapscan
-------------------
on
(1 row)
yugabyte=# SHOW yb_enable_bitmapscan;
yb_enable_bitmapscan
----------------------
off
(1 row)
yugabyte=# SET yb_enable_bitmapscan = on;
SET
yugabyte=# EXPLAIN
yugabyte-# DELETE FROM inventory_item_quantities
yugabyte-# WHERE (id, shop_id, variant_id, location_id) IN ((1,1,1,1), (2,2,2,2));
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Delete on inventory_item_quantities (cost=0.24..4.30 rows=0 width=0)
-> YB Bitmap Table Scan on inventory_item_quantities (cost=0.24..4.30 rows=1 width=32)
-> BitmapOr (cost=0.24..0.24 rows=2 width=0)
-> Bitmap Index Scan on inventory_item_quantities_pkey (cost=0.00..0.12 rows=1 width=0)
Index Cond: ((id = 1) AND (variant_id = 1) AND (location_id = 1) AND (shop_id = 1))
-> Bitmap Index Scan on inventory_item_quantities_pkey (cost=0.00..0.12 rows=1 width=0)
Index Cond: ((id = 2) AND (variant_id = 2) AND (location_id = 2) AND (shop_id = 2))
(7 rows)
Toggle these per session or per statement (with EXPLAIN (SETTINGS on) if you want to see changes in plans on PG 12+). See the YB docs for yb_enable_bitmapscan and the PostgreSQL planner settings reference.
Note: Whether a bitmap plan helps depends on your schema, indexes, and data distribution. Measure!
What’s really going on? 🧠
● Planner rewrite: The PG layer rewrites
ROW(col1, col2, ...) IN ((...), (...))into a series ofAND/ORcomparisons.● Index opportunities are hidden: YB’s index code can handle “row-array” comparisons natively (it already does so during batched nested loop joins), but after rewrite, the executor sees big disjunctions instead of a compact row-array key condition.
● Enhancement in progress: Preserving row-array predicates as ScanKeys allows the index access method to do exactly what you expected in the first place—multiple precise composite-key lookups. Track the enhancement here: YSQL: Row array scan should be captured as ScanKey.
Updates too (not just deletes) ✏️
Everything above works for UPDATE as well:
-- Using VALUES directly
UPDATE inventory_item_quantities p
SET qty = 0
WHERE ROW(p.id, p.shop_id, p.variant_id, p.location_id)
IN (VALUES (1,1,1,1), (2,2,2,2));
-- Or USING a CTE for keys
WITH keys(id, shop_id, variant_id, location_id) AS
(VALUES (1,1,1,1), (2,2,2,2))
UPDATE inventory_item_quantities p
SET qty = 0
FROM keys
WHERE (p.id, p.shop_id, p.variant_id, p.location_id) =
(keys.id, keys.shop_id, keys.variant_id, keys.location_id);
Applies to SELECTs too (not just deletes and udates) 🔍
The same behavior and optimization opportunity apply to SELECT queries that filter on multi-column tuples.
For example, this query may fall back to a Seq Scan or an Index Scan that only uses a prefix of the composite key:
SELECT *
FROM inventory_item_quantities
WHERE (id, shop_id, variant_id, location_id)
IN ((1,1,1,1), (2,2,2,2));
You’ll likely see a Storage Filter full of OR conditions, which can be slow on large tables.
Rewriting it with the ROW(...) IN (VALUES ...) form fixes that:
SELECT *
FROM inventory_item_quantities p
WHERE ROW(p.id, p.shop_id, p.variant_id, p.location_id)
IN (VALUES (1,1,1,1), (2,2,2,2));
This version produces a YB Batched Nested Loop Join with an Index Scan on the primary key … efficient and predictable.
Summary
Multi-column tuple IN clauses like WHERE (id, shop_id, variant_id, location_id) IN ((...), (...))can sometimes trigger a sequential scan in YugabyteDB, even when a composite primary key exists.
Until the planner enhancement that preserves row-array scan keys lands, you can make these statements fast and index-driven by rewriting them as:
●
ROW(...) IN (VALUES (...), (...))● or using a CTE (
WITH keys AS (VALUES ...) ... USING keys)
Both produce a YB Batched Nested Loop Join with an Index Scan on the primary key … exactly what you’d expect for efficient multi-column lookups.
These rewrites work for DELETE, UPDATE and SELECT statements.
If you must keep the legacy IN ((...), (...)) form, enabling bitmap scans (yb_enable_bitmapscan=on) may still improve performance in some cases … measure to confirm.
Have Fun!
