Fast multi-column IN deletes/updates with ROW tuples

Intro 🎁

Have you ever written a simple multi-column IN clause, something like WHERE ROW(a, b) IN ((1, 1), (2, 2)), and expected it to blaze through your index, only to find a Seq Scan instead?

You’re not alone!

This behavior comes from how PostgreSQL’s planner rewrites ROW IN expressions into a series of AND/OR comparisons under the hood. YugabyteDB’s index layer can handle these efficiently, but the rewritten form hides that opportunity.

In this YugabyteDB Tip, we’ll:

● Show a quick reproduction of the issue 🧪
● Share practical SQL workarounds that restore your expected index-driven performance 🚀

Why this tip?

Multi-column point operations are common in apps and CDC/replay pipelines (think deletes/updates for a set of keys). When written with ROW(col1, col2, ...) IN ((...), (...)), you might expect a clean Index Scan. In practice, you may see a Seq Scan or an Index Scan + Storage Filter on disjunctions. This tip shows drop-in SQL patterns that consistently drive index-based execution.

Repro

We’ll use a table similar in spirit to a typical inventory/quantities table, with a composite PK:

				
					-- Sample table for the demo
CREATE TABLE inventory_item_quantities (
  id BIGINT,
  shop_id BIGINT,
  variant_id BIGINT,
  location_id BIGINT,
  qty INT,
PRIMARY KEY ((id) HASH, variant_id, location_id, shop_id));

-- Sample data (a few rows for the demo)
INSERT INTO inventory_item_quantities
(id, shop_id, variant_id, location_id, qty)
VALUES
(1, 1, 1, 1, 10),
(2, 1, 2, 2, 20),
(3, 2, 3, 3, 30),
(4, 2, 4, 4, 40);

The intuitive but sometimes slow pattern

				
					-- Expectation: index lookup on composite key
-- Reality today (pre-fix): plan may show Seq Scan or Index + Storage Filter on ORs
DELETE FROM inventory_item_quantities
WHERE (id, shop_id, variant_id, location_id) IN ((1,1,1,1), (2,2,2,2));

Typical plan symptom:

● Seq Scan with Storage Filter: ((id=...) AND ... ) OR ((id=...) AND ...) or
● Index Scan constrained by only a prefix column (shop_id), with the rest as a Storage Filter of disjunctions.

The unexpected SLOW query plan:

				
					yugabyte=# EXPLAIN DELETE FROM inventory_item_quantities WHERE (id, shop_id, variant_id, location_id) IN ((1,1,1,1), (2,2,2,2));
                                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on inventory_item_quantities  (cost=0.00..120.00 rows=0 width=0)
   ->  Seq Scan on inventory_item_quantities  (cost=0.00..120.00 rows=1000 width=32)
         Storage Filter: (((id = 1) AND (shop_id = 1) AND (variant_id = 1) AND (location_id = 1)) OR ((id = 2) AND (shop_id = 2) AND (variant_id = 2) AND (location_id = 2)))
(3 rows)

Workaround #1: `ROW(...) IN (VALUES ...)` (Batched Nested Loop) ✅

Convert the tuple list into a VALUES relation. The planner forms a batched nested loop where each “row of keys” drives an exact index probe on the primary key:

				
					EXPLAIN
DELETE FROM inventory_item_quantities
 WHERE ROW(id, shop_id, variant_id, location_id) IN (VALUES (1,1,1,1), (2,1,2,2));

You should see YB Batched Nested Loop Join and an Index Scan using inventory_item_quantities_pkey without the giant OR storage filter.

The reason it works is that the VALUES list is treated like a tiny in-memory table; the planner can batch parameterized index probes from it. For background on VALUES as a “constant table,” see the PostgreSQL docs.

Example:

				
					yugabyte=# EXPLAIN
yugabyte-# DELETE FROM inventory_item_quantities
yugabyte-#  WHERE ROW(id, shop_id, variant_id, location_id) IN (VALUES (1,1,1,1), (2,1,2,2));
                                                                                                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on inventory_item_quantities  (cost=0.04..4.28 rows=0 width=0)
   ->  YB Batched Nested Loop Join  (cost=0.04..4.28 rows=1 width=72)
         Join Filter: ((inventory_item_quantities.id = "*VALUES*".column1) AND (inventory_item_quantities.shop_id = "*VALUES*".column2) AND (inventory_item_quantities.variant_id = "*VALUES*".column3) AND (inventory_item_quantities.location_id = "*VALUES*".column4))
         ->  Unique  (cost=0.04..0.06 rows=2 width=56)
               ->  Sort  (cost=0.04..0.04 rows=2 width=56)
                     Sort Key: "*VALUES*".column1, "*VALUES*".column2, "*VALUES*".column3, "*VALUES*".column4
                     ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=56)
         ->  Index Scan using inventory_item_quantities_pkey on inventory_item_quantities  (cost=0.00..2.11 rows=1 width=64)
               Index Cond: (ROW(id, variant_id, location_id, shop_id) = ANY (ARRAY[ROW("*VALUES*".column1, "*VALUES*".column3, "*VALUES*".column4, "*VALUES*".column2), ROW($2, $1026, $2050, $3074), ROW($3, $1027, $2051, $3075), ..., ROW($1024, $2048, $3072, $4096)]))
(9 rows)

Tip: If you prefer a slightly more explicit form that makes the equality join obvious, use a WITH keys AS (VALUES …) CTE + USING keys:

				
					WITH keys(id, shop_id, variant_id, location_id) AS
  (VALUES (1,1,1,1), (2,2,2,2))
DELETE FROM inventory_item_quantities p
USING keys
WHERE (p.id, p.shop_id, p.variant_id, p.location_id) =
      (keys.id, keys.shop_id, keys.variant_id, keys.location_id);

Example:

				
					yugabyte=# EXPLAIN
yugabyte-# WITH keys(id, shop_id, variant_id, location_id) AS
yugabyte-#   (VALUES (1,1,1,1), (2,2,2,2))
yugabyte-# DELETE FROM inventory_item_quantities p
yugabyte-# USING keys
yugabyte-# WHERE (p.id, p.shop_id, p.variant_id, p.location_id) =
yugabyte-#       (keys.id, keys.shop_id, keys.variant_id, keys.location_id);
                                                                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on inventory_item_quantities p  (cost=0.00..4.25 rows=0 width=0)
   ->  YB Batched Nested Loop Join  (cost=0.00..4.25 rows=1 width=72)
         Join Filter: ((p.id = "*VALUES*".column1) AND (p.shop_id = "*VALUES*".column2) AND (p.variant_id = "*VALUES*".column3) AND (p.location_id = "*VALUES*".column4))
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=56)
         ->  Index Scan using inventory_item_quantities_pkey on inventory_item_quantities p  (cost=0.00..2.11 rows=1 width=64)
               Index Cond: (ROW(id, variant_id, location_id, shop_id) = ANY (ARRAY[ROW("*VALUES*".column1, "*VALUES*".column3, "*VALUES*".column4, "*VALUES*".column2), ROW($2, $1026, $2050, $3074), ROW($3, $1027, $2051, $3075), ..., ROW($1024, $2048, $3072, $4096)]))
(6 rows)

Workaround #2: Consider enabling bitmap scans (when appropriate) 🧩

If your original ROW (...) IN ((...), ...) form must be kept, enabling bitmap scans can make disjunctive predicates faster in some cases because multiple single-column indexes* can be combined at the executor. In YSQL, both enable_bitmapscan and yb_enable_bitmapscan must be on to allow bitmap scans on Yugabyte relations:

Example:

				
					yugabyte=# SHOW enable_bitmapscan;
 enable_bitmapscan
-------------------
 on
(1 row)

yugabyte=# SHOW yb_enable_bitmapscan;
 yb_enable_bitmapscan
----------------------
 off
(1 row)

yugabyte=# SET yb_enable_bitmapscan = on;
SET

yugabyte=# EXPLAIN
yugabyte-#   DELETE FROM inventory_item_quantities
yugabyte-#   WHERE (id, shop_id, variant_id, location_id) IN ((1,1,1,1), (2,2,2,2));
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Delete on inventory_item_quantities  (cost=0.24..4.30 rows=0 width=0)
   ->  YB Bitmap Table Scan on inventory_item_quantities  (cost=0.24..4.30 rows=1 width=32)
         ->  BitmapOr  (cost=0.24..0.24 rows=2 width=0)
               ->  Bitmap Index Scan on inventory_item_quantities_pkey  (cost=0.00..0.12 rows=1 width=0)
                     Index Cond: ((id = 1) AND (variant_id = 1) AND (location_id = 1) AND (shop_id = 1))
               ->  Bitmap Index Scan on inventory_item_quantities_pkey  (cost=0.00..0.12 rows=1 width=0)
                     Index Cond: ((id = 2) AND (variant_id = 2) AND (location_id = 2) AND (shop_id = 2))
(7 rows)

Toggle these per session or per statement (with EXPLAIN (SETTINGS on) if you want to see changes in plans on PG 12+). See the YB docs for yb_enable_bitmapscan and the PostgreSQL planner settings reference.

Note: Whether a bitmap plan helps depends on your schema, indexes, and data distribution. Measure!

What’s really going on? 🧠

● Planner rewrite: The PG layer rewrites ROW(col1, col2, ...) IN ((...), (...)) into a series of AND/OR comparisons.
● Index opportunities are hidden: YB’s index code can handle “row-array” comparisons natively (it already does so during batched nested loop joins), but after rewrite, the executor sees big disjunctions instead of a compact row-array key condition.
● Enhancement in progress: Preserving row-array predicates as ScanKeys allows the index access method to do exactly what you expected in the first place—multiple precise composite-key lookups. Track the enhancement here: YSQL: Row array scan should be captured as ScanKey.

Updates too (not just deletes) ✏️

Everything above works for UPDATE as well:

				
					-- Using VALUES directly
UPDATE inventory_item_quantities p
SET qty = 0
WHERE ROW(p.id, p.shop_id, p.variant_id, p.location_id)
      IN (VALUES (1,1,1,1), (2,2,2,2));

-- Or USING a CTE for keys
WITH keys(id, shop_id, variant_id, location_id) AS
  (VALUES (1,1,1,1), (2,2,2,2))
UPDATE inventory_item_quantities p
SET qty = 0
FROM keys
WHERE (p.id, p.shop_id, p.variant_id, p.location_id) =
      (keys.id, keys.shop_id, keys.variant_id, keys.location_id);

Applies to SELECTs too (not just deletes and udates) 🔍

The same behavior and optimization opportunity apply to SELECT queries that filter on multi-column tuples.

For example, this query may fall back to a Seq Scan or an Index Scan that only uses a prefix of the composite key:

				
					SELECT *
FROM inventory_item_quantities
WHERE (id, shop_id, variant_id, location_id)
      IN ((1,1,1,1), (2,2,2,2));

You’ll likely see a Storage Filter full of OR conditions, which can be slow on large tables.

Rewriting it with the ROW(...) IN (VALUES ...) form fixes that:

				
					SELECT *
FROM inventory_item_quantities p
WHERE ROW(p.id, p.shop_id, p.variant_id, p.location_id)
      IN (VALUES (1,1,1,1), (2,2,2,2));

This version produces a YB Batched Nested Loop Join with an Index Scan on the primary key … efficient and predictable.

Summary

Multi-column tuple IN clauses like WHERE (id, shop_id, variant_id, location_id) IN ((...), (...))can sometimes trigger a sequential scan in YugabyteDB, even when a composite primary key exists.

Until the planner enhancement that preserves row-array scan keys lands, you can make these statements fast and index-driven by rewriting them as:

● ROW(...) IN (VALUES (...), (...))
● or using a CTE (WITH keys AS (VALUES ...) ... USING keys)

Both produce a YB Batched Nested Loop Join with an Index Scan on the primary key … exactly what you’d expect for efficient multi-column lookups.

These rewrites work for DELETE, UPDATE and SELECT statements.

If you must keep the legacy IN ((...), (...)) form, enabling bitmap scans (yb_enable_bitmapscan=on) may still improve performance in some cases … measure to confirm.

Have Fun!