When working with distributed databases like YugabyteDB, writing safe, performant, and transactionally consistent SQL can get tricky — especially when your app logic spans multiple tables and mutation types (like updates and inserts).
This is where Common Table Expressions (CTEs) shine.
CTEs — especially data-modifying CTEs — can give you the power to collapse multiple related operations into a single, atomic SQL statement. And in distributed systems, that single-statement approach matters a lot.
Here’s why…
Stronger Atomicity Across Tables and Rows
In a distributed database, every SQL statement may touch data across multiple nodes. Suppose you execute four separate SQL statements: an UPDATE, a second UPDATE, and two INSERTs. Each of those statements could be routed independently, possibly ending up on different nodes, and could partially succeed or fail without easy rollback. That’s a nightmare for consistency.
Note: YugabyteDB supports distributed transactions, so you can wrap the above DML statements in a BEGIN/END block for atomicity. However, executing multiple separate DML statements introduces additional latency: each one incurs its own round trip and coordination overhead, which quickly adds up.
CTEs let you express all four operations as a single SQL statement. YugabyteDB treats it as one unit, a single distributed transaction, and guarantees atomicity across all operations.
If one part fails, the whole thing rolls back.
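To make that concrete, here’s a minimal sketch of the rollback behavior, using two hypothetical tables (accounts and audit_log) that are not part of the example later in this post. The NOT NULL violation in the INSERT aborts the entire statement, so the UPDATE in the first CTE is rolled back with it:

-- Hypothetical tables, for illustration only (run in autocommit mode,
-- so each statement is its own transaction)
CREATE TABLE accounts (id INT PRIMARY KEY, balance NUMERIC);
CREATE TABLE audit_log (account_id INT NOT NULL, note TEXT);
INSERT INTO accounts VALUES (1, 100);

-- One statement, two mutations: the NULL account_id violates NOT NULL,
-- so the whole statement fails and the balance update never commits
WITH debited AS (
    UPDATE accounts
    SET balance = balance - 40
    WHERE id = 1
    RETURNING id
)
INSERT INTO audit_log (account_id, note)
SELECT NULL, 'debited 40' FROM debited;

SELECT balance FROM accounts WHERE id = 1;  -- still 100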
Cleaner, More Readable Logic
Compare these:
-- Many separate statements
BEGIN TRANSACTION;
UPDATE table_a ...
UPDATE table_b ...
INSERT INTO table_c ...
INSERT INTO table_d ...
COMMIT;
vs.
-- One statement!
WITH
    updated_a AS (...),
    updated_b AS (...),
    inserted_c AS (...),
    inserted_d AS (...)
SELECT 1;
The CTE version is self-contained and declarative. Every data transformation is visible in a single block — which is easier to read, debug, and maintain.
Fewer Network Hops and Lower Latency
In a distributed system like YugabyteDB, each separate SQL statement potentially involves:
- A new round trip from client to coordinator node
- Distributed consensus for write operations (Raft)
- Separate coordination of retries and failure handling
Collapsing everything into a single statement reduces:
- Network overhead
- Coordination time
- Transactional bookkeeping
The result? Faster, leaner, more reliable execution.
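You can verify this on your own cluster. Here’s a rough sketch (it assumes the table_a and table_b tables defined in the Quick Example below; counter names can vary by YugabyteDB version) comparing the Storage Flush Requests that EXPLAIN (ANALYZE, DIST) reports for separate statements versus one combined statement:

-- Note: ANALYZE actually executes the statements
-- Run separately, each DML is its own distributed transaction with its own flush:
EXPLAIN (ANALYZE, DIST) UPDATE table_a SET status = 'x' WHERE id = 1;
EXPLAIN (ANALYZE, DIST) UPDATE table_b SET processed = TRUE WHERE code = 'REF123';

-- Combined into one statement, the writes are batched into a single unit:
EXPLAIN (ANALYZE, DIST)
WITH updated_a AS (
    UPDATE table_a
    SET status = 'x'
    WHERE id = 1
    RETURNING reference_code
)
UPDATE table_b
SET processed = TRUE
WHERE code IN (SELECT reference_code FROM updated_a);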
Safer Transaction Patterns
Using explicit transactions (BEGIN TRANSACTION; ... COMMIT;) works fine, but in high-throughput systems, long-lived or multi-step transactions are harder to manage and can impact performance.
A multi-statement transaction:
- Increases the chance of deadlocks
- May hold locks longer than necessary
- May fail partially if the app crashes midway
CTEs sidestep this by giving you one short, powerful statement, which YugabyteDB internally wraps with the right level of isolation and consistency.
Quick Example: Cascade Updates
Imagine we have the following tables:
-- Table A: Holds main data
CREATE TABLE table_a (
    id SERIAL PRIMARY KEY,
    status TEXT,
    reference_code TEXT
);

-- Table B: Linked by reference_code
CREATE TABLE table_b (
    code TEXT PRIMARY KEY,
    processed BOOLEAN DEFAULT FALSE
);

-- Table C: Receives inserts based on Table B
CREATE TABLE table_c (
    id SERIAL PRIMARY KEY,
    code TEXT,
    source TEXT,
    created_at TIMESTAMP
);

-- Table D: Receives inserts based on Table B
CREATE TABLE table_d (
    id SERIAL PRIMARY KEY,
    code TEXT,
    details TEXT,
    created_at TIMESTAMP
);
-- Insert into Table B first since Table A references its code
INSERT INTO table_b (code) VALUES
    ('REF123'),
    ('REF456'),
    ('REF789');

-- Insert into Table A referencing codes in Table B
INSERT INTO table_a (status, reference_code) VALUES
    ('pending', 'REF123'),
    ('pending', 'REF456'),
    ('pending', 'REF789');
Initially, the tables contain the following data:
SELECT * FROM table_a ORDER BY id;
id | status | reference_code
----+---------+----------------
1 | pending | REF123
2 | pending | REF456
3 | pending | REF789
(3 rows)
SELECT * FROM table_b;
code | processed
--------+-----------
REF456 | f
REF789 | f
REF123 | f
(3 rows)
SELECT * FROM table_c ORDER BY id;
id | code | source | created_at
----+------+--------+------------
(0 rows)
SELECT * FROM table_d ORDER BY id;
id | code | details | created_at
----+------+---------+------------
(0 rows)
We want to update a row in table_a, mark the related row in table_b as processed, and then insert audit rows into table_c and table_d.
We can encapsulate that logic in a stored function, executing a single DML operation per table:
CREATE OR REPLACE FUNCTION update_status_and_cascade(
    target_id INT,
    new_status TEXT
) RETURNS VOID
LANGUAGE plpgsql
AS $$
DECLARE
    ref_code TEXT;
    src TEXT := 'system';
    now_ts TIMESTAMP := CURRENT_TIMESTAMP;
BEGIN
    -- Step 1: Update table_a and fetch reference_code
    UPDATE table_a
    SET status = new_status
    WHERE id = target_id
    RETURNING reference_code INTO ref_code;

    IF ref_code IS NOT NULL THEN
        -- Step 2: Update table_b
        UPDATE table_b
        SET processed = TRUE
        WHERE code = ref_code;

        -- Step 3: Insert into table_c
        INSERT INTO table_c (code, source, created_at)
        SELECT code, src, now_ts
        FROM table_b
        WHERE code = ref_code;

        -- Step 4: Insert into table_d
        INSERT INTO table_d (code, details, created_at)
        SELECT code, CONCAT('Processed at ', now_ts), now_ts
        FROM table_b
        WHERE code = ref_code;
    END IF;
END;
$$;
Let’s execute the function using EXPLAIN (ANALYZE, DIST) to observe storage access patterns and performance metrics.
EXPLAIN (ANALYZE, DIST) SELECT update_status_and_cascade(1, 'approved');
QUERY PLAN
------------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=4) (actual time=2193.689..2193.689 rows=1 loops=1)
Storage Table Read Requests: 2
Storage Table Read Execution Time: 0.519 ms
Storage Table Rows Scanned: 2
Storage Table Write Requests: 4
Storage Flush Requests: 2
Storage Flush Execution Time: 156.895 ms
Planning Time: 0.019 ms
Execution Time: 2461.330 ms
Storage Read Requests: 2
Storage Read Execution Time: 0.519 ms
Storage Rows Scanned: 2
Storage Write Requests: 4
Catalog Read Requests: 20
Catalog Read Execution Time: 1326.103 ms
Catalog Write Requests: 0
Storage Flush Requests: 2
Storage Flush Execution Time: 156.895 ms
Storage Execution Time: 1483.516 ms
Peak Memory Usage: 24 kB
(20 rows)
SELECT * FROM table_a ORDER BY id;
id | status | reference_code
----+----------+----------------
1 | approved | REF123
2 | pending | REF456
3 | pending | REF789
(3 rows)
SELECT * FROM table_b;
code | processed
--------+-----------
REF456 | f
REF789 | f
REF123 | t
(3 rows)
SELECT * FROM table_c ORDER BY id;
id | code | source | created_at
----+--------+--------+----------------------------
1 | REF123 | system | 2025-05-01 20:24:01.025168
(1 row)
SELECT * FROM table_d ORDER BY id;
id | code | details | created_at
----+--------+-----------------------------------------+----------------------------
1 | REF123 | Processed at 2025-05-01 20:24:01.025168 | 2025-05-01 20:24:01.025168
(1 row)
Note that the total execution time was 2.461 seconds, with the four write requests requiring two separate storage flushes.
Next, let’s create a stored function that encapsulates the same logic as above, but this time executes it using a single SQL statement:
CREATE OR REPLACE FUNCTION update_status_and_cascade_cte(
    target_id INT,
    new_status TEXT
) RETURNS VOID
LANGUAGE SQL
AS $$
    WITH updated_a AS (
        UPDATE table_a
        SET status = new_status
        WHERE id = target_id
        RETURNING reference_code
    ),
    updated_b AS (
        UPDATE table_b
        SET processed = TRUE
        WHERE code IN (SELECT reference_code FROM updated_a)
        RETURNING code
    ),
    inserted_c AS (
        INSERT INTO table_c (code, source, created_at)
        SELECT code, 'system', CURRENT_TIMESTAMP
        FROM updated_b
        RETURNING *
    ),
    inserted_d AS (
        INSERT INTO table_d (code, details, created_at)
        SELECT code, 'Processed at ' || CURRENT_TIMESTAMP, CURRENT_TIMESTAMP
        FROM updated_b
        RETURNING *
    )
    SELECT;
$$;
Now, let’s execute this new function using EXPLAIN (ANALYZE, DIST) to observe storage access patterns and performance metrics.
EXPLAIN (ANALYZE, DIST) SELECT update_status_and_cascade_cte(2, 'approved');
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=4) (actual time=760.674..760.674 rows=1 loops=1)
Storage Table Read Requests: 1
Storage Table Read Execution Time: 0.233 ms
Storage Table Rows Scanned: 1
Storage Table Write Requests: 4
Planning Time: 666.188 ms
Execution Time: 788.145 ms
Storage Read Requests: 1
Storage Read Execution Time: 0.233 ms
Storage Rows Scanned: 1
Storage Write Requests: 4
Catalog Read Requests: 21
Catalog Read Execution Time: 1397.474 ms
Catalog Write Requests: 0
Storage Flush Requests: 1
Storage Flush Execution Time: 27.391 ms
Storage Execution Time: 1425.099 ms
Peak Memory Usage: 358 kB
(18 rows)
SELECT * FROM table_a ORDER BY id;
id | status | reference_code
----+----------+----------------
1 | approved | REF123
2 | approved | REF456
3 | pending | REF789
(3 rows)
SELECT * FROM table_b;
code | processed
--------+-----------
REF456 | t
REF789 | f
REF123 | t
(3 rows)
SELECT * FROM table_c ORDER BY id;
id | code | source | created_at
----+--------+--------+----------------------------
1 | REF123 | system | 2025-05-01 20:24:01.025168
2 | REF456 | system | 2025-05-01 20:24:04.272399
(2 rows)
SELECT * FROM table_d ORDER BY id;
id | code | details | created_at
----+--------+--------------------------------------------+----------------------------
1 | REF123 | Processed at 2025-05-01 20:24:01.025168 | 2025-05-01 20:24:01.025168
2 | REF456 | Processed at 2025-05-01 20:24:04.272399+00 | 2025-05-01 20:24:04.272399
(2 rows)
Notably, the total execution time for the same workload was reduced to just 788 milliseconds, with all four write requests batched into a single storage flush.
That’s an improvement of roughly 68%!
yugabyte=# SELECT ((2.461 - .788) / 2.461 * 100)::NUMERIC(7,2) "% Improvement";
% Improvement
---------------
67.98
(1 row)
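One last design note: because update_status_and_cascade_cte returns VOID, its final SELECT; discards everything the CTEs produce. If the caller needs feedback, one variation (a sketch, not part of the benchmark above) is to count the rows the cascade touched via the RETURNING clauses:

CREATE OR REPLACE FUNCTION update_status_and_cascade_cte_count(
    target_id INT,
    new_status TEXT
) RETURNS BIGINT
LANGUAGE SQL
AS $$
    WITH updated_a AS (
        UPDATE table_a
        SET status = new_status
        WHERE id = target_id
        RETURNING reference_code
    ),
    updated_b AS (
        UPDATE table_b
        SET processed = TRUE
        WHERE code IN (SELECT reference_code FROM updated_a)
        RETURNING code
    ),
    inserted_c AS (
        INSERT INTO table_c (code, source, created_at)
        SELECT code, 'system', CURRENT_TIMESTAMP
        FROM updated_b
        RETURNING id
    ),
    inserted_d AS (
        INSERT INTO table_d (code, details, created_at)
        SELECT code, 'Processed at ' || CURRENT_TIMESTAMP, CURRENT_TIMESTAMP
        FROM updated_b
        RETURNING id
    )
    -- 0 means target_id matched no row in table_a
    SELECT (SELECT count(*) FROM inserted_c) + (SELECT count(*) FROM inserted_d);
$$;

-- Example call: returns 2 when the cascade inserts one row into each audit table
SELECT update_status_and_cascade_cte_count(3, 'approved');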
Final Thoughts
Using data-modifying CTEs isn’t just a Postgres trick — it’s a best practice when working with distributed systems like YugabyteDB. They give you better performance, stronger guarantees, and cleaner code — all in one go.
If you’re already using YugabyteDB’s Postgres-compatible SQL, you’re just one WITH clause away from safer, smarter data logic!
Have Fun!