When inserting a large volume of rows into a YSQL table, the performance you get depends heavily on the SQL pattern you use.
Some patterns turn every row into its own cross-region Raft commit (slow, repetitive). Other patterns let YugabyteDB batch rows and minimize Raft rounds (fast, efficient).
If you repeat that process 100,000 times in a loop, the performance impact multiplies dramatically.
This is especially important in multi-region RF3 clusters, where every committed write must cross regions.
In this tip, we’ll compare three distinct INSERT approaches:
❌ Row-by-row PL/pgSQL loop
⚡ Set-based INSERT … SELECT generate_series
🚀 Multi-row INSERT … VALUES batching
And we’ll show why set-based INSERTs perform dramatically better
Setup: Demo YugabyteDB Cluster
We have a three region YugabyteDB cluster:
yugabyte=# SELECT host, cloud, region, zone, host = host(inet_server_addr()) connected FROM yb_servers() ORDER BY 2, 3, 4;
host | cloud | region | zone | connected
----------------+-------+-----------+------------+-----------
172.152.18.7 | aws | us-east-1 | us-east-1a | f
172.161.27.158 | aws | us-east-2 | us-east-2a | t
172.150.28.137 | aws | us-west-1 | us-west-1a | f
(3 rows)
The output above shows that we are connected to the node in us-east-2a.
🔁 Correct Raft Behavior in RF3 (Critical for Understanding Performance)
When a write occurs:
The leader (us-east-2) sends replication messages to all followers concurrently:
● us-east-1
● us-west-2
The write is considered committed once the leader receives an ACK from any one follower (Leader + 1 follower = majority of 2/3).
The follower with the lowest RTT naturally ACKs first — not because YB “chooses” it, but because all followers receive replication at the same time and the fastest one wins the ACK race.
In this cluster:
● Fast follower ACK: us-east-1 (~14.5 ms)
● Slow follower ACK: us-west-1 (~52.7 ms)
Therefore:
Commit latency ≈ RTT(leader → closest follower), i.e., ~14–15 ms per write in your cluster.
Setup: Simple Demo Table
We’ll use a small demo table in a dedicated schema:
CREATE SCHEMA bulk_demo;
CREATE TABLE bulk_demo.load_test (
id INT PRIMARY KEY,
col1 TEXT,
col2 TEXT
);
Turn on timing in ysqlsh:
\timing
🐌 Option 1: Row-by-Row PL/pgSQL Loop (The “Don’t Do This” Pattern)
Here’s the naïve pattern a lot of people reach for:
TRUNCATE TABLE bulk_demo.load_test;
DO $$
BEGIN
FOR r IN 1..100000 LOOP
INSERT INTO bulk_demo.load_test
VALUES (r, 'some_text', 'some_other_text');
END LOOP;
END;
$$;
Sample run (your numbers will vary):
DO
Time: 2450.720 ms (00:02.451)
SELECT COUNT(*) FROM bulk_demo.load_test;
count
--------
100000
(1 row)
Time: 40.444 ms
So we:
● Inserted 100,000 rows
● Took about 2.45 seconds
Why it’s slow:
● 100,000 executor invocations
● 100,000 Raft rounds
● 100,000 cross-region quorums
● Quorum cost (~14–15 ms) × 100k rows = big number
● DocDB cannot batch these writes effectively
● PL/pgSQL loop adds interpreter overhead
Even in a single-region cluster this is slow. In a multi-region RF3 cluster, the cost multiplies.
⚡ Option 2: Set-Based Insert with generate_series (Massively Faster)
TRUNCATE TABLE bulk_demo.load_test;
INSERT INTO bulk_demo.load_test (id, col1, col2)
SELECT g, 'some_text', 'some_other_text'
FROM generate_series(1, 100000) g;
Example:
yugabyte=# TRUNCATE TABLE bulk_demo.load_test;
TRUNCATE TABLE
Time: 678.545 ms
yugabyte=# INSERT INTO bulk_demo.load_test (id, col1, col2)
SELECT g, 'some_text', 'some_other_text'
FROM generate_series(1, 100000) g;
INSERT 0 100000
Time: 963.660 ms
Why it’s ~2.5× faster:
✔ One SQL statement (not 100,000)
Executor setup happens once.
✔ YSQL can batch thousands of rows at a time
Rows are streamed and accumulated in memory.
✔ DocDB receives large batched writes, not tiny one-row writes
DocDB’s write path becomes dramatically more efficient.
If rows originate in your app (Java, Go, Python, C#, etc.), this is the ideal approach.
🏁 Final Recommendation
When inserting large numbers of rows in YSQL — especially in multi-region RF3 clusters:
❌ Avoid: PL/pgSQL loops and one-row-at-a-time INSERTs
✔ Prefer: INSERT … SELECT
✔ Or: multi-row INSERT ... VALUES batching
Let YSQL batch the work, let DocDB batch the writes and let Raft amortize cross-region latency across larger groups.
Your cluster and your load times will thank you. 😄
Have Fun!
The shopping center near my house, at the Waterfront in Homestead, PA, always puts up great holiday decorations in the square. The tree in the pic is sitting on top of the water fountain display that runs during the summer.