Fast (and Slow) Ways to Insert Lots of Rows in YSQL

When inserting a large volume of rows into a YSQL table, the performance you get depends heavily on the SQL pattern you use.

Some patterns turn every row into its own cross-region Raft commit (slow, repetitive). Other patterns let YugabyteDB batch rows and minimize Raft rounds (fast, efficient).

If you repeat a per-row cross-region commit 100,000 times in a loop, the performance impact multiplies dramatically.

This is especially important in multi-region RF3 clusters, where every committed write must cross regions.

In this tip, we’ll compare three distinct INSERT approaches:

  • ❌ Row-by-row PL/pgSQL loop

  • ⚡ Set-based INSERT … SELECT generate_series

  • 🚀 Multi-row INSERT … VALUES batching

And we’ll show why set-based INSERTs perform dramatically better.

Setup: Demo YugabyteDB Cluster

We have a three-region YugabyteDB cluster:

yugabyte=# SELECT host, cloud, region, zone, host = host(inet_server_addr()) connected FROM yb_servers() ORDER BY 2, 3, 4;
      host      | cloud |  region   |    zone    | connected
----------------+-------+-----------+------------+-----------
 172.152.18.7   | aws   | us-east-1 | us-east-1a | f
 172.161.27.158 | aws   | us-east-2 | us-east-2a | t
 172.150.28.137 | aws   | us-west-1 | us-west-1a | f
(3 rows)

The output above shows that we are connected to the node in us-east-2a.

Note: Using the YugabyteDB Tip Measure Node-to-Node Latency (Even When ping Is Disabled), we can see that the latency from us-east-2 to us-east-1 is about 14 ms. 

yugabyte=# SELECT * FROM node_tcp_latency;
    src_host    | src_cloud | src_region |  src_zone  |    dst_host    | dst_cloud | dst_region |  dst_zone  | dst_port | rtt_ms |      measured_at
----------------+-----------+------------+------------+----------------+-----------+------------+------------+----------+--------+------------------------
 172.161.27.158 | aws       | us-east-2  | us-east-2a | 172.150.28.137 | aws       | us-west-1  | us-west-1a |     5433 | 52.701 | 2025-11-26 18:02:53+00
 172.161.27.158 | aws       | us-east-2  | us-east-2a | 172.152.18.7   | aws       | us-east-1  | us-east-1a |     5433 | 14.509 | 2025-11-26 18:02:53+00
(2 rows)

🔁 Correct Raft Behavior in RF3 (Critical for Understanding Performance)

When a write occurs:

  1. The leader (us-east-2) sends replication messages to all followers concurrently:

    • ● us-east-1

    • ● us-west-1

  2. The write is considered committed once the leader receives an ACK from any one follower
    (leader + 1 follower = 2 of 3, a majority).

  3. The follower with the lowest RTT naturally ACKs first — not because YB “chooses” it,
    but because all followers receive replication at the same time and the fastest one wins the ACK race.

In this cluster:

  • ● Fast follower ACK: us-east-1 (~14.5 ms)

  • ● Slow follower ACK: us-west-1 (~52.7 ms)

Therefore:

  • Commit latency ≈ RTT(leader → closest follower), i.e., ~14–15 ms per write in this cluster.
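The ACK race above can be sketched with a tiny model (illustrative only; the two RTT figures are the ones measured from node_tcp_latency earlier):

```python
# Hedged model of the RF3 commit path: the leader replicates to both
# followers at once, and the write commits as soon as one follower ACKs
# (leader + 1 follower = 2 of 3). RTTs are the measured values above.
follower_rtt_ms = {"us-east-1": 14.5, "us-west-1": 52.7}

# The fastest follower wins the ACK race, so commit latency tracks the
# minimum follower RTT, not the maximum.
commit_latency_ms = min(follower_rtt_ms.values())
print(commit_latency_ms)  # 14.5 -> ~14-15 ms per committed write
```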

Setup: Simple Demo Table

We’ll use a small demo table in a dedicated schema:

CREATE SCHEMA bulk_demo;

CREATE TABLE bulk_demo.load_test (
  id   INT PRIMARY KEY,
  col1 TEXT,
  col2 TEXT
);

Turn on timing in ysqlsh:

\timing

🐌 Option 1: Row-by-Row PL/pgSQL Loop (The “Don’t Do This” Pattern)

Here’s the naïve pattern a lot of people reach for:

TRUNCATE TABLE bulk_demo.load_test;

DO $$
BEGIN
  FOR r IN 1..100000 LOOP
    INSERT INTO bulk_demo.load_test
    VALUES (r, 'some_text', 'some_other_text');
  END LOOP;
END;
$$;

Sample run (your numbers will vary):

DO
Time: 2450.720 ms (00:02.451)

SELECT COUNT(*) FROM bulk_demo.load_test;
 count  
--------
 100000
(1 row)

Time: 40.444 ms

So we:

  • ● Inserted 100,000 rows

  • ● Took about 2.45 seconds

Why it’s slow:

  • ● 100,000 executor invocations

  • ● 100,000 Raft rounds

  • ● 100,000 cross-region quorums

  • ● Quorum cost (~14–15 ms) × 100k rows = big number

  • ● DocDB cannot batch these writes effectively

  • ● PL/pgSQL loop adds interpreter overhead

Even in a single-region cluster this is slow. In a multi-region RF3 cluster, the cost multiplies.

⚡ Option 2: Set-Based Insert with generate_series (Massively Faster)

TRUNCATE TABLE bulk_demo.load_test;

INSERT INTO bulk_demo.load_test (id, col1, col2)
SELECT g, 'some_text', 'some_other_text'
FROM generate_series(1, 100000) g;

Example:

yugabyte=# TRUNCATE TABLE bulk_demo.load_test;
TRUNCATE TABLE
Time: 678.545 ms

yugabyte=# INSERT INTO bulk_demo.load_test (id, col1, col2)
SELECT g, 'some_text', 'some_other_text'
FROM generate_series(1, 100000) g;
INSERT 0 100000
Time: 963.660 ms

Why it’s ~2.5× faster:

  • ✔ One SQL statement (not 100,000)
    • Executor setup happens once.
  • ✔ YSQL can batch thousands of rows at a time
    • Rows are streamed and accumulated in memory.
  • ✔ DocDB receives large batched writes, not tiny one-row writes
    • DocDB’s write path becomes dramatically more efficient.
  • ✔ Far fewer Raft rounds
    • Instead of 100k commits, you get tens of commits.
  • ✔ Cross-region RTT is amortized
    • You pay ~15 ms per batch, not 100k times.
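The amortization claim is simple arithmetic; here is a hedged back-of-envelope sketch (the 2,048-row batch size is a hypothetical figure for illustration, not something YSQL guarantees):

```python
import math

# Hypothetical model: 100k rows, ~2,048 rows per Raft round, and the
# ~14.5 ms quorum cost measured to the closest follower.
rows, batch_size, quorum_ms = 100_000, 2_048, 14.5

raft_rounds = math.ceil(rows / batch_size)    # 49 rounds, not 100,000
total_quorum_ms = raft_rounds * quorum_ms     # total cross-region cost
per_row_us = total_quorum_ms / rows * 1_000   # amortized per-row cost

print(raft_rounds, total_quorum_ms)  # 49 710.5
```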

🚀 Option 3: Multi-Row INSERT … VALUES (Great for App-Side Batching)

If rows originate in your application, you can batch them:

INSERT INTO bulk_demo.load_test (id, col1, col2) VALUES
  (1, 'some_text', 'some_other_text'),
  (2, 'some_text', 'some_other_text'),
  ...
  (1000, 'some_text', 'some_other_text');

Benefits:

  • ● Far fewer SQL statements

  • ● Far fewer Raft commits

  • ● Much better RPC amortization

  • ● Drastically lower latency impact

Typical batch sizes: 500–5,000 rows per insert.

If rows originate in your app (Java, Go, Python, C#, etc.), this is the ideal approach.
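As a sketch of what that batching can look like client-side (assuming Python with psycopg2 and a reachable YSQL endpoint; the table is the demo table above, and the 1,000-row batch size is just a point inside the suggested range):

```python
from itertools import islice

def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

def bulk_insert(conn, rows, size=1000):
    """One multi-row INSERT per chunk: one statement (and far fewer
    Raft rounds) per `size` rows instead of one per row."""
    with conn.cursor() as cur:
        for batch in chunked(rows, size):
            placeholders = ",".join(["(%s, %s, %s)"] * len(batch))
            flat = [v for row in batch for v in row]
            cur.execute(
                "INSERT INTO bulk_demo.load_test (id, col1, col2) VALUES "
                + placeholders,
                flat,
            )
    conn.commit()

# Usage (connection details are hypothetical):
# conn = psycopg2.connect(host="...", dbname="yugabyte", user="yugabyte")
# bulk_insert(conn, ((i, "some_text", "some_other_text") for i in range(1, 100_001)))
```

If you are on psycopg2, its packaged helper psycopg2.extras.execute_values implements the same idea without hand-building the VALUES list.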

🏁 Final Recommendation

When inserting large numbers of rows in YSQL — especially in multi-region RF3 clusters:

  • Avoid: PL/pgSQL loops and one-row-at-a-time INSERTs

  • Prefer: INSERT … SELECT

  • Or: multi-row INSERT … VALUES batching

Let YSQL batch the work, let DocDB batch the writes, and let Raft amortize cross-region latency across larger groups.

Your cluster and your load times will thank you. 😄

Have Fun!

The shopping center near my house, at the Waterfront in Homestead, PA, always puts up great holiday decorations in the square. The tree in the pic is sitting on top of the water fountain display that runs during the summer.