Fast (and Slow) Ways to Insert Lots of Rows in YSQL

When inserting a large volume of rows into a YSQL table, the performance you get depends heavily on the SQL pattern you use.

Some patterns turn every row into its own cross-region Raft commit (slow, repetitive). Other patterns let YugabyteDB batch rows and minimize Raft rounds (fast, efficient).

If you repeat a per-row cross-region commit 100,000 times in a loop, the performance impact multiplies dramatically.

This is especially important in multi-region RF3 clusters, where every committed write must cross regions.

In this tip, we’ll compare three distinct INSERT approaches:

  • ❌ Row-by-row PL/pgSQL loop

  • ⚡ Set-based INSERT … SELECT generate_series

  • 🚀 Multi-row INSERT … VALUES batching

And we’ll show why set-based INSERTs perform dramatically better.

Setup: Demo YugabyteDB Cluster

We have a three-region YugabyteDB cluster:

yugabyte=# SELECT host, cloud, region, zone, host = host(inet_server_addr()) connected FROM yb_servers() ORDER BY 2, 3, 4;
      host      | cloud |  region   |    zone    | connected
----------------+-------+-----------+------------+-----------
 172.152.18.7   | aws   | us-east-1 | us-east-1a | f
 172.161.27.158 | aws   | us-east-2 | us-east-2a | t
 172.150.28.137 | aws   | us-west-1 | us-west-1a | f
(3 rows)

The output above shows that we are connected to the node in us-east-2a.

Note: Using the YugabyteDB Tip Measure Node-to-Node Latency (Even When ping Is Disabled), we can see that the latency from us-east-2 to us-east-1 is about 14 ms. 

yugabyte=# SELECT * FROM node_tcp_latency;
    src_host    | src_cloud | src_region |  src_zone  |    dst_host    | dst_cloud | dst_region |  dst_zone  | dst_port | rtt_ms |      measured_at
----------------+-----------+------------+------------+----------------+-----------+------------+------------+----------+--------+------------------------
 172.161.27.158 | aws       | us-east-2  | us-east-2a | 172.150.28.137 | aws       | us-west-1  | us-west-1a |     5433 | 52.701 | 2025-11-26 18:02:53+00
 172.161.27.158 | aws       | us-east-2  | us-east-2a | 172.152.18.7   | aws       | us-east-1  | us-east-1a |     5433 | 14.509 | 2025-11-26 18:02:53+00
(2 rows)

🔁 Correct Raft Behavior in RF3 (Critical for Understanding Performance)

When a write occurs:

  1. The leader (us-east-2) sends replication messages to all followers concurrently:

    • ● us-east-1

    • ● us-west-1

  2. The write is considered committed once the leader receives an ACK from any one follower
    (leader + 1 follower = 2 of 3, a majority).

  3. The follower with the lowest RTT naturally ACKs first — not because YB “chooses” it,
    but because all followers receive replication at the same time and the fastest one wins the ACK race.

In this cluster:

  • ● Fast follower ACK: us-east-1 (~14.5 ms)

  • ● Slow follower ACK: us-west-1 (~52.7 ms)

Therefore:

  • Commit latency ≈ RTT(leader → closest follower), i.e., ~14–15 ms per write in this cluster.
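The ACK race above can be sketched with a tiny model (illustrative only; the two RTT figures are the ones measured from node_tcp_latency earlier):

```python
# Hedged model of the RF3 commit path: the leader replicates to both
# followers at once, and the write commits as soon as one follower ACKs
# (leader + 1 follower = 2 of 3). RTTs are the measured values above.
follower_rtt_ms = {"us-east-1": 14.5, "us-west-1": 52.7}

# The fastest follower wins the ACK race, so commit latency tracks the
# minimum follower RTT, not the maximum.
commit_latency_ms = min(follower_rtt_ms.values())
print(commit_latency_ms)  # 14.5 -> ~14-15 ms per committed write
```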

Setup: Simple Demo Table

We’ll use a small demo table in a dedicated schema:

CREATE SCHEMA bulk_demo;

CREATE TABLE bulk_demo.load_test (
  id   INT PRIMARY KEY,
  col1 TEXT,
  col2 TEXT
);

Turn on timing in ysqlsh:

\timing

🐌 Option 1: Row-by-Row PL/pgSQL Loop (The “Don’t Do This” Pattern)

Here’s the naïve pattern a lot of people reach for:

TRUNCATE TABLE bulk_demo.load_test;

DO $$
BEGIN
  FOR r IN 1..100000 LOOP
    INSERT INTO bulk_demo.load_test
    VALUES (r, 'some_text', 'some_other_text');
  END LOOP;
END;
$$;

Sample run (your numbers will vary):

DO
Time: 2450.720 ms (00:02.451)

SELECT COUNT(*) FROM bulk_demo.load_test;
 count  
--------
 100000
(1 row)

Time: 40.444 ms

So we:

  • ● Inserted 100,000 rows

  • ● Took about 2.45 seconds

Why it’s slow:

  • ● 100,000 executor invocations

  • ● 100,000 Raft rounds

  • ● 100,000 cross-region quorums

  • ● Quorum cost (~14–15 ms) × 100k rows = big number

  • ● DocDB cannot batch these writes effectively

  • ● PL/pgSQL loop adds interpreter overhead

Even in a single-region cluster this is slow. In a multi-region RF3 cluster, the cost multiplies.

⚡ Option 2: Set-Based Insert with generate_series (Massively Faster)

TRUNCATE TABLE bulk_demo.load_test;

INSERT INTO bulk_demo.load_test (id, col1, col2)
SELECT g, 'some_text', 'some_other_text'
FROM generate_series(1, 100000) g;

Example:

yugabyte=# TRUNCATE TABLE bulk_demo.load_test;
TRUNCATE TABLE
Time: 678.545 ms

yugabyte=# INSERT INTO bulk_demo.load_test (id, col1, col2)
SELECT g, 'some_text', 'some_other_text'
FROM generate_series(1, 100000) g;
INSERT 0 100000
Time: 963.660 ms

Why it’s ~2.5× faster:

  • ✔ One SQL statement (not 100,000)
    • Executor setup happens once.
  • ✔ YSQL can batch thousands of rows at a time
    • Rows are streamed and accumulated in memory.
  • ✔ DocDB receives large batched writes, not tiny one-row writes
    • DocDB’s write path becomes dramatically more efficient.
  • ✔ Far fewer Raft rounds
    • Instead of 100k commits, you get tens of commits.
  • ✔ Cross-region RTT is amortized
    • You pay ~15 ms per batch, not 100k times.
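The amortization claim is simple arithmetic; here is a hedged back-of-envelope sketch (the 2,048-row batch size is a hypothetical figure for illustration, not something YSQL guarantees):

```python
import math

# Hypothetical model: 100k rows, ~2,048 rows per Raft round, and the
# ~14.5 ms quorum cost measured to the closest follower.
rows, batch_size, quorum_ms = 100_000, 2_048, 14.5

raft_rounds = math.ceil(rows / batch_size)    # 49 rounds, not 100,000
total_quorum_ms = raft_rounds * quorum_ms     # total cross-region cost
per_row_us = total_quorum_ms / rows * 1_000   # amortized per-row cost

print(raft_rounds, total_quorum_ms)  # 49 710.5
```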

🚀 Option 3: Multi-Row INSERT … VALUES (Great for App-Side Batching)

If rows originate in your application, you can batch them:

INSERT INTO bulk_demo.load_test (id, col1, col2) VALUES
  (1, 'some_text', 'some_other_text'),
  (2, 'some_text', 'some_other_text'),
  ...
  (1000, 'some_text', 'some_other_text');

Benefits:

  • ● Far fewer SQL statements

  • ● Far fewer Raft commits

  • ● Much better RPC amortization

  • ● Drastically lower latency impact

Typical batch sizes: 500–5,000 rows per insert.

If rows originate in your app (Java, Go, Python, C#, etc.), this is the ideal approach.
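As a sketch of what that batching can look like client-side (assuming Python with psycopg2 and a reachable YSQL endpoint; the table is the demo table above, and the 1,000-row batch size is just a point inside the suggested range):

```python
from itertools import islice

def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

def bulk_insert(conn, rows, size=1000):
    """One multi-row INSERT per chunk: one statement (and far fewer
    Raft rounds) per `size` rows instead of one per row."""
    with conn.cursor() as cur:
        for batch in chunked(rows, size):
            placeholders = ",".join(["(%s, %s, %s)"] * len(batch))
            flat = [v for row in batch for v in row]
            cur.execute(
                "INSERT INTO bulk_demo.load_test (id, col1, col2) VALUES "
                + placeholders,
                flat,
            )
    conn.commit()

# Usage (connection details are hypothetical):
# conn = psycopg2.connect(host="...", dbname="yugabyte", user="yugabyte")
# bulk_insert(conn, ((i, "some_text", "some_other_text") for i in range(1, 100_001)))
```

If you are on psycopg2, its packaged helper psycopg2.extras.execute_values implements the same idea without hand-building the VALUES list.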

🏁 Final Recommendation

When inserting large numbers of rows in YSQL — especially in multi-region RF3 clusters:

  • Avoid: PL/pgSQL loops and one-row-at-a-time INSERTs

  • Prefer: INSERT … SELECT

  • Or: multi-row INSERT … VALUES batching

Let YSQL batch the work, let DocDB batch the writes, and let Raft amortize cross-region latency across larger groups.

Your cluster and your load times will thank you. 😄

Have Fun!

The shopping center near my house, at the Waterfront in Homestead, PA, always puts up great holiday decorations in the square. The tree in the pic is sitting on top of the water fountain display that runs during the summer.