When working with large partitioned tables in YugabyteDB, performance of simple aggregate queries like COUNT(*) can vary significantly depending on how the query is written. In this post, we’ll walk through a full example: creating a partitioned table, loading millions of rows into each partition, analyzing EXPLAIN plans, and ultimately speeding up COUNT(*) operations by rewriting the query.
Create a Partitioned Table
We’ll start by creating a table named sales partitioned by sale_date (range partitioning by month):
CREATE TABLE sales (
id SERIAL,
sale_date DATE NOT NULL,
amount NUMERIC
) PARTITION BY RANGE (sale_date);
Next, create 10 monthly partitions for the year 2025:
DO $$
DECLARE
part_suffix TEXT;
BEGIN
FOR i IN 1..10 LOOP
part_suffix := LPAD(i::text, 2, '0');
EXECUTE format($f$
CREATE TABLE sales_2025_%s PARTITION OF sales
FOR VALUES FROM (DATE '2025-%s-01') TO (DATE '2025-%s-01' + INTERVAL '1 month')
$f$, part_suffix, part_suffix, part_suffix);
END LOOP;
END$$;
Load the Data
We’ll insert between 100,000 and 1,000,000 rows into each partition with random dates and amounts:
DO $$
DECLARE
part_suffix TEXT;
row_count INT;
BEGIN
FOR i IN 1..10 LOOP
part_suffix := LPAD(i::text, 2, '0');
row_count := (random() * 900000 + 100000)::INT;
EXECUTE format($sql$
INSERT INTO sales (sale_date, amount)
SELECT
DATE '2025-%s-01' + (random() * 27)::INT,
round((random() * 1000)::numeric, 2)
FROM generate_series(1, %s);
$sql$, part_suffix, row_count);
RAISE NOTICE 'Inserted % rows into partition sales_2025_%', row_count, part_suffix;
END LOOP;
END$$;
Analyzing the EXPLAIN Plan
Let’s see the default plan when we run COUNT(*) on the parent table:
yugabyte=# EXPLAIN ANALYZE SELECT COUNT(*) FROM sales;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1075.00..1075.01 rows=1 width=8) (actual time=3923.046..3923.051 rows=1 loops=1)
-> Append (cost=0.00..1050.00 rows=10000 width=0) (actual time=2.704..3346.936 rows=5275925 loops=1)
-> Seq Scan on sales_2025_01 sales_1 (cost=0.00..100.00 rows=1000 width=0) (actual time=2.702..222.155 rows=426624 loops=1)
-> Seq Scan on sales_2025_02 sales_2 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.766..408.318 rows=798466 loops=1)
-> Seq Scan on sales_2025_03 sales_3 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.788..371.596 rows=770369 loops=1)
-> Seq Scan on sales_2025_04 sales_4 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.992..130.621 rows=256695 loops=1)
-> Seq Scan on sales_2025_05 sales_5 (cost=0.00..100.00 rows=1000 width=0) (actual time=1.305..75.188 rows=148124 loops=1)
-> Seq Scan on sales_2025_06 sales_6 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.923..330.285 rows=662635 loops=1)
-> Seq Scan on sales_2025_07 sales_7 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.855..224.539 rows=438129 loops=1)
-> Seq Scan on sales_2025_08 sales_8 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.903..355.781 rows=706142 loops=1)
-> Seq Scan on sales_2025_09 sales_9 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.845..218.863 rows=418673 loops=1)
-> Seq Scan on sales_2025_10 sales_10 (cost=0.00..100.00 rows=1000 width=0) (actual time=0.813..341.535 rows=650068 loops=1)
Planning Time: 0.099 ms
Execution Time: 3923.184 ms
Peak Memory Usage: 120 kB
(15 rows)
Time: 3923.783 ms (00:03.924)
Faster: UNION ALL Trick
SELECT SUM(cnt) FROM (
SELECT COUNT(*) AS cnt FROM sales_2025_01
UNION ALL
SELECT COUNT(*) FROM sales_2025_02
UNION ALL
SELECT COUNT(*) FROM sales_2025_03
UNION ALL
SELECT COUNT(*) FROM sales_2025_04
UNION ALL
SELECT COUNT(*) FROM sales_2025_05
UNION ALL
SELECT COUNT(*) FROM sales_2025_06
UNION ALL
SELECT COUNT(*) FROM sales_2025_07
UNION ALL
SELECT COUNT(*) FROM sales_2025_08
UNION ALL
SELECT COUNT(*) FROM sales_2025_09
UNION ALL
SELECT COUNT(*) FROM sales_2025_10
) foo;
Example:
yugabyte=# EXPLAIN ANALYZE
yugabyte-# SELECT SUM(cnt) FROM (
yugabyte(# SELECT COUNT(*) AS cnt FROM sales_2025_01
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_02
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_03
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_04
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_05
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_06
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_07
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_08
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_09
yugabyte(# UNION ALL
yugabyte(# SELECT COUNT(*) FROM sales_2025_10
yugabyte(# ) foo;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1025.28..1025.29 rows=1 width=32) (actual time=786.123..786.133 rows=1 loops=1)
-> Append (cost=102.50..1025.25 rows=10 width=8) (actual time=76.087..786.107 rows=10 loops=1)
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=76.085..76.086 rows=1 loops=1)
-> Seq Scan on sales_2025_01 (cost=0.00..100.00 rows=1000 width=0) (actual time=76.071..76.076 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=112.242..112.242 rows=1 loops=1)
-> Seq Scan on sales_2025_02 (cost=0.00..100.00 rows=1000 width=0) (actual time=112.227..112.232 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=95.647..95.647 rows=1 loops=1)
-> Seq Scan on sales_2025_03 (cost=0.00..100.00 rows=1000 width=0) (actual time=95.630..95.636 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=30.434..30.435 rows=1 loops=1)
-> Seq Scan on sales_2025_04 (cost=0.00..100.00 rows=1000 width=0) (actual time=30.417..30.423 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=31.054..31.055 rows=1 loops=1)
-> Seq Scan on sales_2025_05 (cost=0.00..100.00 rows=1000 width=0) (actual time=31.035..31.041 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=105.183..105.184 rows=1 loops=1)
-> Seq Scan on sales_2025_06 (cost=0.00..100.00 rows=1000 width=0) (actual time=105.164..105.171 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=60.217..60.218 rows=1 loops=1)
-> Seq Scan on sales_2025_07 (cost=0.00..100.00 rows=1000 width=0) (actual time=60.201..60.206 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=100.034..100.035 rows=1 loops=1)
-> Seq Scan on sales_2025_08 (cost=0.00..100.00 rows=1000 width=0) (actual time=100.019..100.024 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=59.814..59.814 rows=1 loops=1)
-> Seq Scan on sales_2025_09 (cost=0.00..100.00 rows=1000 width=0) (actual time=59.797..59.803 rows=3 loops=1)
Partial Aggregate: true
-> Finalize Aggregate (cost=102.50..102.51 rows=1 width=8) (actual time=115.366..115.366 rows=1 loops=1)
-> Seq Scan on sales_2025_10 (cost=0.00..100.00 rows=1000 width=0) (actual time=115.350..115.356 rows=3 loops=1)
Partial Aggregate: true
Planning Time: 0.168 ms
Execution Time: 786.261 ms
Peak Memory Usage: 272 kB
(35 rows)
Time: 787.786 ms
That’s nearly 5x faster!
A Dynamic Count Function
To make this flexible and future-proof, create a function that dynamically counts all partitions:
CREATE OR REPLACE FUNCTION count_all_partitions(p_parent text)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
part_name TEXT;
dyn_sql TEXT;
part_count BIGINT;
total_count BIGINT := 0;
BEGIN
FOR part_name IN
SELECT inhrelid::regclass::text
FROM pg_inherits
WHERE inhparent = p_parent::regclass
LOOP
dyn_sql := format('SELECT COUNT(*) FROM %I', part_name);
EXECUTE dyn_sql INTO part_count;
total_count := total_count + part_count;
END LOOP;
RETURN total_count;
END;
$$;
Example:
yugabyte=# SELECT count_all_partitions('public.sales');
count_all_partitions
----------------------
5275925
(1 row)
Time: 759.438 ms
Conclusion
YugabyteDB’s distributed SQL layer performs well out of the box, but sometimes you can squeeze out extra performance with a deeper understanding of how queries are planned. If you’re working with large partitioned tables and want to accelerate analytics, this UNION ALL trick or a dynamic function may be just what you need.
Have Fun!
