Speeding Up COUNT(*) on Partitioned Tables in YugabyteDB

When working with large partitioned tables in YugabyteDB, performance of simple aggregate queries like COUNT(*) can vary significantly depending on how the query is written. In this post, we’ll walk through a full example: creating a partitioned table, loading millions of rows into each partition, analyzing EXPLAIN plans, and ultimately speeding up COUNT(*) operations by rewriting the query.

Create a Partitioned Table

We’ll start by creating a table named sales partitioned by sale_date (range partitioning by month):

				
					CREATE TABLE sales (
    id SERIAL,
    sale_date DATE NOT NULL,
    amount NUMERIC
) PARTITION BY RANGE (sale_date);
				
			

Next, create 10 monthly partitions for the year 2025:

				
					DO $$
DECLARE
  part_suffix TEXT;
BEGIN
  FOR i IN 1..10 LOOP
    part_suffix := LPAD(i::text, 2, '0');
    EXECUTE format($f$
      CREATE TABLE sales_2025_%s PARTITION OF sales
      FOR VALUES FROM (DATE '2025-%s-01') TO (DATE '2025-%s-01' + INTERVAL '1 month')
    $f$, part_suffix, part_suffix, part_suffix);
  END LOOP;
END$$;
				
			
Load the Data

We’ll insert between 100,000 and 1,000,000 rows into each partition with random dates and amounts:

				
					DO $$
DECLARE
  part_suffix TEXT;
  row_count INT;
BEGIN
  FOR i IN 1..10 LOOP
    part_suffix := LPAD(i::text, 2, '0');
    row_count := (random() * 900000 + 100000)::INT;

    EXECUTE format($sql$
      INSERT INTO sales (sale_date, amount)
      SELECT
        DATE '2025-%s-01' + (random() * 27)::INT,
        round((random() * 1000)::numeric, 2)
      FROM generate_series(1, %s);
    $sql$, part_suffix, row_count);

    RAISE NOTICE 'Inserted % rows into partition sales_2025_%', row_count, part_suffix;
  END LOOP;
END$$;
				
			
Analyzing the EXPLAIN Plan

Let’s see the default plan when we run COUNT(*) on the parent table:

				
					yugabyte=# EXPLAIN ANALYZE SELECT COUNT(*) FROM sales;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1075.00..1075.01 rows=1 width=8) (actual time=3923.046..3923.051 rows=1 loops=1)
   ->  Append  (cost=0.00..1050.00 rows=10000 width=0) (actual time=2.704..3346.936 rows=5275925 loops=1)
         ->  Seq Scan on sales_2025_01 sales_1  (cost=0.00..100.00 rows=1000 width=0) (actual time=2.702..222.155 rows=426624 loops=1)
         ->  Seq Scan on sales_2025_02 sales_2  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.766..408.318 rows=798466 loops=1)
         ->  Seq Scan on sales_2025_03 sales_3  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.788..371.596 rows=770369 loops=1)
         ->  Seq Scan on sales_2025_04 sales_4  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.992..130.621 rows=256695 loops=1)
         ->  Seq Scan on sales_2025_05 sales_5  (cost=0.00..100.00 rows=1000 width=0) (actual time=1.305..75.188 rows=148124 loops=1)
         ->  Seq Scan on sales_2025_06 sales_6  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.923..330.285 rows=662635 loops=1)
         ->  Seq Scan on sales_2025_07 sales_7  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.855..224.539 rows=438129 loops=1)
         ->  Seq Scan on sales_2025_08 sales_8  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.903..355.781 rows=706142 loops=1)
         ->  Seq Scan on sales_2025_09 sales_9  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.845..218.863 rows=418673 loops=1)
         ->  Seq Scan on sales_2025_10 sales_10  (cost=0.00..100.00 rows=1000 width=0) (actual time=0.813..341.535 rows=650068 loops=1)
 Planning Time: 0.099 ms
 Execution Time: 3923.184 ms
 Peak Memory Usage: 120 kB
(15 rows)

Time: 3923.783 ms (00:03.924)
				
			
Faster: UNION ALL Trick
				
					SELECT SUM(cnt) FROM (
  SELECT COUNT(*) AS cnt FROM sales_2025_01
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_02
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_03
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_04
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_05
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_06
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_07
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_08
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_09
  UNION ALL
  SELECT COUNT(*) FROM sales_2025_10
) foo;
				
			

Example:

				
					yugabyte=# EXPLAIN ANALYZE
yugabyte-# SELECT SUM(cnt) FROM (
yugabyte(#   SELECT COUNT(*) AS cnt FROM sales_2025_01
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_02
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_03
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_04
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_05
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_06
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_07
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_08
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_09
yugabyte(#   UNION ALL
yugabyte(#   SELECT COUNT(*) FROM sales_2025_10
yugabyte(# ) foo;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1025.28..1025.29 rows=1 width=32) (actual time=786.123..786.133 rows=1 loops=1)
   ->  Append  (cost=102.50..1025.25 rows=10 width=8) (actual time=76.087..786.107 rows=10 loops=1)
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=76.085..76.086 rows=1 loops=1)
               ->  Seq Scan on sales_2025_01  (cost=0.00..100.00 rows=1000 width=0) (actual time=76.071..76.076 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=112.242..112.242 rows=1 loops=1)
               ->  Seq Scan on sales_2025_02  (cost=0.00..100.00 rows=1000 width=0) (actual time=112.227..112.232 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=95.647..95.647 rows=1 loops=1)
               ->  Seq Scan on sales_2025_03  (cost=0.00..100.00 rows=1000 width=0) (actual time=95.630..95.636 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=30.434..30.435 rows=1 loops=1)
               ->  Seq Scan on sales_2025_04  (cost=0.00..100.00 rows=1000 width=0) (actual time=30.417..30.423 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=31.054..31.055 rows=1 loops=1)
               ->  Seq Scan on sales_2025_05  (cost=0.00..100.00 rows=1000 width=0) (actual time=31.035..31.041 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=105.183..105.184 rows=1 loops=1)
               ->  Seq Scan on sales_2025_06  (cost=0.00..100.00 rows=1000 width=0) (actual time=105.164..105.171 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=60.217..60.218 rows=1 loops=1)
               ->  Seq Scan on sales_2025_07  (cost=0.00..100.00 rows=1000 width=0) (actual time=60.201..60.206 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=100.034..100.035 rows=1 loops=1)
               ->  Seq Scan on sales_2025_08  (cost=0.00..100.00 rows=1000 width=0) (actual time=100.019..100.024 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=59.814..59.814 rows=1 loops=1)
               ->  Seq Scan on sales_2025_09  (cost=0.00..100.00 rows=1000 width=0) (actual time=59.797..59.803 rows=3 loops=1)
                     Partial Aggregate: true
         ->  Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=115.366..115.366 rows=1 loops=1)
               ->  Seq Scan on sales_2025_10  (cost=0.00..100.00 rows=1000 width=0) (actual time=115.350..115.356 rows=3 loops=1)
                     Partial Aggregate: true
 Planning Time: 0.168 ms
 Execution Time: 786.261 ms
 Peak Memory Usage: 272 kB
(35 rows)

Time: 787.786 ms
				
			

That’s nearly 5x faster!

A Dynamic Count Function

To make this flexible and future-proof, create a function that dynamically counts all partitions:

				
					CREATE OR REPLACE FUNCTION count_all_partitions(p_parent text)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
  part_name TEXT;
  dyn_sql TEXT;
  part_count BIGINT;
  total_count BIGINT := 0;
BEGIN
  FOR part_name IN
    SELECT inhrelid::regclass::text
    FROM pg_inherits
    WHERE inhparent = p_parent::regclass
  LOOP
    dyn_sql := format('SELECT COUNT(*) FROM %I', part_name);
    EXECUTE dyn_sql INTO part_count;
    total_count := total_count + part_count;
  END LOOP;

  RETURN total_count;
END;
$$;
				
			

Example:

				
					yugabyte=# SELECT count_all_partitions('public.sales');
 count_all_partitions
----------------------
              5275925
(1 row)

Time: 759.438 ms
				
			
Conclusion

YugabyteDB’s distributed SQL layer performs well out of the box, but sometimes you can squeeze out extra performance with a deeper understanding of how queries are planned. If you’re working with large partitioned tables and want to accelerate analytics, this UNION ALL trick or a dynamic function may be just what you need.

Have Fun!

My wife and I drove to Myrtyl Beach, SC for the 4th of July weekend and got to watch the fireworks launed from the 2nd Avenue Peir!