Reset Table Statistics

When working with cost-based optimization (CBO) in a distributed SQL database like YugabyteDB, keeping your table statistics accurate is crucial for optimal query performance. But sometimes, it’s just as important to clear stale or misleading statistics — and that’s where the yb_reset_analyze_statistics function comes in.

				
					yugabyte=# \df yb_reset_analyze_statistics
                                    List of functions
   Schema   |            Name             | Result data type | Argument data types | Type
------------+-----------------------------+------------------+---------------------+------
 pg_catalog | yb_reset_analyze_statistics | void             | table_oid oid       | func
(1 row)
				
			

Let’s dive into what it does, why you’d use it, and how it fits into your performance-tuning toolbox.

yb_reset_analyze_statistics is a YugabyteDB-specific function that clears collected statistics for a given table — essentially resetting what the cost-based optimizer knows about that table’s row count, column distribution, and other planning-relevant metadata.

It works similarly to pg_stat_reset, but is targeted specifically at ANALYZE statistics.

Why Would You Want to Reset Stats?

After Bulk Deletes

If you delete a large number of rows but don’t re-run ANALYZE, the planner may still believe the table contains far more rows than it actually does. That leads to poor plan choices (e.g., expecting sequential scans when indexes would be better).

Testing Query Plans

When tuning queries or evaluating plan stability, you may want to simulate a “cold” statistics state — i.e., what the planner sees when it has no data about a table. Resetting stats is a quick way to return to that baseline.

Forcing the Planner to Re-learn

Sometimes you’re troubleshooting poor plan selection and just want a clean slate. Resetting stats followed by an immediate ANALYZE ensures the planner is working with current, accurate insights.

During ETL or Data Refresh Cycles

In pipelines where large tables are truncated and repopulated regularly, old stats can become instantly misleading. Resetting them before or after data load helps avoid bad plan caching

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.

How To Use It

It’s simple!

				
					yugabyte=# CREATE TABLE my_table(c1 INT PRIMARY KEY);
CREATE TABLE

yugabyte=# INSERT INTO my_table SELECT generate_series(1, 100000);
INSERT 0 100000

yugabyte=# ANALYZE my_table;
ANALYZE

yugabyte=# SELECT yb_reset_analyze_statistics('my_table'::regclass);
 yb_reset_analyze_statistics
-----------------------------

(1 row)
				
			
What Actually Gets Reset?

Under the hood, yb_reset_analyze_statistics clears:

  1. Row count estimates (reltuples) in pg_class

  2. Histogram and MCVs in pg_statistic

This simulates the state as if ANALYZE had never been run.

				
					yugabyte=# ANALYZE my_table;
ANALYZE

yugabyte=# SELECT schemaname,
yugabyte-#        relname,
yugabyte-#        last_analyze,
yugabyte-#        analyze_count
yugabyte-#   FROM pg_stat_all_tables
yugabyte-#  WHERE relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY schemaname,
yugabyte-#        relname;
 schemaname | relname  |         last_analyze          | analyze_count
------------+----------+-------------------------------+---------------
 public     | my_table | 2025-05-01 21:39:52.275665+00 |             2
(1 row)

yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname  | reltuples
---------+----------+-----------
 public  | my_table |    100000
(1 row)

yugabyte=# SELECT COUNT(*) FROM pg_stats WHERE tablename = 'my_table';
 count
-------
     1
(1 row)
				
			

Now, let’s reset the stats and check the system tables:

				
					yugabyte=# SELECT yb_reset_analyze_statistics('my_table'::regclass);
 yb_reset_analyze_statistics
-----------------------------

(1 row)

yugabyte=# SELECT schemaname,
yugabyte-#        relname,
yugabyte-#        last_analyze,
yugabyte-#        analyze_count
yugabyte-#   FROM pg_stat_all_tables
yugabyte-#  WHERE relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY schemaname,
yugabyte-#        relname;
 schemaname | relname  |         last_analyze          | analyze_count
------------+----------+-------------------------------+---------------
 public     | my_table | 2025-05-01 21:39:52.275665+00 |             2
(1 row)

yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname  | reltuples
---------+----------+-----------
 public  | my_table |         0
(1 row)

yugabyte=# SELECT COUNT(*) FROM pg_stats WHERE tablename = 'my_table';
 count
-------
     0
(1 row)
				
			

Note that calling yb_reset_analyze_statistics does not clear entries in pg_stat_all_tables.

Pro Tip: To clear statistics for all tables in the current database, simply pass NULL to yb_reset_analyze_statistics function.

				
					yugabyte=# CREATE TABLE t1 (c1 INT PRIMARY KEY);
CREATE TABLE

yugabyte=# CREATE TABLE t2 (c1 INT PRIMARY KEY);
CREATE TABLE

yugabyte=# CREATE TABLE t3 (c1 INT PRIMARY KEY);
CREATE TABLE

yugabyte=# WITH t1_i AS (INSERT INTO t1 SELECT generate_series(1, 100)),
yugabyte-#      t2_i AS (INSERT INTO t2 SELECT generate_series(1, 10)),
yugabyte-#      t3_i AS (INSERT INTO t3 SELECT generate_series(1, 50))
yugabyte-# SELECT 1;
 ?column?
----------
        1
(1 row)

yugabyte=# ANALYZE t1, t2, t3;
ANALYZE

yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE relname IN ('t1', 't2', 't3')
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname | reltuples
---------+---------+-----------
 public  | t1      |       100
 public  | t2      |        10
 public  | t3      |        50
(3 rows)

yugabyte=# SELECT yb_reset_analyze_statistics(NULL);
 yb_reset_analyze_statistics
-----------------------------

(1 row)

yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE relname IN ('t1', 't2', 't3')
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname | reltuples
---------+---------+-----------
 public  | t1      |         0
 public  | t2      |         0
 public  | t3      |         0
(3 rows)

				
			

In distributed systems like YugabyteDB, where query planning spans nodes and partitions, even small mismatches in statistics can lead to big performance surprises. The yb_reset_analyze_statistics function gives you precise control to clear those stats and start fresh — whether for tuning, testing, or recovering from stale metadata.

Remember: Accurate statistics make for faster queries. But sometimes, the first step to accuracy is clearing the deck.

Have Fun!

This was only my third time setting up this tent for a camping trip — this one at Shenandoah National Park. For the first time ever, it wasn’t crazy windy during setup, so naturally I got lazy and didn’t anchor it down like I usually do. Big mistake. That night, the wind showed up like it had a grudge, and the tent nearly took flight while we were sleeping. Moral of the story: always stake like a storm’s coming!