When working with cost-based optimization (CBO) in a distributed SQL database like YugabyteDB, keeping your table statistics accurate is crucial for optimal query performance. But sometimes, it’s just as important to clear stale or misleading statistics — and that’s where the yb_reset_analyze_statistics
function comes in.
yugabyte=# \df yb_reset_analyze_statistics
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+-----------------------------+------------------+---------------------+------
pg_catalog | yb_reset_analyze_statistics | void | table_oid oid | func
(1 row)
Let’s dive into what it does, why you’d use it, and how it fits into your performance-tuning toolbox.
yb_reset_analyze_statistics
is a YugabyteDB-specific function that clears collected statistics for a given table — essentially resetting what the cost-based optimizer knows about that table’s row count, column distribution, and other planning-relevant metadata.
It works similarly to pg_stat_reset
, but is targeted specifically at ANALYZE
statistics.
Why Would You Want to Reset Stats?
After Bulk Deletes
If you delete a large number of rows but don’t re-run ANALYZE
, the planner may still believe the table contains far more rows than it actually does. That leads to poor plan choices (e.g., expecting sequential scans when indexes would be better).
Testing Query Plans
When tuning queries or evaluating plan stability, you may want to simulate a “cold” statistics state — i.e., what the planner sees when it has no data about a table. Resetting stats is a quick way to return to that baseline.
Forcing the Planner to Re-learn
Sometimes you’re troubleshooting poor plan selection and just want a clean slate. Resetting stats followed by an immediate ANALYZE
ensures the planner is working with current, accurate insights.
During ETL or Data Refresh Cycles
In pipelines where large tables are truncated and repopulated regularly, old stats can become instantly misleading. Resetting them before or after data load helps avoid bad plan caching
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.
How To Use It
It’s simple!
yugabyte=# CREATE TABLE my_table(c1 INT PRIMARY KEY);
CREATE TABLE
yugabyte=# INSERT INTO my_table SELECT generate_series(1, 100000);
INSERT 0 100000
yugabyte=# ANALYZE my_table;
ANALYZE
yugabyte=# SELECT yb_reset_analyze_statistics('my_table'::regclass);
yb_reset_analyze_statistics
-----------------------------
(1 row)
What Actually Gets Reset?
Under the hood, yb_reset_analyze_statistics
clears:
Row count estimates (
reltuples
) inpg_class
Histogram and MCVs in
pg_statistic
This simulates the state as if ANALYZE had never been run.
yugabyte=# ANALYZE my_table;
ANALYZE
yugabyte=# SELECT schemaname,
yugabyte-# relname,
yugabyte-# last_analyze,
yugabyte-# analyze_count
yugabyte-# FROM pg_stat_all_tables
yugabyte-# WHERE relname = 'my_table'
yugabyte-# ORDER
yugabyte-# BY schemaname,
yugabyte-# relname;
schemaname | relname | last_analyze | analyze_count
------------+----------+-------------------------------+---------------
public | my_table | 2025-05-01 21:39:52.275665+00 | 2
(1 row)
yugabyte=# SELECT nspname, relname, reltuples
yugabyte-# FROM pg_class
yugabyte-# JOIN pg_namespace
yugabyte-# ON relnamespace = pg_namespace.oid
yugabyte-# WHERE relname = 'my_table'
yugabyte-# ORDER
yugabyte-# BY nspname, relname;
nspname | relname | reltuples
---------+----------+-----------
public | my_table | 100000
(1 row)
yugabyte=# SELECT COUNT(*) FROM pg_stats WHERE tablename = 'my_table';
count
-------
1
(1 row)
Now, let’s reset the stats and check the system tables:
yugabyte=# SELECT yb_reset_analyze_statistics('my_table'::regclass);
yb_reset_analyze_statistics
-----------------------------
(1 row)
yugabyte=# SELECT schemaname,
yugabyte-# relname,
yugabyte-# last_analyze,
yugabyte-# analyze_count
yugabyte-# FROM pg_stat_all_tables
yugabyte-# WHERE relname = 'my_table'
yugabyte-# ORDER
yugabyte-# BY schemaname,
yugabyte-# relname;
schemaname | relname | last_analyze | analyze_count
------------+----------+-------------------------------+---------------
public | my_table | 2025-05-01 21:39:52.275665+00 | 2
(1 row)
yugabyte=# SELECT nspname, relname, reltuples
yugabyte-# FROM pg_class
yugabyte-# JOIN pg_namespace
yugabyte-# ON relnamespace = pg_namespace.oid
yugabyte-# WHERE relname = 'my_table'
yugabyte-# ORDER
yugabyte-# BY nspname, relname;
nspname | relname | reltuples
---------+----------+-----------
public | my_table | 0
(1 row)
yugabyte=# SELECT COUNT(*) FROM pg_stats WHERE tablename = 'my_table';
count
-------
0
(1 row)
Note that calling yb_reset_analyze_statistics
does not clear entries in pg_stat_all_tables
.
Pro Tip: To clear statistics for all tables in the current database, simply pass NULL
to yb_reset_analyze_statistics
function.
yugabyte=# CREATE TABLE t1 (c1 INT PRIMARY KEY);
CREATE TABLE
yugabyte=# CREATE TABLE t2 (c1 INT PRIMARY KEY);
CREATE TABLE
yugabyte=# CREATE TABLE t3 (c1 INT PRIMARY KEY);
CREATE TABLE
yugabyte=# WITH t1_i AS (INSERT INTO t1 SELECT generate_series(1, 100)),
yugabyte-# t2_i AS (INSERT INTO t2 SELECT generate_series(1, 10)),
yugabyte-# t3_i AS (INSERT INTO t3 SELECT generate_series(1, 50))
yugabyte-# SELECT 1;
?column?
----------
1
(1 row)
yugabyte=# ANALYZE t1, t2, t3;
ANALYZE
yugabyte=# SELECT nspname, relname, reltuples
yugabyte-# FROM pg_class
yugabyte-# JOIN pg_namespace
yugabyte-# ON relnamespace = pg_namespace.oid
yugabyte-# WHERE relname IN ('t1', 't2', 't3')
yugabyte-# ORDER
yugabyte-# BY nspname, relname;
nspname | relname | reltuples
---------+---------+-----------
public | t1 | 100
public | t2 | 10
public | t3 | 50
(3 rows)
yugabyte=# SELECT yb_reset_analyze_statistics(NULL);
yb_reset_analyze_statistics
-----------------------------
(1 row)
yugabyte=# SELECT nspname, relname, reltuples
yugabyte-# FROM pg_class
yugabyte-# JOIN pg_namespace
yugabyte-# ON relnamespace = pg_namespace.oid
yugabyte-# WHERE relname IN ('t1', 't2', 't3')
yugabyte-# ORDER
yugabyte-# BY nspname, relname;
nspname | relname | reltuples
---------+---------+-----------
public | t1 | 0
public | t2 | 0
public | t3 | 0
(3 rows)
In distributed systems like YugabyteDB, where query planning spans nodes and partitions, even small mismatches in statistics can lead to big performance surprises. The yb_reset_analyze_statistics
function gives you precise control to clear those stats and start fresh — whether for tuning, testing, or recovering from stale metadata.
Remember: Accurate statistics make for faster queries. But sometimes, the first step to accuracy is clearing the deck.
Have Fun!
