How to Tell If ANALYZE Was Explicitly Run on a Table in YugabyteDB

In YugabyteDB, collecting up-to-date table statistics is critical when the cost-based optimizer (CBO) is enabled. These statistics guide the optimizer in choosing efficient execution plans. YugabyteDB supports the standard PostgreSQL ANALYZE command — currently as a beta feature — and it plays an essential role in query performance tuning.

Note: The ANALYZE command is expected to reach general availability (GA) soon in YugabyteDB. I’ll update this tip once it does!

But how can you tell if a table has been explicitly analyzed?

This tip walks through how to verify whether a manual ANALYZE was run on a table in YugabyteDB.

YugabyteDB‘s CBO relies on table statistics like row counts and value distributions to make smart decisions. If these stats are missing or stale, your queries may end up using suboptimal plans — especially for complex joins or filters.

Since ANALYZE is still in beta, it’s not always run automatically. When the CBO is enabled (yb_enable_optimizer_stats = true), it’s best practice to manually run ANALYZE after major data changes.

Like in PostgreSQL, YugabyteDB tracks the last time ANALYZE or auto-analyze was run using pg_stat_all_tables. The key columns are:

  1. last_analyze: Manual ANALYZE timestamp

  2. last_autoanalyze: Auto-analyze timestamp

Example:

				
					yugabyte=# CREATE TABLE my_table(c1 INT PRIMARY KEY, c2 TEXT);
CREATE TABLE

yugabyte=# SELECT schemaname,
yugabyte-#        relname,
yugabyte-#        last_analyze
yugabyte-#   FROM pg_stat_all_tables
yugabyte-#  WHERE relname = 'my_table';
 schemaname | relname  | last_analyze
------------+----------+--------------
 public     | my_table |
(1 row)

yugabyte=# ANALYZE my_table;
ANALYZE

yugabyte=# SELECT schemaname,
yugabyte-#        relname,
yugabyte-#        last_analyze
yugabyte-#   FROM pg_stat_all_tables
yugabyte-#  WHERE relname = 'my_table';
 schemaname | relname  |         last_analyze
------------+----------+-------------------------------
 public     | my_table | 2025-05-01 15:17:35.262759+00
(1 row)
				
			

This approach works well if you’re connected to the same YugabyteDB node where ANALYZE was run. However, in most real-world setups, your YugabyteDB cluster will have multiple nodes. Since pg_stat_all_tables reflects statistics local to the node you’re connected to, you might not see the last_analyze timestamp if the ANALYZE command was executed on a different node.

				
					yugabyte=# SELECT host,
yugabyte-#        CASE WHEN host = split_part(inet_server_addr()::TEXT, '/', 1) THEN 'X' ELSE NULL END connected_host
yugabyte-#   FROM yb_servers()
yugabyte-#  ORDER BY host;
   host    | connected_host
-----------+----------------
 127.0.0.1 | X
 127.0.0.2 |
 127.0.0.3 |
 127.0.0.4 |
 127.0.0.5 |
(5 rows)

yugabyte=# \! ysqlsh -h 127.0.0.2 -c "SELECT schemaname, relname, last_analyze FROM pg_stat_all_tables WHERE relname = 'my_table';"
 schemaname | relname  | last_analyze
------------+----------+--------------
 public     | my_table |
(1 row)
				
			

A better alternative is to check the pg_class table, whose data is visible across all nodes in the cluster. If the reltuples column contains a value greater than zero, it indicates that ANALYZE has been executed on the table.

				
					yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE reltuples > 0
yugabyte-#    AND relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname | reltuples
---------+---------+-----------
(0 rows)
				
			

Wait! How come there were no records returned? It’s because the table has no rows as we didn’t insert any.

Let’s insert some new data and run ANALYZE again to refresh the table statistics.

				
					yugabyte=# INSERT INTO my_table SELECT generate_series(1, 100000);
INSERT 0 100000

yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname  | reltuples
---------+----------+-----------
 public  | my_table |         0
(1 row)

yugabyte=# ANALYZE my_table;
ANALYZE

yugabyte=# SELECT nspname, relname, reltuples
yugabyte-#   FROM pg_class
yugabyte-#   JOIN pg_namespace
yugabyte-#     ON relnamespace = pg_namespace.oid
yugabyte-#  WHERE reltuples > 0
yugabyte-#    AND relname = 'my_table'
yugabyte-#  ORDER
yugabyte-#     BY nspname, relname;
 nspname | relname  | reltuples
---------+----------+-----------
 public  | my_table |    100000
(1 row)
				
			

Notice that the pg_stat_all_tables system view reflects an updated last_analyze timestamp after we ran ANALYZE a second time. I’ve also included the analyze_count column below to show how many times the table has been analyzed.

				
					yugabyte=# SELECT schemaname,
yugabyte-#        relname,
yugabyte-#        last_analyze,
yugabyte-#        analyze_count
yugabyte-#   FROM pg_stat_all_tables
yugabyte-#  WHERE relname = 'my_table';
 schemaname | relname  |         last_analyze          | analyze_count
------------+----------+-------------------------------+---------------
 public     | my_table | 2025-05-01 15:38:08.920405+00 |             2
(1 row)
				
			

Quick Summary for YugabyteDB

  1. Use pg_stat_all_tables to check if ANALYZE was manually run — but keep in mind, this view is node-local, so you’ll need to query it on each node individually.

  2. Focus on the last_analyze timestamp to determine when the last manual ANALYZE occurred.

  3. Alternatively, query pg_class, which is cluster-wide. If the table has data and reltuples is greater than zero, that’s a strong indicator that ANALYZE has been executed.

  4. Always run ANALYZE manually after major data modifications, especially when using the cost-based optimizer.

  5. Keeping statistics up to date ensures the optimizer makes smarter decisions — resulting in faster queries and better overall performance.

Helpful SQL used in today’s tip:

				
					-- Tables Analyzed on this server
SELECT schemaname,
       relname,
       last_analyze,
       analyze_count
  FROM pg_stat_all_tables
 WHERE analyze_count > 0
 ORDER
    BY schemaname,
       relname;
				
			
				
					-- Tables analyzed previously on any node
SELECT nspname, relname, reltuples
  FROM pg_class
  JOIN pg_namespace
    ON relnamespace = pg_namespace.oid
 WHERE nspname NOT IN ('information_schema', 'pg_catalog', 'pg_toast')
   AND reltuples > 0
 ORDER
    BY nspname, relname;
				
			

Have Fun!