Audit Table Stats & Last ANALYZE in YSQL

Ever see a query plan that suddenly flips to a seq scan, or cardinality estimates that are off by 100×? Nine times out of ten, it’s statistics.

In PostgreSQL (and in YugabyteDB’s YSQL layer) accurate plans depend on fresh column stats (from ANALYZE).

On top of that, distributed deployments and partitioned tables add a little twist to “where” those stats live and “how” you should look at them.

Today’s YugabyteDB Tip gives you a one‑stop query that:

  • • Tells you the table type (regular / partitioned parent / partition (child) / foreign / materialized view)

  • • Shows whether per‑column stats exist (and how many)

  • • Flags columns with STATISTICS 0 (disabled)

  • • Surfaces last_analyze / last_autoanalyze (feature coming soon!) and how many rows changed since

  • • Works across all non‑system tables or one specific table

Why stats matter for the Cost‑Based Optimizer

YugabyteDB’s YSQL planner now uses a cost‑based optimizer (CBO) in recent releases. The CBO compares alternative plans (join orders/algorithms, scans, index usage) and picks the lowest‑cost route. 

Those cost and selectivity estimates depend heavily on column statistics from ANALYZE. When stats are missing or stale, the CBO can pick an expensive plan (e.g., the wrong join order or scan type). Keeping stats fresh is essential to get the best of the distributed‑aware cost model (which considers remote lookups, network round‑trips, and storage characteristics).

For background, see the official docs: Query Planner / CBO.

The One-Stop Query

I promised a one-stop query… but I prefer to wrap the logic in a database view!

Meet your pocket-sized stats health dashboard: it scans all non-system tables and surfaces exactly what the cost-based optimizer needs… table type, estimated rows, which columns have stats (or are disabled), and the node-specific ANALYZE timestamps and counters.

				
					CREATE OR REPLACE VIEW table_stats_analyze_info_vw AS
SELECT
  t.schemaname,
  t.tablename,
  CASE
    WHEN t.relkind = 'f' THEN 'foreign table'
    WHEN t.relkind = 'p' THEN 'partitioned table (parent)'
    WHEN t.relispartition THEN 'partition (child)'
    WHEN t.relkind = 'r' THEN 'regular table'
    WHEN t.relkind = 'm' THEN 'materialized view'
    ELSE 'other'
  END AS table_type,
  t.est_rows,
  COALESCE(s.total_cols, 0) AS total_cols,
  COALESCE(s.stats_cols, 0) AS stats_cols,
  (COALESCE(s.stats_cols, 0) > 0) AS has_stats,
  COALESCE(s.columns_with_stats_disabled, 0) AS columns_with_stats_disabled,
  st.last_analyze,
  st.last_autoanalyze,
  st.analyze_count,
  st.autoanalyze_count,
  st.n_mod_since_analyze
FROM
(
SELECT
  c.oid AS relid,
  n.nspname AS schemaname,
  c.relname AS tablename,
  c.relkind,
  c.relispartition,
  c.reltuples::bigint AS est_rows
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','p','f','m')
  AND n.nspname <> 'information_schema'
  AND n.nspname NOT LIKE 'pg_%'
) AS t
LEFT JOIN
(
SELECT
  a.attrelid AS relid,
  COUNT(*) FILTER (WHERE a.attnum > 0 AND NOT a.attisdropped) AS total_cols,
  COUNT(*) FILTER (WHERE a.attnum > 0 AND NOT a.attisdropped AND a.attstattarget = 0) AS columns_with_stats_disabled,
  COUNT(ps.attname) AS stats_cols
FROM pg_attribute a
JOIN pg_class c2 ON c2.oid = a.attrelid
JOIN pg_namespace n2 ON n2.oid = c2.relnamespace
LEFT JOIN pg_stats ps ON ps.schemaname = n2.nspname
  AND ps.tablename = c2.relname
  AND ps.attname = a.attname
WHERE a.attnum > 0 AND NOT a.attisdropped
  AND n2.nspname <> 'information_schema'
  AND n2.nspname NOT LIKE 'pg_%'
GROUP BY a.attrelid
) AS s
ON s.relid = t.relid
LEFT JOIN pg_stat_all_tables st ON st.relid = t.relid
ORDER BY t.schemaname, t.tablename;
				
			
Interpreting the Output
  • • table_type – Partitioned parents often have no stats; check the children.

  • • est_rows – Planner’s row estimate; ~0 usually means empty or never analyzed.

  • • has_stats / stats_cols – false on non‑empty tables → disabled stats, never analyzed, or parent table.

  • • columns_with_stats_disabled – Columns set to STATISTICS 0.

  • • last_analyze / last_autoanalyze (node‑specific) – Timestamps from the YSQL server you’re connected to.

  • • n_mod_since_analyze (node‑specific) – Counter of rows modified since last analyze.

Examples
				
					[root@localhost yb]# y -h 127.0.0.1 -c "SELECT * FROM table_stats_analyze_info_vw;"
 schemaname | tablename |         table_type         | est_rows | total_cols | stats_cols | has_stats | columns_with_stats_disabled |         last_analyze          | last_autoanalyze | analyze_count | autoanalyze_count | n_mod_since_analyze
------------+-----------+----------------------------+----------+------------+------------+-----------+-----------------------------+-------------------------------+------------------+---------------+-------------------+---------------------
 public     | test1     | regular table              |       -1 |          1 |          0 | f         |                           0 |                               |                  |             0 |                 0 |                   0
 public     | test2     | partitioned table (parent) |   343468 |          2 |          2 | t         |                           0 | 2025-08-29 23:10:45.39364+00  |                  |             1 |                 0 |                   0
 public     | test2_p1  | partition (child)          |     1000 |          2 |          2 | t         |                           0 | 2025-08-29 23:10:45.416014+00 |                  |             1 |                 0 |                   0
 public     | test2_p2  | partition (child)          |   342234 |          2 |          2 | t         |                           0 | 2025-08-29 23:10:45.635014+00 |                  |             1 |                 0 |                   0
 public     | test2_p3  | partition (child)          |      234 |          2 |          2 | t         |                           0 | 2025-08-29 23:10:45.654398+00 |                  |             1 |                 0 |                   0
(5 rows)

[root@localhost yb]# y -h 127.0.0.2 -c "SELECT * FROM table_stats_analyze_info_vw;"
 schemaname | tablename |         table_type         | est_rows | total_cols | stats_cols | has_stats | columns_with_stats_disabled | last_analyze | last_autoanalyze | analyze_count | autoanalyze_count | n_mod_since_analyze
------------+-----------+----------------------------+----------+------------+------------+-----------+-----------------------------+--------------+------------------+---------------+-------------------+---------------------
 public     | test1     | regular table              |       -1 |          1 |          0 | f         |                           0 |              |                  |             0 |                 0 |                   0
 public     | test2     | partitioned table (parent) |   343468 |          2 |          2 | t         |                           0 |              |                  |             0 |                 0 |                   0
 public     | test2_p1  | partition (child)          |     1000 |          2 |          2 | t         |                           0 |              |                  |             0 |                 0 |                   0
 public     | test2_p2  | partition (child)          |   342234 |          2 |          2 | t         |                           0 |              |                  |             0 |                 0 |                   0
 public     | test2_p3  | partition (child)          |      234 |          2 |          2 | t         |                           0 |              |                  |             0 |                 0 |                   0
(5 rows)
				
			

Notice that column stats are visible from any node, but ANALYZE metadata (timestamps/counters) is node-specific—you’ll only see the values for the tserver you query the view from.

Notes & Gotchas
  • • Permissions: pg_stats is readable by everyone, but MCV/histogram arrays are hidden without SELECT on the table. This query works fine without superuser.

  • • Foreign tables: some FDWs don’t support ANALYZE; consider manual column stats settings if needed.

  • • YugabyteDB specifics: YSQL’s auto‑analyze mirrors Postgres. For many partitions or heavy churn, consider targeted ANALYZE jobs during low‑traffic windows.

Have Fun!

My wife only planted a small garden, not a farmers market… but here we are with today's harvest! 🍅🥒