YugabyteDB’s Auto Analyze service helps keep table statistics fresh automatically. These statistics are important because the YSQL planner, including the cost-based optimizer, relies on accurate row-count and column statistics to choose efficient query plans.
Auto Analyze automates the execution of ANALYZE after enough table mutations occur. But when trying to confirm whether Auto Analyze has run, there is an important YugabyteDB nuance:
🔎 Key Insight
In YugabyteDB, Auto Analyze may update pg_stat_all_tables.last_analyze rather than last_autoanalyze. Because this timestamp may only be visible from the YSQL node where the analyze activity was recorded, use yb_servers() to check every YSQL endpoint and take the maximum non-null last_analyze. Also check pg_class.reltuples to confirm that optimizer statistics were refreshed.
Why Auto Analyze Matters
The planner uses table statistics to estimate row counts, selectivity, join order, and access paths. In a distributed SQL database like YugabyteDB, good statistics are especially important because a bad estimate can lead to an inefficient distributed query plan.
You can manually refresh statistics with:
ANALYZE table_name;
Auto Analyze reduces the need to do this manually by tracking table mutations and automatically running ANALYZE when configured thresholds are met.
The Question
Where can you track the last time Auto Analyze ran?
At first glance, this query seems like the obvious answer:
SELECT
schemaname,
relname,
last_analyze,
last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'big';
The last_autoanalyze column sounds like exactly what we want.
But in practice, with YugabyteDB Auto Analyze, the timestamp may appear in last_analyze instead.
Demo
Create a test table:
CREATE TABLE big(c1 INT PRIMARY KEY);
Check the table statistics before loading data:
SELECT
schemaname,
relname,
last_analyze,
last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'big';
Example output:
.schemaname | relname | last_analyze | last_autoanalyze
------------+---------+--------------+------------------
public | big | |
(1 row)
Now insert enough rows to trigger Auto Analyze:
INSERT INTO big
SELECT generate_series(1, 1000000);
After a short wait, check the table again from each YSQL endpoint.
for h in 127.0.0.1 127.0.0.2 127.0.0.3; do
echo "=== $h ==="
ysqlsh -h "$h" -c "
SELECT
schemaname,
relname,
last_analyze,
last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'big';
"
done
Example result:
=== 127.0.0.1 ===
schemaname | relname | last_analyze | last_autoanalyze
------------+---------+--------------+------------------
public | big | |
(1 row)
=== 127.0.0.2 ===
schemaname | relname | last_analyze | last_autoanalyze
------------+---------+--------------+------------------
public | big | |
(1 row)
=== 127.0.0.3 ===
schemaname | relname | last_analyze | last_autoanalyze
------------+---------+-------------------------------+------------------
public | big | 2026-05-15 22:23:13.469067+00 |
(1 row)
Notice the important detail:
last_autoanalyzeis empty.
But last_analyze is populated on one YSQL node.
That means Auto Analyze ran, but the visible timestamp was recorded in last_analyze.
Why is last_analyze populated instead of last_autoanalyze?
YugabyteDB Auto Analyze refreshes optimizer statistics by running ANALYZE on the table. In current observed behavior, that activity is reflected in pg_stat_all_tables.last_analyze, while last_autoanalyze may remain empty. So for YugabyteDB, treat last_analyze as the practical timestamp to check, and use pg_class.reltuples to confirm that the planner’s row-count estimate was refreshed.
Cluster-Wide Auto Analyze Check
Because pg_stat_all_tables.last_analyze may only be visible from the YSQL node where the analyze activity was recorded, checking a single YSQL endpoint can be misleading.
To make this easier, let’s build a small shell script that performs a cluster-wide Auto Analyze check. The script uses yb_servers() to discover the YSQL nodes, connects to each one, and reports the latest non-null last_analyze timestamp found across the cluster.
The script also checks pg_class.reltuples, which stores the planner’s estimated row count for the table. After ANALYZE runs, including when triggered by Auto Analyze, reltuples should update to reflect the approximate number of rows in the table. That makes it a useful confirmation that table statistics were refreshed, even when the last_analyze timestamp is only visible from one YSQL node.
#!/usr/bin/env bash
set -euo pipefail
# Usage:
# ./check_auto_analyze.sh [seed_host] [schema_name] [table_name]
#
# Examples:
# ./check_auto_analyze.sh
# ./check_auto_analyze.sh 127.0.0.1 public big
#
# Environment variable overrides:
# SEED_HOST=127.0.0.1 TABLE_SCHEMA=public TABLE_NAME=big ./check_auto_analyze.sh
# DB_NAME=yugabyte DB_USER=yugabyte YSQL_PORT=5433 ./check_auto_analyze.sh
SEED_HOST="${1:-${SEED_HOST:-127.0.0.1}}"
TABLE_SCHEMA="${2:-${TABLE_SCHEMA:-public}}"
TABLE_NAME="${3:-${TABLE_NAME:-big}}"
DB_NAME="${DB_NAME:-yugabyte}"
DB_USER="${DB_USER:-yugabyte}"
YSQL_PORT="${YSQL_PORT:-5433}"
echo "Seed host : ${SEED_HOST}"
echo "Database : ${DB_NAME}"
echo "User : ${DB_USER}"
echo "YSQL port : ${YSQL_PORT}"
echo "Table schema : ${TABLE_SCHEMA}"
echo "Table name : ${TABLE_NAME}"
echo
HOSTS=$(ysqlsh \
-h "$SEED_HOST" \
-p "$YSQL_PORT" \
-U "$DB_USER" \
-d "$DB_NAME" \
-Atc "
SELECT host
FROM yb_servers()
WHERE node_type = 'primary'
ORDER BY host;
")
if [[ -z "$HOSTS" ]]; then
echo "No YSQL hosts returned by yb_servers()."
exit 1
fi
TMP_RESULTS=$(mktemp)
trap 'rm -f "$TMP_RESULTS"' EXIT
echo "Checking table statistics across YSQL nodes..."
echo
for h in $HOSTS; do
echo "=== $h ==="
ysqlsh \
-h "$h" \
-p "$YSQL_PORT" \
-U "$DB_USER" \
-d "$DB_NAME" \
-c "
SELECT
inet_server_addr() AS connected_to,
n.nspname AS schema_name,
c.relname,
c.reltuples::bigint AS reltuples,
s.last_analyze,
s.last_autoanalyze
FROM pg_class c
JOIN pg_namespace n
ON n.oid = c.relnamespace
JOIN pg_stat_all_tables s
ON s.relid = c.oid
WHERE n.nspname = '${TABLE_SCHEMA}'
AND c.relname = '${TABLE_NAME}';
"
ysqlsh \
-h "$h" \
-p "$YSQL_PORT" \
-U "$DB_USER" \
-d "$DB_NAME" \
-Atc "
SELECT
inet_server_addr(),
n.nspname,
c.relname,
c.reltuples::bigint,
s.last_analyze,
s.last_autoanalyze
FROM pg_class c
JOIN pg_namespace n
ON n.oid = c.relnamespace
JOIN pg_stat_all_tables s
ON s.relid = c.oid
WHERE n.nspname = '${TABLE_SCHEMA}'
AND c.relname = '${TABLE_NAME}';
" >> "$TMP_RESULTS"
echo
done
echo "Latest non-null last_analyze found across YSQL nodes:"
echo
LATEST=$(awk -F'|' '
$5 != "" {
print
}
' "$TMP_RESULTS" | sort -t'|' -k5,5 | tail -1)
if [[ -z "$LATEST" ]]; then
echo "No non-null last_analyze value found across the checked YSQL nodes."
echo
echo "This does not necessarily mean statistics were not refreshed."
echo "Check the reltuples values shown above to see whether planner row estimates were updated."
else
echo "$LATEST" | awk -F'|' '
{
print "connected_to : " $1
print "schema_name : " $2
print "table_name : " $3
print "reltuples : " $4
print "last_analyze : " $5
print "last_autoanalyze : " $6
}
'
fi
Run it like this:
chmod +x check_auto_analyze.sh
./check_auto_analyze.sh 127.0.0.1 public big
Or with environment variables:
SEED_HOST=127.0.0.1 \
TABLE_SCHEMA=public \
TABLE_NAME=big \
DB_NAME=yugabyte \
DB_USER=yugabyte \
YSQL_PORT=5433 \
./check_auto_analyze.sh
Example:
[root@localhost ~]# ./check_auto_analyze.sh
Seed host : 127.0.0.1
Database : yugabyte
User : yugabyte
YSQL port : 5433
Table schema : public
Table name : big
Checking table statistics across YSQL nodes...
=== 127.0.0.1 ===
connected_to | schema_name | relname | reltuples | last_analyze | last_autoanalyze
--------------+-------------+---------+-----------+--------------+------------------
127.0.0.1 | public | big | 1000000 | |
(1 row)
=== 127.0.0.2 ===
connected_to | schema_name | relname | reltuples | last_analyze | last_autoanalyze
--------------+-------------+---------+-----------+--------------+------------------
127.0.0.2 | public | big | 1000000 | |
(1 row)
=== 127.0.0.3 ===
connected_to | schema_name | relname | reltuples | last_analyze | last_autoanalyze
--------------+-------------+---------+-----------+-------------------------------+------------------
127.0.0.3 | public | big | 1000000 | 2026-05-15 22:23:13.469067+00 |
(1 row)
Latest non-null last_analyze found across YSQL nodes:
connected_to : 127.0.0.3
schema_name : public
table_name : big
reltuples : 1000000
last_analyze : 2026-05-15 22:23:13.469067+00
last_autoanalyze :
Final Takeaway
Do not rely on pg_stat_all_tables.last_autoanalyze alone when checking Auto Analyze in YugabyteDB. In practice, the useful timestamp may appear in last_analyze, and that value may only be visible from the YSQL node where the analyze activity was recorded.
For the most reliable check, query every YSQL node returned by yb_servers() and use pg_class.reltuples to confirm that the planner’s row-count estimate was refreshed.
Bottom line: use last_analyze across all YSQL nodes to find when analyze activity was recorded, and use reltuples to confirm that optimizer statistics were refreshed.
Have Fun!
