When a new YSQL backend session executes a query for the first time, PostgreSQL must read metadata from system catalog tables.
In distributed systems like YugabyteDB, those catalog lookups may involve remote RPCs before the catalog cache is populated. For workloads that open many short-lived connections (or during the first query on new connections), these catalog reads can add measurable latency.
YugabyteDB provides catalog preloading gFlags that allow commonly accessed catalog tables to be cached locally inside each backend process.
The challenge, however, is figuring out which catalog tables you should preload.
The official documentation suggests enabling logging and examining the PostgreSQL logs to identify catalog cache misses.
But in a multi-node cluster, that can mean collecting logs from many nodes.
Fortunately, there is a much easier workflow.
💡 Key Insight
You do not need to search PostgreSQL logs across a distributed cluster to determine which catalog tables your queries access.
Instead, you can:
1. Launch a single-node YugabyteDB instance using yugabyted
2. Recreate the schema and indexes
3. Enable catalog cache debug logging
4. Pipe the session output directly into a shell command
This allows you to automatically generate the exact list of catalog tables to preload.
Why Catalog Cache Preloading Helps
Each new backend session must populate its local catalog cache. If an entry is not cached yet, PostgreSQL performs a lookup against the system catalogs.
In YugabyteDB that lookup can involve:
• RPC calls
• tablet reads
• additional metadata resolution
Preloading certain catalogs can remove this overhead.
Common candidates include:
Catalog table   Why it appears
pg_cast         Type casting between datatypes
pg_operator     Operator resolution used in expressions and predicates
pg_opclass      Index operator classes used by B-tree and other index types
pg_amop         Mapping between operators and index access methods
pg_proc         Function metadata for built-in and user-defined functions
Step 1 – Launch a Single-Node YugabyteDB Instance
This node does not need to match the production topology.
The only important things are:
• Same YugabyteDB version
• Same schema
• Same queries
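A minimal local setup might look like this (the base directory is illustrative; check yugabyted --help for your version's options):

```sh
# Start a throwaway single-node cluster (assumes yugabyted is on your PATH).
yugabyted start --base_dir=/tmp/yb-catcache

# Connect with ysqlsh, the YSQL shell shipped with YugabyteDB.
ysqlsh -h 127.0.0.1
```

When you are done, the instance and its data can be discarded entirely; nothing here touches the production cluster.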
Step 2 – Create a Schema That Exercises Catalog Tables
The example below intentionally includes:
• partitioning
• triggers
• functions
• indexes
• operator usage
All of these cause catalog lookups.
CREATE DATABASE catcache_demo;
\c catcache_demo
CREATE SCHEMA app;
CREATE TABLE app.events (
tenant_id int NOT NULL,
event_id bigserial NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
payload text,
amount_text text,
PRIMARY KEY (tenant_id, created_at, event_id)
) PARTITION BY RANGE (created_at);
CREATE TABLE app.events_2026_01
PARTITION OF app.events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE app.events_2026_02
PARTITION OF app.events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
Create some functions and triggers.
CREATE OR REPLACE FUNCTION app.amount_as_numeric(t text)
RETURNS numeric
LANGUAGE sql
IMMUTABLE
AS $$
SELECT NULLIF(t,'')::numeric;
$$;
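Since the goal is to exercise pg_trigger and pg_proc as well, a minimal trigger can be added. The function and trigger below are illustrative additions (the names are invented for this example); an AFTER ROW trigger is used because it is allowed directly on a partitioned table:

```sql
CREATE OR REPLACE FUNCTION app.note_event_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
  -- The return value of an AFTER ROW trigger is ignored; this trigger
  -- exists only so the workload performs pg_trigger/pg_proc lookups.
  RETURN NEW;
END;
$$;

CREATE TRIGGER events_note_insert
  AFTER INSERT ON app.events
  FOR EACH ROW
  EXECUTE FUNCTION app.note_event_insert();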
Add indexes.
CREATE INDEX events_created_at_idx
ON app.events (created_at DESC);
CREATE INDEX events_payload_idx
ON app.events (payload);
CREATE INDEX events_amount_expr_idx
ON app.events (app.amount_as_numeric(amount_text));
Insert sample data.
INSERT INTO app.events (tenant_id, created_at, payload, amount_text)
SELECT
1,
timestamp '2026-02-10' + (g * interval '1 minute'),
'payload',
CASE WHEN g % 5 = 0 THEN '12.34' ELSE '' END
FROM generate_series(1,5000) g;
ANALYZE app.events;
Step 3 – Enable Catalog Cache Debug Logging
Now enable debugging for catalog cache lookups.
SET yb_debug_log_catcache_events = 1;
SET backtrace_functions = 'SearchCatCacheMiss';
SET client_min_messages = LOG;
These settings cause PostgreSQL to emit log lines when catalog entries are read.
We want to list the catalog tables that are needed for the following query:
SELECT tenant_id, event_id, created_at
FROM app.events
WHERE tenant_id = 1
AND created_at >= '2026-02-10'
AND created_at < '2026-02-11'
AND app.amount_as_numeric(amount_text) > 10
ORDER BY created_at DESC
LIMIT 50;
Step 4 – Extract the Catalog Tables From the Session Output
Here is the trick: rather than manually inspecting PostgreSQL logs across nodes, we can reproduce the workload locally and extract the catalog tables directly from the session output.
The following shell command runs the query and pipes both stdout and stderr through a small shell pipeline that extracts the catalog tables referenced during planning.
DEFAULT="pg_am|pg_amproc|pg_cast|pg_inherits|pg_policy|pg_proc|pg_tablespace|pg_trigger"
ysqlsh -d catcache_demo 2>&1 <<'SQL' \
| grep "Target rel:" \
| awk '{print $3}' \
| sort -u \
| awk -v re="$DEFAULT" 're=="" || $0 !~ ("^(" re ")$")' \
| paste -sd, -
SET yb_debug_log_catcache_events = 1;
SET backtrace_functions = 'SearchCatCacheMiss';
SET client_min_messages = LOG;
EXPLAIN (ANALYZE, DIST)
SELECT tenant_id, event_id, created_at
FROM app.events
WHERE tenant_id = 1
AND created_at >= '2026-02-10'
AND created_at < '2026-02-11'
AND app.amount_as_numeric(amount_text) > 10
ORDER BY created_at DESC
LIMIT 50;
SQL
The DEFAULT variable defines a list of catalog tables that YugabyteDB already preloads automatically when the flag:
ysql_catalog_preload_additional_tables=true
is enabled. Those default preloaded catalogs are:
• pg_am
• pg_amproc
• pg_cast
• pg_inherits
• pg_policy
• pg_proc
• pg_tablespace
• pg_trigger
This ensures the output only contains additional catalog tables referenced by the workload that may benefit from explicit preloading.
If you did not enable that flag, set the DEFAULT variable to an empty string ("").
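Because the extraction is plain text processing, the filtering logic can be sanity-checked without a running cluster. The sketch below feeds the pipeline a few synthetic "Target rel:" lines (invented here purely for illustration; real log lines carry more detail) and anchors the exclusion regex so that the pg_am alternative excludes only pg_am itself, not pg_amop:

```sh
# Catalogs already preloaded when ysql_catalog_preload_additional_tables=true:
DEFAULT="pg_am|pg_amproc|pg_cast|pg_inherits|pg_policy|pg_proc|pg_tablespace|pg_trigger"

# Synthetic stand-ins for the lines a debug-logging session emits.
result=$(printf '%s\n' \
    'Target rel: pg_operator (catcache miss)' \
    'Target rel: pg_cast (catcache miss)' \
    'Target rel: pg_amop (catcache miss)' \
    'Target rel: pg_opclass (catcache miss)' \
    'Target rel: pg_operator (catcache miss)' \
  | grep "Target rel:" \
  | awk '{print $3}' \
  | sort -u \
  | awk -v re="$DEFAULT" 're=="" || $0 !~ ("^(" re ")$")' \
  | paste -sd, -)

# pg_cast is filtered out (already preloaded by default), the duplicate
# pg_operator collapses, and pg_amop survives because the match is anchored.
echo "$result"
```

Running this prints pg_amop,pg_opclass,pg_operator, which is exactly the kind of comma-separated list the preload configuration expects.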
Step 5 – Configure Additional Catalog Tables to Preload
The list produced in the previous step represents catalog tables accessed by the query planner that are not already covered by the default preload list.
If desired, these tables can be supplied in the YugabyteDB TServer configuration via its catalog preload gFlag for additional tables.
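As a sketch, assuming the ysql_catalog_preload_additional_table_list gFlag (verify the exact flag name against your YugabyteDB version's reference documentation), the list produced above can be passed at startup:

```sh
# Hypothetical list taken from the Step 4 pipeline output.
yugabyted start \
  --tserver_flags="ysql_catalog_preload_additional_table_list=pg_amop,pg_opclass,pg_operator"
```

On a multi-node deployment the same flag would go into each TServer's configuration rather than a yugabyted invocation.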
Preloading additional catalog tables can improve query planning latency, but adding too many tables may have unintended side effects. Keep the following considerations in mind:
• Memory usage per backend increases. Every backend session maintains its own catalog cache, so preloading many catalog tables increases memory consumption per session.
• Backend startup time can increase. Preloading requires catalog entries to be loaded when a backend starts. Large preload lists may slightly increase connection startup time.
• Schema changes may invalidate cached entries. DDL operations can invalidate catalog cache entries, which means the cache must be repopulated.
• Only preload catalogs that appear frequently in your workload. The goal is to eliminate repeated catalog lookups for commonly used metadata, not to preload every system catalog.
Conclusion
Catalog cache tuning is a simple but powerful technique for reducing first-query planning latency in YugabyteDB.
In traditional PostgreSQL environments, catalog reads occur locally and are usually inexpensive. In a distributed system like YugabyteDB, however, catalog lookups may require RPC calls to system catalog tablets, which can introduce additional latency during query planning.
By recreating your schema on a lightweight single-node yugabyted instance and enabling catalog cache debug logging, you can easily identify which catalog tables are accessed during query planning.
Using a small shell pipeline (grep, awk, sort, and paste), the catalog tables referenced by your query workload can be extracted automatically, without searching logs across multiple nodes.
This approach turns catalog cache tuning into a repeatable, automated workflow that helps ensure YugabyteDB backends start with the metadata they need already cached, reducing planning overhead and improving query responsiveness for real-world workloads.