Automating LIST Partitions

⚠️ Review in Progress This YugabyteDB Tip is currently under review for technical correctness by YugabyteDB engineering. This notice will be removed once the tip is finalized and published as official guidance.

LIST partitioning is often the best choice when rows naturally belong to discrete, enumerated values rather than continuous ranges.

Examples include:

● Batch or run identifiers
● Job execution IDs
● Tenant or customer IDs
● Workflow or pipeline instance identifiers
● Region or environment codes

In these cases, each value may represent a large volume of data, and retention is typically managed by dropping entire partitions, not by deleting rows.

However, LIST partitioning introduces an operational challenge:

LIST partitions must exist before rows can be inserted.

Unlike RANGE partitioning, there is no built-in mechanism to automatically create LIST partitions as new values appear. This is true in both PostgreSQL and YugabyteDB.

This tip shows two safe automation patterns for managing LIST partitions in YugabyteDB (YSQL), using a concrete example for clarity while keeping the solution fully general.

YugabyteDB vs PostgreSQL: important differences

Before diving into automation patterns, it’s important to understand a few YugabyteDB-specific rules for partitioned tables.

🔹 Primary key requirements (YugabyteDB)

⚠️ YugabyteDB requires the partition key to be part of the PRIMARY KEY.

				
					PRIMARY KEY (partition_key, id)

PostgreSQL does not have this requirement, but YugabyteDB enforces it to guarantee correctness and distribution across tablets.

🔹 Native partitioning only

ℹ️ YugabyteDB supports native (declarative) partitioning only.

● No trigger-based routing tables
● No inheritance-only partitioning
● No “insert into parent and redirect later” magic

This makes partition behavior predictable, but also means partitions must exist at insert time.

🔹 Trigger behavior on partitioned tables

● Triggers can be attached to individual partitions
● Triggers cannot change partition routing
● An insert that fails partition routing will never reach an AFTER trigger

This has direct implications for automation strategies, discussed below.

Example used throughout this tip

To keep the discussion concrete, we’ll use the following example:

● A table is partitioned by a LIST key named control_flag
● Each value represents a logical group (for example, an execution run)
● The value is encoded as YYYYMMDDHH24MISS
● Each group may contain hundreds of thousands of rows
● Only the most recent N values are retained

This is just an example… the patterns below apply to any LIST-partitioned key.

Step 1: Create the LIST-partitioned table

				
					CREATE TABLE run_events (
  id            bigserial,
  control_flag  bigint NOT NULL,
  payload       jsonb,
  created_at    timestamptz DEFAULT now(),
  PRIMARY KEY (control_flag, id)
)
PARTITION BY LIST (control_flag);

Step 2: Add a DEFAULT partition (required)

				
					CREATE TABLE run_events_default
  PARTITION OF run_events DEFAULT;

⚠️ Why DEFAULT matters
Without a DEFAULT partition, inserts will fail immediately if a new LIST value appears before its partition exists.

The DEFAULT partition acts as a safety net.

Step 3: Create an idempotent partition-creation procedure

This procedure creates a LIST partition for a value only if it doesn’t already exist.

				
					CREATE OR REPLACE PROCEDURE ensure_list_partition(p_value bigint)
LANGUAGE plpgsql
AS $$
DECLARE
  part_name text := format('run_events_p_%s', p_value);
BEGIN
  EXECUTE format(
    'CREATE TABLE IF NOT EXISTS %I PARTITION OF run_events FOR VALUES IN (%s)',
    part_name, p_value
  );
END;
$$;

This procedure is safe to call repeatedly and concurrently. It’s a no-op if the partition already exists.

Step 4: Create a trigger to route rows from DEFAULT

This pattern prioritizes zero application changes.

How it works

1. Inserts land in the DEFAULT partition
2. A BEFORE INSERT trigger fires
3. The trigger:
- ● Acquires an advisory lock for the LIST value
- ● Creates the partition if needed
- ● Re-inserts the row into the parent table
4. YugabyteDB routes the row to the correct LIST partition
5. The original insert into DEFAULT is suppressed

Trigger function (with advisory locking)

				
					CREATE OR REPLACE FUNCTION route_from_default()
RETURNS trigger
LANGUAGE plpgsql
AS $$
DECLARE
  list_value bigint := NEW.control_flag;
  lock_key   bigint;
BEGIN
  -- One lock per LIST value to avoid races
  lock_key := hashtextextended(list_value::text, 0);
  PERFORM pg_advisory_xact_lock(lock_key);

  -- Ensure target partition exists
  CALL ensure_list_partition(list_value);

  -- Reinsert row into parent (routes correctly)
  INSERT INTO run_events VALUES (NEW.*);

  -- Suppress insert into DEFAULT
  RETURN NULL;
END;
$$;

Attach the trigger only to the DEFAULT partition:

				
					CREATE TRIGGER trg_route_from_default
BEFORE INSERT ON run_events_default
FOR EACH ROW
EXECUTE FUNCTION route_from_default();

Step 5: Test the automation

Insert into the DEFAULT partition

				
					INSERT INTO run_events_default (control_flag, payload)
VALUES (
  20260106183015,
  '{"msg": "first row for this run"}'
);

Expected result:

				
					INSERT 0 0

⚠️ This is expected
The trigger intentionally suppresses the insert into the DEFAULT partition after routing the row.

Step 6: Verify the result

				
					SELECT * FROM run_events;

Example output:

				
					 id  |  control_flag  |              payload              |          created_at
----+----------------+-----------------------------------+-------------------------------
  1 | 20260106183015 | {"msg": "first row for this run"} | 2026-01-12 14:15:44.204279+00

Confirm the partition was created

				
					\d+ run_events

Example output:

				
					                                                             Partitioned table "public.run_events"
    Column    |           Type           | Collation | Nullable |                Default                 | Storage  | Compression | Stats target | Description
--------------+--------------------------+-----------+----------+----------------------------------------+----------+-------------+--------------+-------------
 id           | bigint                   |           | not null | nextval('run_events_id_seq'::regclass) | plain    |             |              |
 control_flag | bigint                   |           | not null |                                        | plain    |             |              |
 payload      | jsonb                    |           |          |                                        | extended |             |              |
 created_at   | timestamp with time zone |           |          | now()                                  | plain    |             |              |
Partition key: LIST (control_flag)
Indexes:
    "run_events_pkey" PRIMARY KEY, lsm (control_flag HASH, id ASC)
Partitions: run_events_p_20260106183015 FOR VALUES IN ('20260106183015'),
            run_events_default DEFAULT

At this point:

✅ The partition exists
✅ The row is stored in the correct LIST partition
✅ The DEFAULT partition remains empty

Optional verification: counts per partition

				
					SELECT
  tableoid::regclass AS partition_name,
  count(*)
FROM run_events
GROUP BY 1
ORDER BY 1;

Example output:

				
					       partition_name               | count
-----------------------------+-------
 run_events_p_20260106183015 |     1
(1 row)

Callout: DDL during ingest

⚠️ This pattern executes DDL on the write path

Trade-offs:

● CREATE TABLE … PARTITION OF runs during inserts
● Extra write amplification (row inserted twice)
● Catalog updates and locking
● First insert for a new LIST value may be slower

Use this pattern only when new LIST values appear infrequently.

Sidebar: Partition count hygiene

⚠️ Partition count matters… even in YugabyteDB

While YugabyteDB scales well, very high partition counts can still impact:

● Planning time
● Catalog operations
● DDL latency
● Operational complexity

General guidance:

● Prefer dozens or hundreds, not tens of thousands, of partitions
● Drop old partitions regularly
● Avoid creating partitions for values that will never receive data
● Monitor partition counts as part of schema hygiene

Automation should always include retention management.

Alternative: Scheduled pre-creation (recommended)

For most production workloads, it is safer to create LIST partitions before inserts occur.

This avoids DDL on the write path and produces more predictable performance.

⚠️ What NOT to do

Do not blindly derive LIST values from clock_timestamp():

				
					-- ❌ Dangerous: creates a new partition every run
to_char(clock_timestamp(), 'YYYYMMDDHH24MISS')

This can silently create an unbounded number of partitions.

LIST partitions are about enumerated values, not time.

Correct pre-creation patterns

✅ Pattern 1: Create partition at job start (simplest)

				
					CALL ensure_list_partition(<list_value>);

This is often the cleanest and safest solution.

✅ Pattern 2: Drive creation from metadata

				
					CREATE TABLE active_runs (
  run_id bigint PRIMARY KEY,
  started_at timestamptz DEFAULT now(),
  PRIMARY KEY(run_id)
);

Scheduled job:

				
					SELECT cron.schedule(
  'ensure_partitions_for_active_runs',
  '* * * * *',
  $$
    DO $do$
    DECLARE r record;
    BEGIN
      FOR r IN (SELECT run_id FROM active_runs) LOOP
        CALL ensure_list_partition(r.run_id);
      END LOOP;
    END;
    $do$;
  $$
);

This ensures partitions are created only for known values.

Why this is preferred in YugabyteDB

Benefit	Reason
Predictable performance	No DDL during ingest
Clean failure modes	Inserts either work or fail fast
Better scalability	Safe under concurrency
Easier operations	No trigger side effects

A DEFAULT partition may still be kept as a safety net, but should remain mostly empty.

Sidebar: Why `pg_partman` doesn’t fully solve LIST partition automation

pg_partman does support LIST partitioning, but with very specific constraints that are important to understand.

Internally, pg_partman treats LIST partitioning as a special case of ID-based partitioning, rather than as a fully general LIST automation feature.

How LIST works in `pg_partman`

For LIST partitioning to work in pg_partman, all of the following must be true:

● The partition key must be an integer
● The interval must be exactly 1
● Each partition represents one discrete integer value

In practice, this means pg_partman assumes a sequence like:

				
					1, 2, 3, 4, 5, …

When configured with:

				
					p_type     = 'list'
p_interval = '1'

pg_partman can:

● Predict the “next” partition using +1 arithmetic
● Pre-create partitions
● Manage retention

The key limitation

Because of this design, pg_partman does not natively support automatic LIST partitioning for:

● Text or string values (e.g. country_code = 'US', tenant = 'acme')
● Arbitrary enumerated values
● Time-encoded values that do not increment by exactly 1
● Semantic or business-defined keys

In other words, LIST partitioning in pg_partman is not equivalent to “automatic LIST partition creation” in the general sense.

Final guidance

● LIST partitioning is ideal for enumerated, high-volume groupings
● YugabyteDB requires explicit partition management
● DEFAULT + trigger provides maximum automation, at a cost
● Scheduled or explicit pre-creation is the preferred production pattern

Have Fun!