When working with YugabyteDB, especially in large or distributed deployments, understanding how your tables and indexes are partitioned is critical for performance, scaling, and predictable query behavior. But here’s the challenge: you may not always remember how you defined it.
Did you specify HASH or RANGE when creating that index? What if you didn’t specify it at all?
Let’s explore how to check, why it matters, and what YugabyteDB does by default.
Why Partitioning Method Matters
YugabyteDB (YSQL) distributes table and index data across tablets using either:
Hash partitioning: evenly distributes rows by hashing one or more columns.
Range partitioning: sorts data by one or more columns, useful for ordered queries or time-series data.
Choosing the wrong one can result in:
Hotspots (for range-based ingestion on a hash index)
Poor performance on range queries (on a hash-partitioned table)
Unexpected query plans or tablet layouts
What If You Don’t Specify HASH or RANGE?
If you don’t explicitly declare the partitioning method in your PRIMARY KEY or INDEX definition, YugabyteDB uses the default defined by the GUC:
yugabyte=# SELECT setting, short_desc FROM pg_settings WHERE name = 'yb_use_hash_splitting_by_default';
setting | short_desc
---------+-----------------------------------------------------------------------------------------------
on | Enables hash splitting as the default method for primary key and index sorting in LSM indexes
(1 row)
That’s the default setting as of YugbayteDB 2024.2.4.0 and below. However, that’s subject to change in future releases.
yb_use_hash_splitting_by_default is switched to off! How to Tell What You Used (After the Fact)
To check how each index (including primary keys) is partitioned, use the following SQL query:
SELECT
c.relname AS index_name,
t.relname AS table_name,
i.indisprimary AS is_primary,
pg_get_indexdef(c.oid) AS index_def,
CASE
WHEN pg_get_indexdef(c.oid) ~* '\(\s*\(' AND pg_get_indexdef(c.oid) ~* 'hash' THEN 'HASH (composite expr)'
WHEN pg_get_indexdef(c.oid) ~* 'hash' THEN 'HASH'
WHEN pg_get_indexdef(c.oid) ~* 'desc' THEN 'DESC'
WHEN pg_get_indexdef(c.oid) ~* 'asc' THEN 'ASC'
ELSE 'UNKNOWN'
END AS partition_type
FROM
pg_class c
JOIN pg_index i ON c.oid = i.indexrelid
JOIN pg_class t ON i.indrelid = t.oid
WHERE
c.relkind = 'i'
AND t.relnamespace IN (
SELECT oid FROM pg_namespace WHERE nspname NOT IN ('pg_catalog', 'information_schema')
)
ORDER BY
table_name, is_primary DESC, index_name;
This will give you a list of all indexes and whether they use HASH, ASC (range), or something else. You’ll also see whether the index is a PRIMARY KEY.
EXAMPLE:
yugabyte=# SELECT
yugabyte-# c.relname AS index_name,
yugabyte-# t.relname AS table_name,
yugabyte-# i.indisprimary AS is_primary,
yugabyte-# pg_get_indexdef(c.oid) AS index_def,
yugabyte-# CASE
yugabyte-# WHEN pg_get_indexdef(c.oid) ~* '\(\s*\(' AND pg_get_indexdef(c.oid) ~* 'hash' THEN 'HASH (composite expr)'
yugabyte-# WHEN pg_get_indexdef(c.oid) ~* 'hash' THEN 'HASH'
yugabyte-# WHEN pg_get_indexdef(c.oid) ~* 'desc' THEN 'DESC'
yugabyte-# WHEN pg_get_indexdef(c.oid) ~* 'asc' THEN 'ASC'
yugabyte-# ELSE 'UNKNOWN'
yugabyte-# END AS partition_type
yugabyte-# FROM
yugabyte-# pg_class c
yugabyte-# JOIN pg_index i ON c.oid = i.indexrelid
yugabyte-# JOIN pg_class t ON i.indrelid = t.oid
yugabyte-# WHERE
yugabyte-# c.relkind = 'i'
yugabyte-# AND t.relnamespace IN (
yugabyte(# SELECT oid FROM pg_namespace WHERE nspname NOT IN ('pg_catalog', 'information_schema')
yugabyte(# )
yugabyte-# ORDER BY
yugabyte-# table_name, is_primary DESC, index_name;
index_name | table_name | is_primary | index_def | partition_type
-----------------------+------------+------------+---------------------------------------------------------------------------------------------+-----------------------
sales_2024_pkey | sales_2024 | t | CREATE UNIQUE INDEX sales_2024_pkey ON public.sales_2024 USING lsm (id HASH, sale_date ASC) | HASH
sales_2024_amount_idx | sales_2024 | f | CREATE INDEX sales_2024_amount_idx ON public.sales_2024 USING lsm (amount HASH) | HASH
sales_2025_pkey | sales_2025 | t | CREATE UNIQUE INDEX sales_2025_pkey ON public.sales_2025 USING lsm (id HASH, sale_date ASC) | HASH
t_pkey | t | t | CREATE UNIQUE INDEX t_pkey ON public.t USING lsm ((c1, c2) HASH) | HASH (composite expr)
t_idx1 | t | f | CREATE INDEX t_idx1 ON public.t USING lsm (c2 HASH) | HASH
t_idx2 | t | f | CREATE INDEX t_idx2 ON public.t USING lsm (c2 ASC) | ASC
t_idx3 | t | f | CREATE INDEX t_idx3 ON public.t USING lsm ((c2, c4) HASH) | HASH (composite expr)
t_idx4 | t | f | CREATE INDEX t_idx4 ON public.t USING lsm (c3 ASC, c4 ASC) | ASC
t_idx6 | t | f | CREATE INDEX t_idx6 ON public.t USING lsm (c2 HASH, c5 ASC) | HASH
(9 rows)
Reference: How Partitioning Works in YugabyteDB
Under the hood, YugabyteDB uses DocDB, which divides tables/indexes into tablets. These are shards distributed across the cluster. The partitioning method determines how rows are mapped to tablets.
From YugabyteDB’s documentation on sharding:
- When using hash partitioning, the row key is hashed to a 16-byte value. Tablets are then assigned contiguous hash key ranges… For range partitioning, tablets are assigned non-overlapping ranges of the primary key values in order.
Understanding this structure is key to building systems that scale effectively, and avoid surprises later on.
Best Practices
Be explicit: Always specify
HASHorASC/DESCwhen definingPRIMARY KEYandINDEXclauses.Check your GUCs: Use
SHOW yb_use_hash_splitting_by_defaultto know what the default behavior is.Review your schema: Use the query above to inspect existing objects and validate assumptions.
Use range partitioning carefully: Especially for time-series data, to avoid hotspots and manual tablet splits.
Have Fun!
