Map Tablespace Placement to Available Nodes

Tablespaces in YugabyteDB provide a way to control how and where data is stored across a distributed cluster.

By defining tablespaces with specific placement policies, users can determine the regions, zones, or nodes where tables and indexes should reside. This enables better control over data locality, fault tolerance, and performance, especially in geo-distributed environments.
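A table (or index) is then pinned to a tablespace with the TABLESPACE clause. Here is a minimal sketch using the us_east_1_ts tablespace we create later in this tip (the table and index names are purely illustrative):

-- Illustrative names; us_east_1_ts is defined in the walkthrough below
CREATE TABLE orders (id INT PRIMARY KEY, note TEXT) TABLESPACE us_east_1_ts;
CREATE INDEX orders_note_idx ON orders (note) TABLESPACE us_east_1_ts;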

Imagine a YugabyteDB cluster with nodes spread across three distinct regions:

				
yugabyte=# SELECT host, cloud, region, zone FROM yb_servers() ORDER BY cloud, region, zone;
   host    | cloud |  region   |    zone
-----------+-------+-----------+------------
 127.0.0.1 | aws   | us-east-1 | us-east-1a
 127.0.0.4 | aws   | us-east-1 | us-east-1b
 127.0.0.7 | aws   | us-east-1 | us-east-1c
 127.0.0.2 | aws   | us-east-2 | us-east-2a
 127.0.0.5 | aws   | us-east-2 | us-east-2b
 127.0.0.8 | aws   | us-east-2 | us-east-2c
 127.0.0.3 | aws   | us-west-2 | us-west-2a
 127.0.0.6 | aws   | us-west-2 | us-west-2b
 127.0.0.9 | aws   | us-west-2 | us-west-2c
(9 rows)

We’ll create a distinct tablespace for each region, enabling us to geo-locate specific tables to individual regions for better data locality and enhanced performance.

				
yugabyte=# CREATE TABLESPACE us_east_1_ts WITH (
yugabyte(#   replica_placement='{"num_replicas": 3, "placement_blocks":
yugabyte'#     [{"cloud":"aws","region":"us-east-1","zone":"us-east-1a","min_num_replicas":1,"leader_preference":1},
yugabyte'#      {"cloud":"aws","region":"us-east-1","zone":"us-east-1b","min_num_replicas":1,"leader_preference":1},
yugabyte'#      {"cloud":"aws","region":"us-east-1","zone":"us-east-1c","min_num_replicas":1,"leader_preference":1}]}'
yugabyte(#   );
CREATE TABLESPACE

yugabyte=# CREATE TABLESPACE us_east_2_ts WITH (
yugabyte(#   replica_placement='{"num_replicas": 3, "placement_blocks":
yugabyte'#     [{"cloud":"aws","region":"us-east-2","zone":"us-east-2a","min_num_replicas":1,"leader_preference":1},
yugabyte'#      {"cloud":"aws","region":"us-east-2","zone":"us-east-2b","min_num_replicas":1,"leader_preference":1},
yugabyte'#      {"cloud":"aws","region":"us-east-2","zone":"us-east-2c","min_num_replicas":1,"leader_preference":1}]}'
yugabyte(#   );
CREATE TABLESPACE

yugabyte=# CREATE TABLESPACE us_west_2_ts WITH (
yugabyte(#   replica_placement='{"num_replicas": 3, "placement_blocks":
yugabyte'#     [{"cloud":"aws","region":"us-west-2","zone":"us-west-2a","min_num_replicas":1,"leader_preference":1},
yugabyte'#      {"cloud":"aws","region":"us-west-2","zone":"us-west-2b","min_num_replicas":1,"leader_preference":1},
yugabyte'#      {"cloud":"aws","region":"us-west-2","zone":"us-west-2c","min_num_replicas":1,"leader_preference":1}]}'
yugabyte(#   );
CREATE TABLESPACE

In the previous tip, Extract Tablespace Placement Details via a Database View, we saw how to generate a clean, table-like view of tablespace details. In today’s tip, we’ll build on that concept by mapping the tablespace placement details to the corresponding nodes in the cluster.

As a bonus, we’ll also validate that the replication counts align, check for any invalid placement information, and verify the number of tables assigned to each tablespace!

First, let’s create a few tables and assign them to a tablespace.

				
yugabyte=# CREATE TABLE some_table_east_1 (id INT) TABLESPACE us_east_1_ts;
CREATE TABLE

yugabyte=# CREATE TABLE some_table_east_2 (id INT) TABLESPACE us_east_2_ts;
CREATE TABLE

yugabyte=# CREATE TABLE another_table_east_2 (id INT) TABLESPACE us_east_2_ts;
CREATE TABLE

Here’s the code for our new database view:

				
CREATE OR REPLACE VIEW tablespace_node_map_vw AS
WITH t AS (
  -- Parse the replica_placement option of each user-defined tablespace into JSONB
  SELECT spcname, (regexp_replace(spcoptions[1], '^replica_placement=', ''))::jsonb AS p
  FROM pg_tablespace
  WHERE spcname NOT IN ('pg_default','pg_global')
),
tbl_count AS (
  -- Number of tables assigned to each tablespace
  SELECT tablespace, COUNT(*) AS table_count
  FROM pg_tables
  GROUP BY tablespace
)
SELECT
  t.spcname,
  COALESCE(tc.table_count, 0) AS table_count,
  (p->>'num_replicas')::int AS num_replicas,
  b->>'cloud' AS cloud,
  b->>'region' AS region,
  b->>'zone' AS zone,
  -- If no live node matches the placement block, flag the host as UNAVAILABLE
  COALESCE(s.host, 'UNAVAILABLE') AS host,
  (b->>'min_num_replicas')::int AS min_replicas,
  b->>'leader_preference' AS leader_preference,
  -- Audit: the per-tablespace sum of min_num_replicas should equal num_replicas
  CASE
    WHEN SUM((b->>'min_num_replicas')::int) OVER (PARTITION BY t.spcname) = (p->>'num_replicas')::int
    THEN 'OK' ELSE 'MISMATCH'
  END AS replica_check
FROM t
LEFT JOIN tbl_count tc ON tc.tablespace = t.spcname,
-- Expand each placement block into its own row...
LATERAL jsonb_array_elements(p->'placement_blocks') AS b
-- ...and map it to a node reported by yb_servers()
LEFT JOIN yb_servers() s
  ON s.cloud = b->>'cloud'
 AND s.region = b->>'region'
 AND s.zone = b->>'zone'
ORDER BY t.spcname, cloud, region, zone;

Now that the view is created, we can query it to see the tablespace placement details, the mapped node hosts, the count of tables assigned to each tablespace, and an audit of the replica counts (the sum of min_replicas should equal num_replicas).

				
yugabyte=# SELECT * FROM tablespace_node_map_vw;
   spcname    | table_count | num_replicas | cloud |  region   |    zone    |   host    | min_replicas | leader_preference | replica_check
--------------+-------------+--------------+-------+-----------+------------+-----------+--------------+-------------------+---------------
 us_east_1_ts |           1 |            3 | aws   | us-east-1 | us-east-1a | 127.0.0.1 |            1 | 1                 | OK
 us_east_1_ts |           1 |            3 | aws   | us-east-1 | us-east-1b | 127.0.0.4 |            1 | 1                 | OK
 us_east_1_ts |           1 |            3 | aws   | us-east-1 | us-east-1c | 127.0.0.7 |            1 | 1                 | OK
 us_east_2_ts |           2 |            3 | aws   | us-east-2 | us-east-2a | 127.0.0.2 |            1 | 1                 | OK
 us_east_2_ts |           2 |            3 | aws   | us-east-2 | us-east-2b | 127.0.0.5 |            1 | 1                 | OK
 us_east_2_ts |           2 |            3 | aws   | us-east-2 | us-east-2c | 127.0.0.8 |            1 | 1                 | OK
 us_west_2_ts |           0 |            3 | aws   | us-west-2 | us-west-2a | 127.0.0.3 |            1 | 1                 | OK
 us_west_2_ts |           0 |            3 | aws   | us-west-2 | us-west-2b | 127.0.0.6 |            1 | 1                 | OK
 us_west_2_ts |           0 |            3 | aws   | us-west-2 | us-west-2c | 127.0.0.9 |            1 | 1                 | OK
(9 rows)

Let’s see what happens when we shut down a node in us-east-2:

				
yugabyte=# \! yugabyted stop --base_dir=~/yb08
Stopped yugabyted using config /root/yb08/conf/yugabyted.conf.

yugabyte=# -- Wait about one minute...

yugabyte=# SELECT * FROM tablespace_node_map_vw;
   spcname    | table_count | num_replicas | cloud |  region   |    zone    |    host     | min_replicas | leader_preference | replica_check
--------------+-------------+--------------+-------+-----------+------------+-------------+--------------+-------------------+---------------
 us_east_1_ts |           1 |            3 | aws   | us-east-1 | us-east-1a | 127.0.0.1   |            1 | 1                 | OK
 us_east_1_ts |           1 |            3 | aws   | us-east-1 | us-east-1b | 127.0.0.4   |            1 | 1                 | OK
 us_east_1_ts |           1 |            3 | aws   | us-east-1 | us-east-1c | 127.0.0.7   |            1 | 1                 | OK
 us_east_2_ts |           2 |            3 | aws   | us-east-2 | us-east-2a | 127.0.0.2   |            1 | 1                 | OK
 us_east_2_ts |           2 |            3 | aws   | us-east-2 | us-east-2b | 127.0.0.5   |            1 | 1                 | OK
 us_east_2_ts |           2 |            3 | aws   | us-east-2 | us-east-2c | UNAVAILABLE |            1 | 1                 | OK
 us_west_2_ts |           0 |            3 | aws   | us-west-2 | us-west-2a | 127.0.0.3   |            1 | 1                 | OK
 us_west_2_ts |           0 |            3 | aws   | us-west-2 | us-west-2b | 127.0.0.6   |            1 | 1                 | OK
 us_west_2_ts |           0 |            3 | aws   | us-west-2 | us-west-2c | 127.0.0.9   |            1 | 1                 | OK
(9 rows)

Notice that the host for the node we stopped is now marked as UNAVAILABLE. When this happens, the tables in the associated tablespace become under-replicated. In a follow-up tip, Monitoring Under-Replicated Tablets in YugabyteDB, we dive into how to identify these under-replicated tablets.
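Since the view already reports UNAVAILABLE for placement blocks that no longer map to a live node, one simple way to surface them is to filter on the host column of the view we built above:

-- Placement blocks that no longer map to a reachable node
SELECT spcname, cloud, region, zone
FROM tablespace_node_map_vw
WHERE host = 'UNAVAILABLE';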

One more thing: here’s a great example of how the replica_check column comes in handy.

				
yugabyte=# CREATE TABLESPACE some_ts WITH (
yugabyte(# replica_placement='{"num_replicas": 3, "placement_blocks":
yugabyte'# [{"cloud":"aws","region":"us-east-2","zone":"us-east-2a","min_num_replicas":1,"leader_preference":1},
yugabyte'# {"cloud":"aws","region":"us-east-2","zone":"us-east-2b","min_num_replicas":1,"leader_preference":1}]}'
yugabyte(# );
NOTICE: num_replicas is 3, and the total min_num_replicas fields is 2. The location of the additional 1 replicas among the specified zones will be decided dynamically based on the cluster load
CREATE TABLESPACE

yugabyte=# SELECT * FROM tablespace_node_map_vw WHERE replica_check = 'MISMATCH';
spcname  | table_count | num_replicas | cloud | region    | zone       | host      | min_replicas | leader_preference | replica_check
---------+-------------+--------------+-------+-----------+------------+-----------+--------------+-------------------+---------------
some_ts  | 0           | 3            | aws   | us-east-2 | us-east-2a | 127.0.0.2 | 1            | 1                 | MISMATCH
some_ts  | 0           | 3            | aws   | us-east-2 | us-east-2b | 127.0.0.5 | 1            | 1                 | MISMATCH
(2 rows)

In this case, the third replica ends up in one of the two specified zones, which means two replicas can end up sharing a single zone. That is less than ideal, especially when it comes to resilience.
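Because some_ts has no tables assigned yet, one simple way to clear the MISMATCH is to drop and recreate it with a third placement block, so that the min_num_replicas values sum to num_replicas (a sketch that mirrors the us_east_2_ts definition above):

-- some_ts has no tables yet, so it can be dropped and recreated safely
DROP TABLESPACE some_ts;

CREATE TABLESPACE some_ts WITH (
  replica_placement='{"num_replicas": 3, "placement_blocks":
    [{"cloud":"aws","region":"us-east-2","zone":"us-east-2a","min_num_replicas":1,"leader_preference":1},
     {"cloud":"aws","region":"us-east-2","zone":"us-east-2b","min_num_replicas":1,"leader_preference":1},
     {"cloud":"aws","region":"us-east-2","zone":"us-east-2c","min_num_replicas":1,"leader_preference":1}]}'
);

-- Querying tablespace_node_map_vw again should now show replica_check = 'OK' for some_ts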

Have Fun!

Maple, our daughter’s dog, half-asleep and ready for her owners to join her for a peaceful night’s sleep!