Map a Table Row to the Tablet Leader Node

In a previous tip, we learned that in recent versions of YugabyteDB, we can use the YB_LOCAL_TABLETS system view to map table rows to their tablets.

We also learned in the YugabyteDB tip View Metadata for YSQL/YCQL/System Tablets on a Server that yb_local_tablets only became available in YB 2024.1.0.

In today’s tip, we explore how to map a table row to the tablet leader node in YugabyteDB versions older than 2024.1.0.

To accomplish this, we’ll need a few utility functions to handle key tasks for us.

The first function, get_table_id, was introduced in the YugabyteDB tip Display a Table’s TABLE_ID in YSQL:

				
					CREATE OR REPLACE FUNCTION get_table_id(schema_name_p VARCHAR, table_name_p VARCHAR) RETURNS VARCHAR
AS $$
SELECT '0000' || lpad(to_hex(d.oid::int), 4, '0') || '00003000800000000000' || lpad(to_hex(c.oid::int), 4, '0') tableid
  FROM pg_class c, pg_namespace n, pg_database d
 WHERE n.nspname = $1
   AND c.relname = $2
   AND c.relnamespace = n.oid
   AND d.datname=current_database();
$$ LANGUAGE SQL;
				
			

Next, we need a function to convert octal string values into integer values—we’ll see why later. Here’s that function, which I’ve named octal_string_to_int:

				
					CREATE OR REPLACE FUNCTION octal_string_to_int(octal_str TEXT)
RETURNS INTEGER AS $$
DECLARE
    bytea_val BYTEA;
    int_val INTEGER;
BEGIN
    -- Special case: Convert any two-character string to its 16-bit numerical representation
    IF length(octal_str) = 2 THEN
        RETURN (ascii(substr(octal_str, 1, 1)) << 8) + ascii(substr(octal_str, 2, 1));
    END IF;

    -- Convert octal escape sequence string to BYTEA (PostgreSQL 11 compatible)
    bytea_val := decode(octal_str, 'escape');

    -- Ensure the bytea length is at least 2
    IF length(bytea_val) < 2 THEN
        RAISE EXCEPTION 'Input must contain at least two bytes';
    END IF;

    -- Convert BYTEA to integer (big-endian interpretation)
    int_val := (get_byte(bytea_val, 0) << 8) + get_byte(bytea_val, 1);

    RETURN int_val;
END;
$$ LANGUAGE plpgsql;
				
			

Alright, now for the exciting part!

Let’s create a table and add some data.

				
					[root@cloud-server-0 ~]# alias y
alias y='ysqlsh -h $(hostname -I)'

[root@cloud-server-0 ~]# y -c "CREATE TABLE test(c1 INT PRIMARY KEY);"
CREATE TABLE

[root@cloud-server-0 ~]# y -c "INSERT INTO test SELECT generate_series(1, 100000);"
INSERT 0 100000
				
			

By the way, our YugabyteDB cluster is a simple three-node, multi-region setup:

				
					[root@yugabytedb ~]# y -c "SELECT host, cloud, region, zone FROM yb_servers() ORDER BY host;"
    host   | cloud |    region    |     zone
-----------+-------+--------------+---------------
 127.0.0.1 | aws   | us-east-2    | us-east-2a
 127.0.0.2 | aws   | ca-central-1 | ca-central-1a
 127.0.0.3 | aws   | us-west-2    | us-west-2a
(3 rows)
				
			

Now, let’s create a table to store details about the tablets our table is sharded into.

				
					[root@yugabytedb ~]# y -c "CREATE TABLE tablet_leaders(tablet_id TEXT PRIMARY KEY, partition_start TEXT, partition_end TEXT, leader_ip TEXT);"
CREATE TABLE

[root@yugabytedb ~]# y -c "\d tablet_leaders;"
              Table "public.tablet_leaders"
     Column      | Type | Collation | Nullable | Default
-----------------+------+-----------+----------+---------
 tablet_id       | text |           | not null |
 partition_start | text |           |          |
 partition_end   | text |           |          |
 leader_ip       | text |           |          |
Indexes:
    "tablet_leaders_pkey" PRIMARY KEY, lsm (tablet_id HASH)
				
			

Next, we’ll use the yb-admin CLI list_tablets tool to retrieve the tablets for our table and copy the details into the tablet_leaders table. Notice in the command, we are using the get_table_id function to retrieve the table id for the test table.

				
					[root@yugabytedb ~]# yb-admin -init_master_addrs $(hostname -I | awk '{print $1}') list_tablets ysql.yugabyte tableid.`y -Atc "SELECT get_table_id('public', 'test');"` 0 | awk -F'\\t' -v OFS='|' 'NR>1{print $1,$2,$3 }' | sed -e 's/ *| */|/g' | sed 's/partition_key_start: //g' | sed 's/ partition_key_end: /|/g' | sed 's/:9100//g' | sed 's/[[:blank:]]*$//' | y -c "COPY tablet_leaders FROM STDIN DELIMITER '|' CSV;"
COPY 3

[root@yugabytedb ~]# y -c "SELECT * FROM tablet_leaders ORDER BY leader_ip;"
            tablet_id             | partition_start | partition_end | leader_ip
----------------------------------+-----------------+---------------+-----------
 b0591641064241d6af3f7c85178ef68c |                 | UU            | 127.0.0.1
 da605ea5c6dd454980dfdd6bba57e36b | UU              | \252\252      | 127.0.0.2
 57c673eb1cd04c7c91cd6736616fb243 | \252\252        |               | 127.0.0.3
(3 rows)
				
			

Note that the values in the partition_start and partition_end columns are in octal. We’ll need to convert them to integers so we can later use the yb_hash_code function to determine which tablet a row belongs to. This is where our octal_string_to_int function comes in.

We’re going to create a database view that performs the conversion, while also joining with the yb_servers() system view to display the cloud, region, and zone information for the tablet leader nodes.

Here’s the view DDL, which I named tablet_leaders_vw:

				
					CREATE OR REPLACE VIEW tablet_leaders_vw
AS
  SELECT tablet_id,
         CASE WHEN partition_start = '' THEN 0 ELSE octal_string_to_int(partition_start) END partition_start_int,
         CASE WHEN partition_end = '' THEN 65536 ELSE octal_string_to_int(partition_end) END partition_end_int,
         leader_ip,
         ys.cloud,
         ys.region,
         ys.zone
    FROM tablet_leaders tl
    JOIN yb_servers() ys
      ON ys.host = tl.leader_ip;
				
			

If we query the view, we can now see the partition start and end for each tablet displayed as integer values, along with the placement information for each tablet leader.

				
					[root@yugabytedb ~]# y -c "SELECT * FROM tablet_leaders_vw ORDER BY partition_start_int;"
            tablet_id             | partition_start_int | partition_end_int | leader_ip  | cloud |    region    |     zone
----------------------------------+---------------------+-------------------+------------+-------+--------------+---------------
 b0591641064241d6af3f7c85178ef68c |                   0 |             21845 | 127.0.0.1  | aws   | us-east-2    | us-east-2a
 da605ea5c6dd454980dfdd6bba57e36b |               21845 |             43690 | 127.0.0.2  | aws   | ca-central-1 | ca-central-1a
 57c673eb1cd04c7c91cd6736616fb243 |               43690 |             65536 | 127.0.0.3  | aws   | us-west-2    | us-west-2a
(3 rows)
				
			
Finally, it’s easy to determine which tablet is the leader for a specific partition key (i.e. the c1 column) value in the test table.
A few examples, where c1 = 1, 354621 and 910001:
				
					[root@yugabytedb ~]# y -c "SELECT * FROM tablet_leaders_vw WHERE yb_hash_code(1) >= partition_start_int AND yb_hash_code(1)< partition_end_int;"
            tablet_id             | partition_start_int | partition_end_int | leader_ip | cloud |  region   |    zone
----------------------------------+---------------------+-------------------+-----------+-------+-----------+------------
 b0591641064241d6af3f7c85178ef68c |                   0 |             21845 | 127.0.0.1 | aws   | us-east-2 | us-east-2a
(1 row)

[root@yugabytedb ~]# y -c "SELECT * FROM tablet_leaders_vw WHERE yb_hash_code(354621) >= partition_start_int AND yb_hash_code(354621)< partition_end_int;"
            tablet_id             | partition_start_int | partition_end_int | leader_ip | cloud |    region    |     zone

----------------------------------+---------------------+-------------------+-----------+-------+--------------+---------------
 da605ea5c6dd454980dfdd6bba57e36b |               21845 |             43690 | 127.0.0.2 | aws   | ca-central-1 | ca-central-1a
(1 row)

[root@yugabytedb ~]# y -c "SELECT * FROM tablet_leaders_vw WHERE yb_hash_code(910001) >= partition_start_int AND yb_hash_code(910001)< partition_end_int;"
            tablet_id             | partition_start_int | partition_end_int | leader_ip | cloud |  region   |    zone
----------------------------------+---------------------+-------------------+-----------+-------+-----------+------------
 57c673eb1cd04c7c91cd6736616fb243 |               43690 |             65536 | 127.0.0.3 | aws   | us-west-2 | us-west-2a
(1 row)
				
			

Have Fun!

Maple, our daugther's "puppy", is now 30 pounds!