No Primary Key in YSQL? YugabyteDB Still Creates One

At first glance, YugabyteDB behaves just like PostgreSQL.

You can create a table without a primary key:

				
					CREATE TABLE no_pk(id INT);
				
			

No errors. No warnings. Totally valid.

Run \d no_pk, and you’ll see exactly what you expect… just a simple table with no primary key defined.

So it’s tempting to think:

  • 👉 “This is just a normal heap table.”

But under the hood, something very different is happening.

YugabyteDB is not heap-based. It uses a distributed, index-oriented storage engine (DocDB), where every row must have a key in order to be stored and distributed across the cluster.

So when you don’t define a primary key…

  • 👉 YugabyteDB quietly creates one for you.

And not just any key… a hidden, auto-generated column called ybrowid, which becomes the real primary key used for distribution.

This tip walks through a simple demo that proves it… all the way down to the raw storage layer… and shows how to trace that hidden key from SQL → DocDB → disk → back again.

🧪 Step 1: Create Tables (With and Without PK)

Create example tables:

				
					CREATE TABLE pk(id INT PRIMARY KEY);

CREATE TABLE no_pk(id INT);
				
			

Check the schemas:

				
					
yugabyte=# \d pk
                 Table "public.pk"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           | not null |
Indexes:
    "pk_pkey" PRIMARY KEY, lsm (id HASH)

yugabyte=# \d no_pk
               Table "public.no_pk"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |

				
			

At this point, no_pk looks like a normal PostgreSQL heap table…

…but that’s just the illusion.

🧪 Step 2: Insert Data

Insert some data into the tabe without a PK:

				
					INSERT INTO no_pk VALUES (1);
				
			

⚙️ Step 3: Flush to Disk

When you insert data into YugabyteDB, it does not go straight to disk.

Instead, it follows a write path similar to other LSM-based systems:

  • ● The write is recorded in memory (MemTable)
  • ● It is also written to the WAL (Write-Ahead Log) for durability
  • ● Eventually, it gets flushed to disk as an SSTable file

So when you run:

				
					yb-ts-cli --server_address=127.0.0.1:9100 flush_all_tablets
				
			

This forces YugabyteDB to:

  • 👉 take all in-memory data (MemTables)
    👉 write it out to disk as SST files (Sorted String Tables)
💡 Why this matters for this demo

Tools like sst_dump only read data that exists on disk.

If you don’t flush:

  • ● Your data is still in memory
  • ● No .sst files exist yet
  • sst_dump returns: from [] to []

🔬 Step 4: Inspect DocDB (The Truth Layer)

Gather the needed info:

				
					[root@localhost ~]# ysqlsh -h 127.0.0.1 -c "SHOW data_directory;"
     data_directory
------------------------
 /root/var/data/pg_data
(1 row)

[root@localhost ~]# ysqlsh -h 127.0.0.1 -c "SELECT table_id, tablet_id FROM yb_local_tablets WHERE table_name = 'no_pk';"
             table_id             |            tablet_id
----------------------------------+----------------------------------
 000034d4000030008000000000004135 | fe97bdc9e819427ba5711a79852b9ea8
(1 row)
				
			

Before running sst_dump:

				
					sst_dump \
  --command=scan \
  --file=/root/var/data/yb-data/tserver/data/rocksdb/table-<table_id>/tablet-<tablet_id>/ \
  --output_format=decoded_regulardb \
  --formatter_tablet_metadata=<YugabyteDB data directory></YugabyteDB>/tserver/tablet-meta/<tablet_id>
				
			

… it helps to understand what it actually is:

🧠 What is sst_dump?

sst_dump is a low-level utility that reads YugabyteDB’s on-disk storage files (SSTables).

● It allows you to:

  • ● Scan raw key-value data stored in DocDB
  • ● Decode document keys (DocKey) and values
  • ● Inspect how rows are physically stored on disk

In this tip, we use it to bypass the SQL layer entirely and see the actual primary key YugabyteDB is using.

Example:

				
					[root@localhost ~]# sst_dump \
  --command=scan \
  --file=/root/var/data/yb-data/tserver/data/rocksdb/table-000034d4000030008000000000004135/tablet-fe97bdc9e819427ba5711a79852b9ea8/ \
  --output_format=decoded_regulardb \
  --formatter_tablet_metadata=/root/var/data/yb-data/tserver/tablet-meta/fe97bdc9e819427ba5711a79852b9ea8
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0415 21:33:29.934546 1751267 kv_formatter.cc:35] Found info for table ID 000034d4000030008000000000004135 (namespace yugabyte, table_type PGSQL_TABLE_TYPE, name no_pk, cotable_id 00000000-0000-0000-0000-000000000000, colocation_id 0) in superblock
from [] to []
Process /root/var/data/yb-data/tserver/data/rocksdb/table-000034d4000030008000000000004135/tablet-fe97bdc9e819427ba5711a79852b9ea8//000010.sst
Sst file format: block-based
SubDocKey(DocKey(0xe8f7, ["\x9b\x1d_\x10*\x01D2\xbel\xe3\xdcwv\x07\x96"], []), [HT{ physical: 1776277623165668 }]) -> { 1: 1 }
				
			

Notice the table name in the first log line: no_pk. Because we created a YSQL table without a primary key, YugabyteDB quietly generated the structure needed to make it work at the storage layer.

Here is the line-by-line breakdown of the key-value pair sst_dump:

1. The Table Metadata

Found info for table ID ... (namespace yugabyte, table_type PGSQL_TABLE_TYPE, name no_pk...)

The utility successfully read the tablet metadata file you provided. It recognized that this is a PostgreSQL-compatible (YSQL) table named no_pk. Because it has this metadata, it knows how to decode the raw bytes into the human-readable format you see on the last line.

2. The Document Key (DocKey)

SubDocKey(DocKey(0xe8f7, ["\x9b\x1d_\x10*\x01D2\xbel\xe3\xdcwv\x07\x96"], []),

This is the most critical part. It represents how the row is physically identified and placed in DocDB.

  • 0xe8f7: This is the 16-bit Hash Code. YugabyteDB hashes the partition key to generate a number between 0x0000 and 0xFFFF to determine which tablet (shard) this row belongs to.

  • ["\x9b\x1d_\x10*\x01D2\xbel\xe3\xdcwv\x07\x96"]: This is the ybrowid! Because you did not define a primary key, YugabyteDB generated this 16-byte UUID, represented here as a raw byte string. It acts as the actual partition key.

  • []: This empty array represents the Range Key components. Since this table relies entirely on the auto-generated ybrowid hash, there are no range/clustering columns.

3. The Hybrid Time (HT)

[HT{ physical: 1776277623165668 }])

This is the Hybrid Time (HT) timestamp attached to this specific version of the row. YugabyteDB uses this for Multi-Version Concurrency Control (MVCC). It allows the database to keep track of different versions of the data so transactions can read a consistent snapshot of the database at a specific point in time without locking.

4. The Column Data (The Value)
-> { 1: 1 }

This is the actual payload stored in the row. It is represented as a dictionary mapping Column IDs to Values.

  • ● The first 1 is the Column ID. Instead of storing the column name (e.g., id or val), DocDB stores an integer to save disk space.

  • ● The second 1 is the Value you inserted.

So, if we ran CREATE TABLE no_pk (my_column INT); and then INSERT INTO no_pk VALUES (1);, this { 1: 1 } is exactly what that insert looks like on disk once it hits the SSTable!

🔍 Want to See the Hidden Column in Metadata?

Even though ybrowid is invisible to YSQL, it does exist in the physical schema stored by DocDB.

The tablet metadata is stored as a binary file, but you can still inspect parts of it using strings.

Try this:

				
					strings /root/var/data/yb-data/tserver/tablet-meta/<tablet_id> | grep -i -C 2 rowid
				
			

Example output:

				
					id
 000034d4000030008000000000004135
no_pk
ybrowid
public(
yugabyte`
				
			
🔍 What this shows

Even though the output isn’t pretty (it’s a binary protobuf under the hood), you can clearly see:

  • ybrowid exists in the physical schema
  • ● It’s stored alongside the table name (no_pk)
  • ● It’s part of the metadata DocDB uses to interpret rows

The metadata file is:

  • ● stored in a compact binary format
  • ● optimized for performance, not readability

So tools like strings just extract readable fragments.

👉 That’s why you see pieces like:

  • ● table ID
  • ● table name
  • ● column name (ybrowid)
  • ● namespace (public, yugabyte)

🚫 Can You Query ybrowid?

No:

				
					SELECT ybrowid, id FROM no_pk;
				
			

That results in an error:

				
					ERROR: column "ybrowid" does not exist
				
			

Because the PostgreSQL/YSQL layer has no awareness of it.

🔥 The Bridge: ybctid

💡 What is ybctid?

It is the serialized DocDB key exposed through YSQL.

It contains:

  • ● the hash
  • ● the hidden ybrowid
  • ● internal encoding bytes

We can query it directly:

				
					SELECT ybctid, id FROM no_pk;
				
			

Example output:

				
					\x47e8f7539b1d5f102a014432be6ce3dc7776079600002121 | 1
				
			

What you are seeing is the difference between raw escaped ASCII bytes (in sst_dump) and a pure hexadecimal representation (in YSQL’s ybctid).

In YSQL, the ybctid is the fully serialized, binary representation of the entire DocKey. It combines the hash code, the partition key (your UUID), and formatting bytes into one long hexadecimal string.

Here is exactly how that sst_dump string maps directly into the middle of your ybctid.

Translating the sst_dump String to Hex

The string "\x9b\x1d_\x10*\x01D2\xbel\xe3\xdcwv\x07\x96" is a mix of explicitly escaped hex bytes (like \x9b) and standard ASCII characters (like _ or D).

If we convert the standard ASCII characters to their hex equivalents, the exact match appears:

Original Char/Byte Character Type Hex Equivalent
\x9bEscaped Hex9b
\x1dEscaped Hex1d
_ASCII5f
\x10Escaped Hex10
*ASCII2a
\x01Escaped Hex01
DASCII44
2ASCII32
\xbeEscaped Hexbe
lASCII6c
\xe3Escaped Hexe3
\xdcEscaped Hexdc
wASCII77
vASCII76
\x07Escaped Hex07
\x96Escaped Hex96

When you put all those hex pairs together, you get your exact UUID:

  • 9b1d5f102a014432be6ce3dc77760796

🚨 Why This Matters

⚠️ Don’t ignore this

If you don’t define a primary key:

  • ● You lose control of distribution
  • ● You lose control of row identity
  • ● You can’t optimize access patterns
  • ● YugabyteDB makes those decisions for you

🎯 Final Takeaway

YSQL lets you create a table without a primary key.

YugabyteDB does not.

If you don’t define one, YugabyteDB silently creates ybrowid and uses it to:

  • ● uniquely identify each row
  • ● hash and distribute data across tablets
  • ● build the actual DocDB key used for storage

And as you just saw, you can trace that hidden key all the way from:

  • SQL → DocDB → disk → back via ybctid

There is no such thing as a table without a primary key in YugabyteDB… only tables where you didn’t choose it.

Have Fun!

Just trying to enjoy a peaceful family lunch in Dallas… meanwhile this Great-tailed Grackle is on the roof performing what can only be described as a full-volume dial-up modem remix concert. 🐦📢