Pitfalls of Adding a PK to an Existing Table in YSQL

Although it’s possible to add a primary key constraint to an existing table with the ALTER TABLE ADD CONSTRAINT command, it’s best practice to add the PK to a table inline when creating it with the CREATE TABLE command.

This is especially true if the table contains data.

Why? Remember that YugabyteDB is a distributed SQL database and it wants to automatically shard a table into tablets based on the primary key. When you add a primary key to a table, YugabyteDB has to reshard the table. To do that, it will drop and recreate the table with new tablets sharded by the new PK columns.

Don’t worry – there will not be any loss of data, but it can potentially take a long time to reshard the table depending on the amount of data and the Universe’s deployment options (i.e. replication factor, number of nodes, type of nodes, location of nodes, etc.)

The other gotcha when adding a primary key constraint to an existing table is that views that reference the table will be dropped when the table is recreated.

YugabyteDB will recreate indexes, sequences, foreign key constraints and triggers during the table recreation process, but views aren’t handled. You’ll have to manually recreate the views yourself.

If you don’t have the DDL for the views available, you can capture the view definition from a system table prior to altering the referenced table.

Example:

				
					yugabyte=# CREATE TABLE some_table (c1 INT, c2 VARCHAR);
CREATE TABLE

yugabyte=# CREATE VIEW some_table_vw AS SELECT c1, UPPER(c2) c2 FROM some_table;
CREATE VIEW

yugabyte=# \d
             List of relations
 Schema |     Name      | Type  |  Owner
--------+---------------+-------+----------
 public | some_table    | table | yugabyte
 public | some_table_vw | view  | yugabyte
(2 rows)
				
			

I want to add a PK to the table, but first I’ll save the DDL for the view that references the table to a file.

				
					yugabyte=# \a
Output format is unaligned.

yugabyte=# \t
Tuples only is on.

yugabyte=# \o some_table_vw.ddl

yugabyte=# SELECT 'CREATE VIEW ' || schemaname || '.' || viewname || ' AS' || CHR(10) || definition FROM pg_views WHERE schemaname = 'public' AND viewname
= 'some_table_vw';

yugabyte=# \o

yugabyte=# \! cat some_table_vw.ddl
CREATE VIEW public.some_table_vw AS
 SELECT some_table.c1,
    upper((some_table.c2)::text) AS c2
   FROM some_table;
				
			

Now I will add a PK to the table, noting that the view gets dropped.

				
					yugabyte=# ALTER TABLE some_table ADD CONSTRAINT some_table_pk PRIMARY KEY(c1);
NOTICE:  drop cascades to view some_table_vw
ALTER TABLE

yugabyte=# \d
           List of relations
 Schema |    Name    | Type  |  Owner
--------+------------+-------+----------
 public | some_table | table | yugabyte
(1 row)
				
			

To recreate the view I just have to run the script I generated earlier.

				
					yugabyte=# \i some_table_vw.ddl
CREATE VIEW

yugabyte=# \d
             List of relations
 Schema |     Name      | Type  |  Owner
--------+---------------+-------+----------
 public | some_table    | table | yugabyte
 public | some_table_vw | view  | yugabyte
(2 rows)
				
			

Hopefully it is a little more clear why it’s best practice to add a PK when you create the table, not after.

Have Fun!