When joining two tables a column that is part of the partition key, it’s important that the columns have the same data type.
Otherwise there is a chance the the data from one table might reside on a different node than the other table.
Example:
yugabyte=# CREATE TABLE a (c INT PRIMARY KEY);
CREATE TABLE
yugabyte=# INSERT INTO a SELECT 1;
INSERT 0 1
yugabyte=# CREATE TABLE b ( c NUMERIC(10, 0) PRIMARY KEY);
CREATE TABLE
yugabyte=# INSERT INTO b SELECT 1;
INSERT 0 1
yugabyte=# SELECT COUNT(*) FROM a JOIN b ON b.c = a.c; -- This JOIN is okay...
count
-------
1
(1 row)
yugabyte=# SELECT COUNT(*) FROM a JOIN b ON YB_HASH_CODE(b.c) = YB_HASH_CODE(a.c); -- But the data is on different nodes!
count
-------
0
(1 row)
This scenario could result in higher query latency dependning on the network between the nodes.
It’s best practice to join tables on columns that have the same data type – especially in a distributed SQL database like YugabyteDB!