Identify Mismatched Column Data Types for Foreign Key Columns

In PostgreSQL-compatible databases like YugabyteDB, it’s easy to assume that foreign key relationships “just work.” But one subtle pitfall that can silently affect application behavior is a mismatch in data types between a foreign key column and the column it references.

These mismatches often go unnoticed—until they break something.

Batching, Pushdowns, and Joins May Fail Silently

YugabyteDB, being a distributed SQL database, relies on optimized query planning, batching, and execution across nodes. When data types don’t align perfectly:

Batching is disabled in certain execution paths, hurting throughput.
Foreign key enforcement may require runtime casts, adding CPU overhead.
Joins become inefficient or fail to use indexes, leading to full scans.
Cross-table operations become brittle, especially under high concurrency or scale.

For example, referencing a bigint primary key with an integer foreign key may work syntactically—but it will subtly prevent batching or fast index access during inserts or updates.

Best Practice: Match Column Types Exactly

When defining a foreign key:

Use the same base type (bigint vs integer matters).
Match modifiers like length (varchar(255) vs text) and collation.
For composite keys, ensure every column in the key matches precisely.

You can use the following SQL query to detect type mismatches across your schema, including multi-column (composite) foreign keys. I encapsulated the SQL in a view for easy resuse:

				
					CREATE OR REPLACE VIEW fk_columns_datatype_mistmatch_vw AS
WITH fk_columns AS (
    SELECT
        con.oid AS constraint_oid,
        con.conname,
        con.conrelid,
        con.confrelid,
        generate_subscripts(con.conkey, 1) AS ordinality,
        con.conkey AS referencing_cols,
        con.confkey AS referenced_cols
    FROM pg_constraint con
    WHERE con.contype = 'f'
),
referencing AS (
    SELECT
        fk.constraint_oid,
        fk.conrelid,
        att.attname AS column_name,
        att.atttypid AS type_oid,
        att.atttypmod AS typmod,
        fk.ordinality
    FROM fk_columns fk
    JOIN pg_attribute att
      ON att.attrelid = fk.conrelid
     AND att.attnum = fk.referencing_cols[fk.ordinality]
),
referenced AS (
    SELECT
        fk.constraint_oid,
        fk.confrelid,
        att.attname AS column_name,
        att.atttypid AS type_oid,
        att.atttypmod AS typmod,
        fk.ordinality
    FROM fk_columns fk
    JOIN pg_attribute att
      ON att.attrelid = fk.confrelid
     AND att.attnum = fk.referenced_cols[fk.ordinality]
),
combined AS (
    SELECT
        c.conname AS constraint_name,
        r.conrelid::regclass AS referencing_table,
        r.column_name AS referencing_column,
        r.type_oid::regtype AS referencing_type,
        d.confrelid::regclass AS referenced_table,
        d.column_name AS referenced_column,
        d.type_oid::regtype AS referenced_type,
        r.typmod != d.typmod OR r.type_oid != d.type_oid AS type_mismatch
    FROM referencing r
    JOIN referenced d
      ON r.constraint_oid = d.constraint_oid
     AND r.ordinality = d.ordinality
    JOIN pg_constraint c ON c.oid = r.constraint_oid
)
SELECT *
FROM combined
WHERE type_mismatch;

Example:

				
					yugabyte=# CREATE TABLE users (
yugabyte(#     id BIGINT PRIMARY KEY,
yugabyte(#     name TEXT
yugabyte(# );
CREATE TABLE

yugabyte=# CREATE TABLE orders (
yugabyte(#     id SERIAL PRIMARY KEY,
yugabyte(#     user_id INTEGER,  -- ❌ Mismatch: should be BIGINT
yugabyte(#     amount NUMERIC,
yugabyte(#     FOREIGN KEY (user_id) REFERENCES users(id)
yugabyte(# );
CREATE TABLE

yugabyte=# SELECT * FROM fk_columns_datatype_mistmatch_vw;
   constraint_name   | referencing_table | referencing_column | referencing_type | referenced_table | referenced_column | referenced_type | type_mismatch
---------------------+-------------------+--------------------+------------------+------------------+-------------------+-----------------+---------------
 orders_user_id_fkey | orders            | user_id            | integer          | users            | id                | bigint          | t
(1 row)

Why It Matters More in YugabyteDB

In PostgreSQL, type mismatches might only lead to slight performance issues. In YugabyteDB, they can disable entire classes of optimizations and lead to inefficient distributed queries. Small schema mismatches can become scaling bottlenecks in production—especially when your application spans multiple nodes or regions.

Have Fun!

While hiking five miles along the Mill Prong Trail in Shenandoah National Park—on the way to Rapidan Camp, President Herbert Hoover’s summer retreat—we came across this beautiful little waterfall!