Be Careful with the NOT IN Operator when NULLs are Present

In YSQL we can use the IN operator in the WHERE clause to check if a value matches any value in a list of values.

The “list of values” can be the result set of a query.

Example:

				
					yugabyte=# CREATE TABLE test (c1 INT);
CREATE TABLE

yugabyte=# INSERT INTO test VALUES (1), (2), (3);
INSERT 0 3

yugabyte=# SELECT * FROM test ORDER BY c1;
 c1
----
  1
  2
  3
(3 rows)

yugabyte=# SELECT c2 FROM (SELECT 1 c2) foo WHERE c2 IN (SELECT * FROM test);
 c2
----
  1
(1 row)
				
			
Conversely, we can use the NOT IN operator to check if a value does not match any of the values in a list of values.

Example:

				
					yugabyte=# SELECT c2 FROM (SELECT 4 c2) foo WHERE c2 NOT IN (SELECT * FROM test);
 c2
----
  4
(1 row)
				
			

Unexpected behavior ensues when a NULL is introduced into the “list of values”.  

				
					yugabyte=# INSERT INTO test VALUES (NULL);
INSERT 0 1

yugabyte=# \pset null (null)
Null display is "(null)".

yugabyte=# SELECT * FROM test ORDER BY c1;
   c1
--------
      1
      2
      3
 (null)
(4 rows)
				
			
The IN operator works just as it did before…
				
					yugabyte=# SELECT c2 FROM (SELECT 1 c2) foo WHERE c2 IN (SELECT * FROM test);
 c2
----
  1
(1 row)
				
			
But what about the NOT IN operator?
				
					yugabyte=# SELECT c2 FROM (SELECT 4 c2) foo WHERE c2 NOT IN (SELECT * FROM test);
 c2
----
(0 rows)
				
			

Uh oh, no rows are returned! That’s unexpected…

As a work around, convert your SQL to use NOT EXISTS .
				
					yugabyte=# SELECT c2 FROM (SELECT 4 c2) foo WHERE NOT EXISTS (SELECT * FROM test WHERE test.c1 = c2);
 c2
----
  4
(1 row)
				
			

Have Fun!

Friendly Neighbor