Search a LIST Collection Data Type in YCQL (By Using a MAP Instead)

YCQL supports collection data types to specify columns for data objects that can contain more than one value.

One such type is a LIST which is an ordered collection of elements of the same primitive type.

We learned in a previous tip that we cannot search for a value in a LIST using the Cassandra CONTAINS operator in YugabyteDB.

This feature is coming to YugabyteDB and is being tracked here.

In the mean time, another work around is to use a MAP instead of a LIST.

MAP is an sorted collection of pairs of elements, a key and a value. The sorting order is based on the key values and is implementation-dependent. 

The trick is to include keys in the MAP that represent a distinct list of items from the original LIST and have the value for each MAP key be set to TRUE.  

Example:

I have the table below which has both a LIST and a MAP for demonstration purposes.

				
					yugabyte@ycqlsh> DESC TABLE ks.letters;

CREATE TABLE ks.letters (
    pk int PRIMARY KEY,
    letters_list list<text>,
    letters_map map<text, boolean>
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'true'};

yugabyte@ycqlsh> SELECT * FROM ks.letters;

 pk | letters_list         | letters_map
----+----------------------+----------------------------------------------
  5 | ['Q', 'A', 'P', 'W'] | {'A': True, 'P': True, 'Q': True, 'W': True}
  1 |      ['A', 'B', 'C'] |            {'A': True, 'B': True, 'C': True}
  4 |      ['X', 'Y', 'Z'] |            {'X': True, 'Y': True, 'Z': True}
  2 |           ['B', 'C'] |                       {'B': True, 'C': True}
  3 |           ['D', 'A'] |                       {'A': True, 'D': True}

(5 rows)
				
			

Now I want to see all rows where the LIST contains the letter ‘A’.

We know that we cannot use the CONTAINS operator on the LIST column, but we can search the MAP column for the letter ‘A’ like this:

				
					yugabyte@ycqlsh> SELECT * FROM ks.letters WHERE letters_map['A'] = TRUE;

 pk | letters_list         | letters_map
----+----------------------+----------------------------------------------
  5 | ['Q', 'A', 'P', 'W'] | {'A': True, 'P': True, 'Q': True, 'W': True}
  1 |      ['A', 'B', 'C'] |            {'A': True, 'B': True, 'C': True}
  3 |           ['D', 'A'] |                       {'A': True, 'D': True}

(3 rows)
				
			

Note that when using the MAP data type, the keys are sorted alphabetically. If you are relying on the order of the values in the original LIST, you may need to keep both columns in the table to handle that use case in addition to the search use case. 

Have Fun!

Bad Day on the Bay