Create a Secondary Index on a Collection in YCQL

YCQL supports collection data types to specify columns for data objects that can contain more than one value. These include the following data types:

  • LIST: An ordered collection of elements of the same primitive type.
  • MAP: A sorted collection of pairs of elements, a key and a value. The sorting order is based on the key values and is implementation-dependent.
  • SET: A sorted collection of elements. The sorting order is implementation-dependent. 

The use of indexes can enhance database performance by enabling the database server to find rows faster.

Can we create in index on a column that has a collection data type?

				
					yugabyte@ycqlsh> CREATE KEYSPACE ks;

yugabyte@ycqlsh> CREATE TABLE ks.t (pk INT PRIMARY KEY, letters LIST<TEXT>) WITH transactions = {'enabled':'true'};

yugabyte@ycqlsh> INSERT INTO ks.t (pk, letters) VALUES (1, ['A', 'B', 'C']);

yugabyte@ycqlsh> INSERT INTO ks.t (pk, letters) VALUES (2, ['A', 'C']);

yugabyte@ycqlsh> INSERT INTO ks.t (pk, letters) VALUES (3, ['D', 'E']);

yugabyte@ycqlsh> SELECT * FROM ks.t;

 pk | letters
----+-----------------
  1 | ['A', 'B', 'C']
  2 |      ['A', 'C']
  3 |      ['D', 'E']

(3 rows)

yugabyte@ycqlsh> CREATE INDEX t_letters_idx ON ks.t(letters);
InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid Primary Key Column Datatype
CREATE INDEX t_letters_idx ON ks.t(letters);
                                   ^^^^^^^
 (ql error -206)"
				
			

We can’t… unless we use a FROZEN data type!

Example:

				
					yugabyte@ycqlsh> ALTER TABLE ks.t ADD letters_frozen FROZEN<LIST<TEXT>>;

yugabyte@ycqlsh> UPDATE ks.t SET letters_frozen = ['A', 'B', 'C'] WHERE pk = 1;

yugabyte@ycqlsh> UPDATE ks.t SET letters_frozen = ['A', 'C'] WHERE pk = 2;

yugabyte@ycqlsh> UPDATE ks.t SET letters_frozen = ['D', 'E'] WHERE pk = 3;
yugabyte@ycqlsh> SELECT * FROM ks.t;

 pk | letters         | letters_frozen
----+-----------------+-----------------
  1 | ['A', 'B', 'C'] | ['A', 'B', 'C']
  2 |      ['A', 'C'] |      ['A', 'C']
  3 |      ['D', 'E'] |      ['D', 'E']

(3 rows)

yugabyte@ycqlsh> CREATE INDEX t_letters_idx ON ks.t(letters_frozen);

yugabyte@ycqlsh> EXPLAIN SELECT pk FROM ks.t WHERE letters_frozen = ['A', 'C'];

 QUERY PLAN
------------------------------------------------
 Index Only Scan using ks.t_letters_idx on ks.t
   Key Conditions: (letters_frozen = expr)
				
			

Have Fun!

I thought this was a "friendly" duck... I was wrong!