Using an Index with LIKE Operator Predicates in YSQL

The YSQL LIKE operator is used to match text values against a pattern using wildcards. If the search expression can be matched to the pattern expression, the LIKE operator will return true, which is 1. The percent sign represents zero, one, or multiple numbers or characters.

Example:

				
					yugabyte=# CREATE TABLE test as SELECT upper(substr(md5(random()::text), 0, 3))::text as c1 FROM generate_series(1, 100000);
SELECT 100000

yugabyte=# EXPLAIN (COSTS OFF) SELECT c1 FROM test WHERE c1 LIKE 'A';
             QUERY PLAN
------------------------------------
 Seq Scan on test
   Remote Filter: (c1 ~~ 'A'::text)
(2 rows)

yugabyte=# EXPLAIN (COSTS OFF) SELECT c1 FROM test WHERE c1 LIKE 'A%';
             QUERY PLAN
-------------------------------------
 Seq Scan on test
   Remote Filter: (c1 ~~ 'A%'::text)
(2 rows)

Note that in the first explain plan above, I did not use a wildcard which results in an equality comparison.

Example:

				
					yugabyte=# SELECT 'A' LIKE 'A' "alike?";
 alike?
--------
 t
(1 row)

yugabyte=# SELECT 'A' LIKE 'AA' "alike?";
 alike?
--------
 f
(1 row)

yugabyte=# SELECT 'A' = 'A' "alike?";
 alike?
--------
 t
(1 row)

Both of the explain plans in the earlier example show a full (sequential) scan on the TEST table. We should be able to improve query performace with a secondary index on the C1 column.

				
					yugabyte=# CREATE INDEX test_idx ON test(c1);
CREATE INDEX

yugabyte=# EXPLAIN (COSTS OFF) SELECT c1 FROM test WHERE c1 LIKE 'A';
             QUERY PLAN
------------------------------------
 Seq Scan on test
   Remote Filter: (c1 ~~ 'A'::text)
(2 rows)

yugabyte=# EXPLAIN (COSTS OFF) SELECT c1 FROM test WHERE c1 LIKE 'A%';
             QUERY PLAN
-------------------------------------
 Seq Scan on test
   Remote Filter: (c1 ~~ 'A%'::text)
(2 rows)

Wait a second – the index wasn’t used?!?! Why?

This is because, by default, YugabyteDB is using hash partitioning for the index.

				
					yugabyte=# \d test;
              Table "public.test"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
 c1     | text |           |          |
Indexes:
    "test_idx" lsm (c1 HASH)

For the query optimizer to chose an index for a LIKE operator, the index will need to use range partitioning instead.

				
					yugabyte=# CREATE INDEX test_idx_asc ON test(c1 ASC);
CREATE INDEX

yugabyte=# \d test;
              Table "public.test"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
 c1     | text |           |          |
Indexes:
    "test_idx" lsm (c1 HASH)
    "test_idx_asc" lsm (c1 ASC)

yugabyte=# EXPLAIN (COSTS OFF) SELECT c1 FROM test WHERE c1 LIKE 'A';
                 QUERY PLAN
--------------------------------------------
 Index Only Scan using test_idx_asc on test
   Index Cond: (c1 = 'A'::text)
   Remote Filter: (c1 ~~ 'A'::text)
(3 rows)

The range partitioned index is also utilized for a LIKE operator that has a trailing wildcard!

				
					yugabyte=# EXPLAIN (COSTS OFF) SELECT c1 FROM test WHERE c1 LIKE 'A%';
                       QUERY PLAN
--------------------------------------------------------
 Index Only Scan using test_idx_asc on test
   Index Cond: ((c1 >= 'A'::text) AND (c1 < 'B'::text))
   Remote Filter: (c1 ~~ 'A%'::text)
(3 rows)

To find out why YugabyteDB supports both hash and range partitioning, and the pros and cons of each, check out the awesome blog Four Data Sharding Strategies We Analyzed in Building a Distributed SQL Database.

Have Fun!