GENERATE A RANDOM STRING (FASTER METHOD)

In the previous tip GENERATE A RANDOM STRING, we learned how to create a function to do just that.

In that version of the function we used the PLPGSQL language.

In YugabyteDB, sometimes we can get better performance if a function is called many times if it’s defined using the SQL language instead.

Let’s see why.

First, here is the DDL for the new function which uses the SQL language to generate a string:

				
					CREATE OR REPLACE FUNCTION randomstring(INT)
  RETURNS VARCHAR AS
$$
  SELECT string_agg(substring('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', ((RANDOM() * 51) + 1)::INT, 1), '')
    FROM generate_series(1, $1);
$$ language sql;

Example:

				
					yugabyte=# CREATE OR REPLACE FUNCTION randomstring(INT)
yugabyte-#   RETURNS VARCHAR AS
yugabyte-# $$
yugabyte$#   SELECT string_agg(substring('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', ((RANDOM() * 51) + 1)::INT, 1), '')
yugabyte$#     FROM generate_series(1, $1);
yugabyte$# $$ language sql;
CREATE FUNCTION

yugabyte=# SELECT randomstring(10) rand_string;
 rand_string
-------------
 BUWkbkHQCh
(1 row)

To figure out why one version is “better” than the other, let’s create two versions of the function, one using PLPGSQL and the other SQL.

				
					yugabyte=# CREATE OR REPLACE FUNCTION randomstring_plpgsql(INT)
yugabyte-#   RETURNS VARCHAR AS
yugabyte-# $$
yugabyte$# DECLARE
yugabyte$#   passedValue INT:=$1;
yugabyte$#   rstring VARCHAR:='';
yugabyte$# BEGIN
yugabyte$#   SELECT string_agg(substring('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', ((RANDOM() * 51) + 1)::INT, 1), '') INTO rstring
yugabyte$#     FROM generate_series(1, passedValue);
yugabyte$#   RETURN rstring;
yugabyte$# END;
yugabyte$# $$ language 'plpgsql';
CREATE FUNCTION

yugabyte=# CREATE OR REPLACE FUNCTION randomstring_sql(INT)
yugabyte-#   RETURNS VARCHAR AS
yugabyte-# $$
yugabyte$#   SELECT string_agg(substring('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz', ((RANDOM() * 51) + 1)::INT, 1), '')
yugabyte$#     FROM generate_series(1, $1);
yugabyte$# $$ language sql;
CREATE FUNCTION

Next, we can look at the EXPLAIN PLAN for each.

				
					yugabyte=# EXPLAIN (ANALYZE, DIST) SELECT randomstring_plpgsql(10) rand_string;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Result  (cost=0.00..0.26 rows=1 width=32) (actual time=4.938..4.938 rows=1 loops=1)
 Planning Time: 0.052 ms
 Execution Time: 6.019 ms
 Storage Read Requests: 0
 Storage Write Requests: 0.000
 Catalog Read Requests: 11
 Catalog Read Execution Time: 13.263 ms
 Catalog Write Requests: 0.000
 Storage Flush Requests: 0
 Storage Execution Time: 13.263 ms
 Peak Memory Usage: 85 kB
(11 rows)

yugabyte=# EXPLAIN (ANALYZE, DIST) SELECT randomstring_sql(10) rand_string;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Result  (cost=0.00..0.26 rows=1 width=32) (actual time=0.158..0.159 rows=1 loops=1)
 Planning Time: 0.117 ms
 Execution Time: 0.180 ms
 Storage Read Requests: 0
 Storage Write Requests: 0.000
 Catalog Read Requests: 2
 Catalog Read Execution Time: 2.157 ms
 Catalog Write Requests: 0.000
 Storage Flush Requests: 0
 Storage Execution Time: 2.157 ms
 Peak Memory Usage: 44 kB
(11 rows)

Notice that the number of catalog read request and the peak memory used for the PGPLSQL version is higher than the SQL version.

Although the differneces are small for a single call, they’ll add up if called a million times!

				
					yugabyte=# \o /dev/null

yugabyte=# \timing on
Timing is on.

yugabyte=# SELECT randomstring_plpgsql(10) rand_string FROM generate_series(1,1000000);
Time: 24683.481 ms (00:24.683)

yugabyte=# SELECT randomstring_sql(10) rand_string FROM generate_series(1,1000000);
Time: 20570.437 ms (00:20.570)

Have Fun!