Every YSQL table should have a primary key which identifies a unique row.
A primary key has 2 components:
The partition key should distribute the rows evenly to have a balanced cluster.
We can identify imbalances in partition key column value distribution by listing the distinct values with the highest row counts.
Example:
yugabyte=# SELECT host, cloud, region, zone
yugabyte-# FROM yb_servers() ORDER BY region;
host | cloud | region | zone
-------------+-------+-----------+---------------
10.36.2.84 | aws | us-east-1 | us-east-1b
10.36.1.106 | aws | us-east-1 | us-east-1a
10.36.1.208 | aws | us-east-1 | us-east-1a-rr
(3 rows)
yugabyte=# \d test.some_table;
Table "test.some_table"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
c1 | integer | | not null |
c2 | integer | | not null |
c3 | integer | | not null |
c4 | text | | |
c5 | integer | | |
Indexes:
"some_table_pkey" PRIMARY KEY, lsm ((c1, c2) HASH, c3 ASC)
yugabyte=# \timing
Timing is on.
yugabyte=# SELECT COUNT(1) FROM test.some_table;
count
-----------
104082529
(1 row)
Time: 12804.237 ms (00:12.804)
yugabyte=# SELECT c1, c2, COUNT(1) FROM test.some_table GROUP BY c1, c2 ORDER BY COUNT(1) DESC LIMIT 5;
c1 | c2 | count
--------+----+---------
100001 | 0 | 8934648
501 | 1 | 2003574
1003 | 3 | 1094443
1003 | 1 | 1093014
1003 | 2 | 1092551
(5 rows)
Time: 87539.299 ms (01:27.539)
In this example, the partition key of the TEST table is composed of the columns c1 and c2. Examining the row counts for the distinct partition key values reveals a significant skew for (c1, c2) = (100001, 0).
That’s valuable information, but it took about 90 seconds to obtain the result. How can we make this process faster?
With an “old school” bash script!
#!/bin/bash
shopt -s expand_aliases
set -e
# Script: count_partition_keys.sh
# Description: This script will count the rows from a given table grouped by the partition key columns
# of the table's primary key.
#
# The system view yb_local_tablets, if it exists, will be used to determine the partition ranges.
# (YugabyteDB versions 2024.1.x and beyond). If the view does not exist, then the script will try
# to access the YB Master UI to determine the partition ranges. If the YB Master UI is not accessable,
# a set of 16 partion ranges will be used.
# Created: 09/05/2024
# Last update: 09/05/2024
# Edits:
#
# Call Spec:
# ./count_partition_keys.sh < > <> <> <>
# Example Use:
# ./count_partition_keys.sh public.test 127.0.0.1 yugabyte yugabyte
# Important Notes:
# 1. This script is for YSQL only.
# 2. Table must have a HASH sharded Primary Key defined.
# 3. The table name should be in the format schema.table
# 4. The Database Node IP should preferreably also be running the YB Master process
# For Mac, need to use gnu-sed
# Install with: brew install gnu-sed
# Uncomment the following lines after installing gnu-sed:
# OLD_PATH=$PATH
# PATH="/usr/local/opt/gnu-sed/libexec/gnubin:$PATH"
# Variables
TOPRC=5 # Limit output to top x rows
TMP_OUT_FILE="/tmp/tc.txt" # Temp file to write row counts
RCS=64 # Break up COUNT(*) queries into x pieces
# Verify input parameters
# Check if table name was provided
# Table name must be in the format: schema_name.table_name
if [ -z "$1" ]
then
echo "Please provide a table name in the format: schema_name.table_name"
exit 0
else
YB_SCHEMA_NAME=$(echo $1 | cut -f1 -d.)
YB_TABLE_NAME=$(echo $1 | cut -f2 -d.)
fi
# Check if IP or hostname was provided
if [ -z "$2" ]
then
echo "Please provide an IP or hostname for a YB Database Node (preferrably a YB Master node)"
exit 0
else
YB_IP=$2
fi
# Check if table database name was provided
if [ -z "$3" ]
then
echo "Please provide a database name"
exit 0
else
YB_DATABASE=$3
fi
# Check if database user name was provided
if [ -z "$4" ]
then
echo "Please provide a database username"
exit 0
else
YB_USER=$4
fi
# Set up the YSQLSH alias below per your environment
YB_PATH='/root/yugabyte-2024.1.1.0/bin/ysqlsh'
alias y="$YB_PATH -h $YB_IP -U $YB_USER -d $YB_DATABASE"
alias y1="$YB_PATH -U $YB_USER -d $YB_DATABASE"
# Check if table exists
tc=`y -Atc "SELECT COUNT(*) FROM pg_tables WHERE schemaname = '$YB_SCHEMA_NAME' AND tablename = '$YB_TABLE_NAME';" | awk '{ print $1'}`
if [ -z "$tc" ]
then
echo "Database connection failed."
exit 1
fi
if [ $tc -gt 0 ]
then
# Get the PK HASH columns for the table.
HPC=`y -Atc "WITH q AS (SELECT SUBSTR(pg_catalog.pg_get_indexdef(i.indexrelid, 0, true), POSITION('lsm' IN pg_catalog.pg_get_indexdef(i.indexrelid, 0, true))+4) x FROM pg_catalog.pg_index i JOIN pg_catalog.pg_class c ON c.oid = i.indrelid JOIN pg_catalog.pg_class c2 ON c2.oid = i.indexrelid JOIN pg_catalog.pg_class class ON c.oid = class.oid JOIN pg_catalog.pg_namespace n ON n.oid = class.relnamespace JOIN pg_catalog.pg_constraint con ON (conrelid = i.indrelid AND conindid = i.indexrelid AND contype IN ('p')) WHERE n.nspname = '$YB_SCHEMA_NAME' AND class.relname = '$YB_TABLE_NAME' AND (SUBSTR(pg_catalog.pg_get_indexdef(i.indexrelid, 0, true), POSITION('lsm' IN pg_catalog.pg_get_indexdef(i.indexrelid, 0, true))+4) ILIKE '%HASH,%' OR SUBSTR(pg_catalog.pg_get_indexdef(i.indexrelid, 0, true), POSITION('lsm' IN pg_catalog.pg_get_indexdef(i.indexrelid, 0, true))+4) ILIKE '%HASH\)%') ORDER BY i.indisprimary DESC, i.indisunique DESC, c2.relname) SELECT REPLACE(REPLACE(x_1, '(', ''), ')', '') FROM (SELECT SUBSTR(x, 1, CASE WHEN POSITION('HASH,' IN x)-1 <= 1 THEN POSITION('HASH)' IN x)-1 ELSE POSITION('HASH,' IN x)-1 END) || ')' x_1 FROM q) foo ORDER BY LENGTH(x_1) DESC LIMIT 1;"`
if [ -z "$HPC" ]
then
echo "The table \"$1\" does not have a HASH sharded primary key defined."
exit 1
fi
START=0
INC=$((65536/$RCS))
END=$((65536-$INC))
# Remove temp file
rm -fr $TMP_OUT_FILE
# Get the list of T-Servers
HOSTS=`y -Atc "SELECT string_agg(host, '|') host FROM yb_servers();"`
IFS="|" read -a HOSTS_A <<< $HOSTS
echo $HOSTS
for COUNTER in $(eval echo "{$START..$END..$INC}")
do
RAND=$[$RANDOM % ${#HOSTS_A[@]}]
y1 -h ${HOSTS_A[$RAND]} -c "SET default_transaction_read_only = TRUE;" -c "SET yb_read_from_followers = TRUE;" -Atc "SELECT $HPC, COUNT(1) FROM $YB_SCHEMA_NAME.$YB_TABLE_NAME WHERE yb_hash_code($HPC) BETWEEN $COUNTER AND $(($COUNTER+$INC-1)) GROUP BY $HPC;" >> $TMP_OUT_FILE &
echo "SELECT $HPC, COUNT(1) FROM $YB_SCHEMA_NAME.$YB_TABLE_NAME WHERE yb_hash_code($HPC) BETWEEN $COUNTER AND $(($COUNTER+$INC-1)) GROUP BY $HPC;"
done
wait
# Display top TOPRC row count
sort -r -n -t '|' -k2 $TMP_OUT_FILE | head -$TOPRC | column -t -s '|' --table-columns PARTITION,COUNT
# Clean up temp file
rm -fr $TMP_OUT_FILE
else
echo "Table \"$1\" does not exist."
fi
exit 0
Example:
[yugabyte@ip-10-36-2-84 ~]$ time ./count_partition_keys.sh test.some_table 10.36.2.84 yugabyte yugabyte
Partition Key Columns: c1, c2
PARTITION COUNT
100001,0 8934648
501,1 2003574
1003,3 1094443
1003,1 1093014
1003,2 1092551
real 0m10.540s
user 0m0.228s
sys 0m0.403s
Reduction in run time using the bash script:
yugabyte=# SELECT ((85269.148 - (10.540*1000)) / 85269.148 * 100)::NUMERIC(5, 2) "% Reduction in Run Time";
% Reduction in Run Time
-------------------------
87.64
(1 row)
Have Fun!