We learned in a previous YugabyteDB Tip that we can Consolidate Data From YB Catalog Tables From All Nodes To One Node.
That solution required a lot of tedious setup using foreign data wrappers. In today’s tip, we explore how a shell script can be used to achieve the same result.
Here is the script:
#!/bin/bash
# Script: yb_cluster_query.sh
# Description: This script will execute provided query on all nodes in a cluster and
# dump the output to a file.
# Created: 04/11/2025
# Last update: 04/11/2025
# Call Spec:
# ./yb_cluster_query.sh <> <> <> <> <
Below is a brief demonstration of the script in use…
On a three node cluster I’ll create a user named lucy and a database named lucy_db, assigning ownership of the database to the new user:
yugabyte=# SELECT host, cloud, region, zone FROM yb_servers() ORDER BY host;
host | cloud | region | zone
-----------+-------+-----------+----------------
127.0.0.1 | aws | us-east-1 | us-east-1-zone
127.0.0.2 | aws | us-east-2 | us-east-2-zone
127.0.0.3 | aws | us-west-2 | us-west-2-zone
(3 rows)
yugabyte=# CREATE USER lucy;
CREATE ROLE
yugabyte=# CREATE DATABASE lucy_db WITH OWNER lucy;
CREATE DATABASE
Next, we’ll have the new user do some work on nodes 127.0.0.1 and 127.0.0.2:
[root@cloud-server-0 ~]# ysqlsh -h 127.0.0.1 -U lucy -d lucy_db -c "CREATE TABLE tab (id INT PRIMARY KEY);"
CREATE TABLE
[root@cloud-server-0 ~]# ysqlsh -h 127.0.0.1 -U lucy -d lucy_db -c "INSERT INTO tab SELECT generate_series(1, 100000);"
INSERT 0 100000
[root@cloud-server-0 ~]# ysqlsh -h 127.0.0.1 -U lucy -d lucy_db -c "SELECT COUNT(*) FROM tab;"
count
--------
100000
(1 row)
[root@cloud-server-0 ~]# ysqlsh -h 127.0.0.2 -U lucy -d lucy_db -c "SELECT COUNT(*) FROM tab WHERE id BETWEEN 10000 AND 100000;"
count
-------
90001
(1 row)
I want to check the pg_stat_statements
statistics for this user across the cluster. While this can be done through YugabyteDB Anywhere, it’s also easy to accomplish using the new script!
pg_stat_statements
, pg_user
, and pg_database
.
[root@cloud-server-0 ~]# cat pg_stat_statements.sql
SELECT datname, LEFT(query, 25) query, calls, rows, mean_exec_time FROM pg_stat_statements ss JOIN pg_user u ON u.usesysid = userid JOIN pg_database d ON d.oid = dbid WHERE usename = 'lucy'
Now I’m ready to run the script, passing in all the required parameters, including my SQL file.
[root@cloud-server-0 ~]# ./yb_cluster_query.sh 127.0.0.1 lucy_db yugabyte pg_stat_statements.sql cluster_output.csv "|"
Warning: No output for 127.0.0.3
✅ Results written to cluster_output.csv
[root@cloud-server-0 ~]# cat cluster_output.csv
host|datname|query|calls|rows|mean_exec_time
127.0.0.1|lucy_db|CREATE TABLE tab (id INT |1|0|178.813274
127.0.0.1|lucy_db|INSERT INTO tab SELECT ge|1|100000|838.076101
127.0.0.1|lucy_db|SELECT COUNT(*) FROM tab|1|1|14.20426
127.0.0.2|lucy_db|SELECT COUNT(*) FROM tab |1|1|39.662449
Note: The initial output from the script includes a warning indicating that no results were returned from node 127.0.0.3. In this case, it simply means that the user Lucy hasn’t executed any SQL statements on that node.
For easy analysis, I can load the data from the file into a YugbayteDB temp table:
yugabyte=# CREATE TEMP TABLE pg_stat_statements_lucy(host TEXT, datname TEXT, query TEXT, calls
CREATE TABLE
yugabyte=# \COPY pg_stat_statements_lucy FROM 'cluster_output.csv' WITH (FORMAT 'csv', HEADER TRUE, DELIMITER '|');
WARNING: batched COPY is not supported on temporary tables
DETAIL: Defaulting to using one transaction for the entire copy.
HINT: Either copy onto non-temporary table or set rows_per_transaction option to `0` to disable batching and remove this warning.
COPY 4
yugabyte=# SELECT * FROM pg_stat_statements_lucy;
host | datname | query | calls | rows | mean_exec_time
-----------+---------+---------------------------+-------+--------+----------------
127.0.0.1 | lucy_db | CREATE TABLE tab (id INT | 1 | 0 | 178.813274
127.0.0.1 | lucy_db | INSERT INTO tab SELECT ge | 1 | 100000 | 838.076101
127.0.0.1 | lucy_db | SELECT COUNT(*) FROM tab | 1 | 1 | 14.20426
127.0.0.2 | lucy_db | SELECT COUNT(*) FROM tab | 1 | 1 | 39.662449
(4 rows)
yugabyte=# SELECT * FROM pg_stat_statements_lucy WHERE mean_exec_time > 500;
host | datname | query | calls | rows | mean_exec_time
-----------+---------+---------------------------+-------+--------+----------------
127.0.0.1 | lucy_db | INSERT INTO tab SELECT ge | 1 | 100000 | 838.076101
(1 row)
Have Fun!