Create a Table and Load it with Available Prometheus Metric Names and Descriptions

Prometheus is a widely used standard for time-series monitoring of cloud-native infrastructure, and it uses that time-series data as the source for generating alerts.

Every node in a YugabyteDB universe exports detailed time-series metrics, available in both Prometheus exposition format and JSON for easy integration with Prometheus.

What metrics are available?

You can view YB-TServer metrics in Prometheus format directly in a browser or via the CLI with the following command:

				
					curl <node IP>:9000/prometheus-metrics

				
			

You can view YB-Master metrics in Prometheus format the same way, in a browser or via the CLI with the following command:

				
					curl <node IP>:7000/prometheus-metrics
				
			

We can store the metric names and their descriptions in a table for easy querying! 

Example:

				
					yugabyte=# CREATE TABLE available_prometheus_metrics (server TEXT, metric TEXT, description TEXT);
CREATE TABLE
				
			

Load the available YB-Master metric names and descriptions:

				
					yugabyte=# SELECT inet_server_addr();
 inet_server_addr
------------------
 ***.**.**.248
(1 row)

yugabyte=# \COPY available_prometheus_metrics FROM PROGRAM 'curl -s http://***.**.**.248:7000/prometheus-metrics | grep HELP | sed "s/# HELP //" | sed "s/ /|/1" | sed -e "s/^/MASTER|/" | uniq' DELIMITER '|';
COPY 2578
				
			

Load the YB-TServer metric names and descriptions:

				
					yugabyte=# \COPY available_prometheus_metrics FROM PROGRAM 'curl -s http://***.**.**.248:9000/prometheus-metrics | grep HELP | sed "s/# HELP //" | sed "s/ /|/1" | sed -e "s/^/TSERVER|/" | uniq' DELIMITER '|';
COPY 1894
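
To see what each stage of the pipeline inside the \COPY commands does, here is a minimal shell sketch that applies the same transformation to a small sample of exposition text. The function name help_to_rows is hypothetical, and the sample's label and timestamp values are made up for illustration; only the two metric names and their descriptions come from the query output later in this post. Real input would come from curl against a node's /prometheus-metrics endpoint.

```shell
#!/bin/sh
# Hypothetical helper: read Prometheus exposition text on stdin and emit
# SERVER|metric|description rows, mirroring the pipeline used with \COPY.
help_to_rows() {
  server="$1"
  grep '^# HELP ' |            # keep only the HELP lines
    sed 's/^# HELP //' |       # drop the "# HELP " prefix
    sed 's/ /|/' |             # first space separates name from description
    sed "s/^/${server}|/" |    # prepend the server type column
    uniq                       # collapse adjacent duplicate rows
}

# Illustrative sample only (label and timestamp values are invented).
sample='# HELP hybrid_clock_skew Server clock skew.
# TYPE hybrid_clock_skew gauge
hybrid_clock_skew{metric_type="server"} 0 1700000000000
# HELP transaction_conflicts Number of conflicts detected among uncommitted distributed transactions.
# TYPE transaction_conflicts counter'

printf '%s\n' "$sample" | help_to_rows MASTER
```

Note that uniq only removes duplicates that sit next to each other, and that a description containing a literal | character would be split into an extra column by DELIMITER '|'.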
				
			

Now it’s super simple to look for a particular metric of interest to scrape…

				
					yugabyte=# SELECT metric, description FROM available_prometheus_metrics WHERE server = 'MASTER' AND description ILIKE '%clock%' ORDER BY metric;
                           metric                            |                                   description
-------------------------------------------------------------+---------------------------------------------------------------------------------
 handler_latency_yb_server_GenericService_ServerClock        | Microseconds spent handling yb.server.GenericService.ServerClock() RPC requests
 handler_latency_yb_server_GenericService_ServerClock_count  | Microseconds spent handling yb.server.GenericService.ServerClock() RPC requests
 handler_latency_yb_server_GenericService_ServerClock_sum    | Microseconds spent handling yb.server.GenericService.ServerClock() RPC requests
 hybrid_clock_error                                          | Server clock maximum error.
 hybrid_clock_hybrid_time                                    | Hybrid clock hybrid_time.
 hybrid_clock_skew                                           | Server clock skew.
 service_request_bytes_yb_server_GenericService_ServerClock  | Bytes received by yb.server.GenericService.ServerClock() RPC requests
 service_response_bytes_yb_server_GenericService_ServerClock | Bytes sent in response to yb.server.GenericService.ServerClock() RPC requests
(8 rows)
				
			
				
					yugabyte=# SELECT metric, description FROM available_prometheus_metrics WHERE server = 'TSERVER' AND description ILIKE '%conflict%' ORDER BY metric;
                   metric                   |                                       description
--------------------------------------------+-----------------------------------------------------------------------------------------
 conflict_resolution_latency_count          | Microseconds spent on conflict resolution across all transactions at the current tablet
 conflict_resolution_latency_sum            | Microseconds spent on conflict resolution across all transactions at the current tablet
 conflict_resolution_num_keys_scanned_count | Number of keys scanned during conflict resolution)
 conflict_resolution_num_keys_scanned_sum   | Number of keys scanned during conflict resolution)
 transaction_conflicts                      | Number of conflicts detected among uncommitted distributed transactions.
(5 rows)
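
Once a metric name has been picked from the table, its current sample lines can be pulled out of the same endpoint. The following is a rough sketch under stated assumptions: the function name metric_samples is hypothetical, and the sample text (labels, values, timestamps) is invented for illustration rather than copied from a real node.

```shell
#!/bin/sh
# Hypothetical helper: print the sample lines for one metric from
# Prometheus exposition text on stdin.
metric_samples() {
  metric="$1"
  # Match the exact name followed by "{" (labels) or a space (no labels),
  # so that longer names sharing the same prefix are not matched.
  grep -E "^${metric}(\{| )"
}

# Illustrative sample only (labels, values, and timestamps are invented).
sample='# HELP conflict_resolution_latency_sum Microseconds spent on conflict resolution across all transactions at the current tablet
conflict_resolution_latency_sum{metric_type="tablet"} 42 1700000000000
conflict_resolution_latency_count{metric_type="tablet"} 7 1700000000000'

printf '%s\n' "$sample" | metric_samples conflict_resolution_latency_sum

# Against a live node the input would instead come from, for example:
#   curl -s http://<node IP>:9000/prometheus-metrics | metric_samples transaction_conflicts
```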
				
			

Have Fun!
