Use VmHWM to Catch “Memory-Heavy” YSQL Backends (and Help Prevent OOMs)

TL;DR

A previous YugabyteDB tip, Measure Total Postgres Backend Usage, showed how to measure total current backend memory usage using PSS (Proportional Set Size). That tells you how close your system is to memory pressure right now.

This tip adds a second lens: VmHWM (“High Water Mark”) from /proc/<pid>/status, which reveals the peak resident memory each backend has reached during its lifetime.

Together, these two metrics help you:
  • Understand current memory pressure
  • Identify spike-capable sessions
  • Reduce the risk of unexpected OOM kills

Why This Matters for OOM Prevention

Out-of-memory (OOM) events in distributed systems are rarely caused by steady, predictable usage.

More often, they happen like this:

  1. A backend runs a large sort, hash join, or complex aggregation.

  2. Memory usage spikes significantly.

  3. The query finishes and memory is “freed.”

  4. Everything looks fine… until several sessions spike at the same time.

  5. The kernel’s OOM killer steps in.

If you only look at current memory (VmRSS), you may miss the sessions that are capable of causing those spikes.

That’s where VmHWM becomes extremely valuable.

What VmHWM Actually Measures

In Linux, every process exposes memory stats in:

    /proc/<pid>/status

Key fields:

  • VmRSS → Current resident memory (physical RAM)

  • VmHWM → Peak resident memory (“High Water Mark”)

  • VmPeak → Peak virtual memory (address space, not necessarily backed by RAM)
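
For a quick look at all three fields, you can read them for the current shell process (this assumes Linux, since /proc/self/status refers to whichever process opens it):

```shell
# Print the three memory fields for the current process.
# On Linux, /proc/self/status is the status file of the reader itself.
for field in VmPeak VmHWM VmRSS; do
  grep "^${field}:" /proc/self/status
done
```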

The Critical Difference

VmRSS answers:

  • “How much RAM is this backend using right now?”

VmHWM answers:

  • “What is the maximum RAM this backend has required at any point since it started?”

That makes VmHWM ideal for:

  • Identifying historically memory-heavy sessions

  • Understanding worst-case backend demand

  • Capacity planning

  • Explaining intermittent OOMs

How This Complements the Previous YugabyteDB Tip

Use Both Metrics Together

PSS (from a previous YugabyteDB tip) → Measures total backend memory usage right now without massively double-counting shared memory.

VmHWM (this tip) → Identifies which backends have proven they can spike large amounts of RAM.

Together they give you:
  • Real-time pressure visibility
  • Worst-case backend awareness
  • Better guardrails against OOM

Think of it this way:

  • PSS = “Are we close to the cliff?”

  • VmHWM = “Which sessions have demonstrated they can push us off the cliff?”

Important: Do NOT Sum VmHWM Across Backends

Just like RSS, VmHWM includes memory that may reflect shared segments.

In PostgreSQL-style architectures (including YSQL in YugabyteDB):

  • Each backend maps shared memory.

  • Shared memory may appear in each process’s resident numbers.

  • Summing VmHWM across all processes will greatly overestimate true RAM usage.

VmHWM should be used to:

  • Rank and identify outliers

  • Detect spike-capable backends

  • Inform tuning decisions

Not to calculate total node memory.
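
If you want a per-process number that is safe to sum, Linux (kernel 4.14+) exposes PSS in /proc/<pid>/smaps_rollup. A minimal sketch, using the current process so no extra privileges are needed:

```shell
# Rss vs. Pss for the current process, from /proc/self/smaps_rollup.
# Pss charges each shared page as (page size / number of mappers),
# so Pss <= Rss, and summing Pss across backends avoids the double
# counting that summing Rss (or VmHWM) introduces.
rss_kb=$(awk '/^Rss:/ {print $2}' /proc/self/smaps_rollup)
pss_kb=$(awk '/^Pss:/ {print $2}' /proc/self/smaps_rollup)
echo "Rss=${rss_kb} kB  Pss=${pss_kb} kB"
```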

How to Check VmHWM for a Backend

To check a specific backend:

    grep VmHWM /proc/<pid>/status

To list the top 20 YSQL backends by peak resident memory:

# Print header (match exact widths used later)
printf "%10s  %-24s  %12s  %12s  %s\n" \
  "PID" "TYPE" "VmHWM_MB" "VmRSS_MB" "DETAIL"

printf "%10s  %-24s  %12s  %12s  %s\n" \
  "----------" "------------------------" "------------" "------------" "------------------------------"

# Collect tab-separated raw rows first
{
for pid in $(ps -eo pid,comm | awk '$2=="postgres"{print $1}'); do
  cmd=$(ps -p "$pid" -o args= 2>/dev/null || true)
  [[ -n "$cmd" ]] || continue

  role=""
  if [[ "$cmd" == *"postgres:"* ]]; then
    role="${cmd#*postgres: }"
  fi

  if [[ "$cmd" != *"postgres:"* ]] && [[ "$cmd" == *"/postgres/bin/postgres"* || "$cmd" == *"/postgres" || "$cmd" == postgres* ]]; then
    type="YSQL postmaster parent"
    detail="${cmd:0:120}"

  elif [[ "$role" == YSQL\ webserver* ]]; then
    type="YSQL webserver"
    detail="$role"

  elif [[ "$role" == yb_ash\ collector* ]]; then
    type="yb_ash collector"
    detail="$role"

  elif [[ "$role" =~ ^(logger|checkpointer|background\ writer|walwriter|autovacuum\ launcher|stats\ collector|logical\ replication\ launcher) ]]; then
    type="Background worker"
    detail="$role"

  elif [[ -n "$role" ]]; then
    type="Client backend"
    detail="$role"
  else
    type="Other"
    detail="${cmd:0:120}"
  fi

  status_file="/proc/$pid/status"
  [[ -r "$status_file" ]] || continue

  hwm_kb=$(awk '/^VmHWM:/ {print $2}' "$status_file" 2>/dev/null)
  rss_kb=$(awk '/^VmRSS:/ {print $2}' "$status_file" 2>/dev/null)
  [[ -n "$hwm_kb" ]] || continue
  rss_kb="${rss_kb:-0}"

  hwm_mb=$(awk "BEGIN {printf \"%.1f\", $hwm_kb/1024}")
  rss_mb=$(awk "BEGIN {printf \"%.1f\", $rss_kb/1024}")

  printf "%s\t%s\t%s\t%s\t%s\n" "$pid" "$type" "$hwm_mb" "$rss_mb" "${detail:0:120}"
done
} | sort -t $'\t' -k3,3nr | head -20 | \
awk -F $'\t' '{
  printf "%10s  %-24s  %12s  %12s  %s\n",
         $1, $2, $3, $4, $5
}'

Sample output:

       PID  TYPE                          VmHWM_MB      VmRSS_MB  DETAIL
----------  ------------------------  ------------  ------------  ------------------------------
   2061477  YSQL postmaster parent           215.7         215.7  /root/yugabyte-2025.2.0.0/postgres/bin/postgres -D /root/var/data/pg_data -p 5433 -h 127.0.0.1 -k /tmp/.yb.127.0.0.1:543
   2061552  Client backend                    44.9          44.9  yugabyte yugabyte 127.0.0.1(39086) idle
   2061487  yb_ash collector                  41.4          41.4  yb_ash collector
   2061482  YSQL webserver                    29.3          29.3  YSQL webserver
   2061124  YSQL postmaster parent            27.1          27.1  postgres
   2061480  Background worker                 18.1          18.1  checkpointer
   2061479  Background worker                 17.5          17.5  logger
   2061158  Client backend                    15.8          15.8  postgres yugaware 127.0.0.1(39552) idle
   2061174  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39570) idle
   2061175  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39584) idle
   2061176  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39600) idle
   2061177  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39604) idle
   2061178  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39612) idle
   2061179  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39618) idle
   2061180  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39626) idle
   2061181  Client backend                    12.3          12.3  postgres yugaware 127.0.0.1(39628) idle
   2061173  Client backend                    12.1          12.1  postgres yugaware 127.0.0.1(39558) idle
   2061154  Background worker                  9.6           9.6  walwriter
   2061155  Background worker                  8.0           8.0  autovacuum launcher
   2061157  Background worker                  7.1           7.1  logical replication launcher

🔎 Observations
  • Single-node yugabyted deployment. The ~216 MB postmaster is the active YSQL instance. VmHWM = VmRSS, indicating stable shared memory usage (no spike behavior).

  • Memory distribution looks healthy. Background workers, the YSQL webserver, and the yb_ash collector are all within expected ranges.

  • One elevated client backend (~45 MB). Most idle sessions sit around 12–15 MB. The higher session likely executed a heavier query earlier and retained memory. Not a concern by itself… but something to watch if many accumulate.

Nothing looks abnormal from an OOM standpoint.

👀 What to Look For
  • Multiple client backends with VmHWM steadily increasing

  • Idle sessions holding large resident memory

  • VmHWM significantly higher than VmRSS (indicates prior spikes)

  • Rapid growth in total backend PSS combined with rising per-backend VmHWM
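
The third signal above (VmHWM well above VmRSS) is cheap to compute per process. A minimal sketch, using the current shell’s pid as a stand-in for a real backend pid:

```shell
# How far has this process come down from its peak resident memory?
# VmHWM never decreases (unless explicitly reset via clear_refs),
# so the delta is always >= 0.
pid=$$   # substitute a YSQL backend pid here
delta_mb=$(awk '/^VmHWM:/ {hwm=$2}
                /^VmRSS:/ {rss=$2}
                END {printf "%.1f", (hwm - rss) / 1024}' "/proc/$pid/status")
echo "pid $pid: ${delta_mb} MB released since peak"
```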

Even Better (Filter Only Client Backends)

If you’re specifically hunting OOM risks, you probably want to ignore:

  • logger

  • checkpointer

  • walwriter

  • yb_ash collector

  • YSQL webserver

Save the following script as yb_top_backend_mem.sh:

#!/usr/bin/env bash
#
# yb_top_backend_mem.sh
#
# Show the Top-N YSQL processes ranked by peak resident memory (VmHWM),
# with current resident memory (VmRSS), DELTA (VmHWM - VmRSS),
# and a best-effort process classification.
#
# Enhancements:
#   --sort-delta           Sort by DELTA_MB instead of VmHWM_MB
#   Summary line           Total VmRSS MB across displayed rows + (optional) all scanned rows
#   Color highlighting     DELTA_MB warning/critical thresholds
#
# Columns:
#   PID | TYPE | VmHWM_MB | VmRSS_MB | DELTA_MB | DETAIL
#

set -euo pipefail

TOP_N=20
ONLY_CLIENT=0

HOST="${HOST:-127.0.0.1}"
PORT="${PORT:-5433}"
USER_NAME="${USER_NAME:-yugabyte}"
DB_NAME="${DB_NAME:-yugabyte}"
YSQLSH="${YSQLSH:-ysqlsh}"
TIMEOUT_S="${TIMEOUT_S:-5}"

SORT_MODE="hwm"                 # hwm | delta
DELTA_WARN_MB="${DELTA_WARN_MB:-50}"
DELTA_CRIT_MB="${DELTA_CRIT_MB:-200}"

COLOR_MODE="auto"               # auto | on | off

usage() {
  cat <<EOF
Usage: $(basename "$0") [options]
  -n, --top N                 Show top N rows (default: ${TOP_N})
  -h, --host HOST             YSQL host (default: ${HOST})
  -p, --port PORT             YSQL port (default: ${PORT})
  -U, --user USER             YSQL user (default: ${USER_NAME})
  -d, --db DB                 Database name (default: ${DB_NAME})
  --timeout SEC               statement_timeout in seconds (default: ${TIMEOUT_S})
  --only-client               Show only client backends (from pg_stat_activity)
  --sort-delta                Sort by DELTA_MB (spike hunting mode)
  --delta-warn-mb N           Warn threshold for DELTA_MB (default: ${DELTA_WARN_MB})
  --delta-crit-mb N           Crit threshold for DELTA_MB (default: ${DELTA_CRIT_MB})
  --color                     Force color output
  --no-color                  Disable color output
  --help                      Show help

Environment:
  YSQLSH                       Path to ysqlsh (default: ysqlsh)
  HOST, PORT, USER_NAME, DB_NAME, TIMEOUT_S
  DELTA_WARN_MB, DELTA_CRIT_MB
EOF
}

while [[ $# -gt 0 ]]; do
  case "$1" in
    -n|--top) TOP_N="$2"; shift 2;;
    -h|--host) HOST="$2"; shift 2;;
    -p|--port) PORT="$2"; shift 2;;
    -U|--user) USER_NAME="$2"; shift 2;;
    -d|--db) DB_NAME="$2"; shift 2;;
    --timeout) TIMEOUT_S="$2"; shift 2;;
    --only-client) ONLY_CLIENT=1; shift;;
    --sort-delta) SORT_MODE="delta"; shift;;
    --delta-warn-mb) DELTA_WARN_MB="$2"; shift 2;;
    --delta-crit-mb) DELTA_CRIT_MB="$2"; shift 2;;
    --color) COLOR_MODE="on"; shift;;
    --no-color) COLOR_MODE="off"; shift;;
    --help) usage; exit 0;;
    *) echo "Unknown option: $1" >&2; usage; exit 2;;
  esac
done

command -v "$YSQLSH" >/dev/null 2>&1 || {
  echo "ERROR: '$YSQLSH' not found in PATH." >&2
  exit 1
}

timeout_ms=$(( TIMEOUT_S * 1000 ))

# Determine whether to use color
use_color=0
if [[ "$COLOR_MODE" == "on" ]]; then
  use_color=1
elif [[ "$COLOR_MODE" == "auto" && -t 1 ]]; then
  use_color=1
fi

# ANSI colors (only used if use_color=1)
RED=$'\033[31m'
YELLOW=$'\033[33m'
RESET=$'\033[0m'

CLIENT_PIDS=""
if [[ "$ONLY_CLIENT" -eq 1 ]]; then
  SQL_PIDS="SET statement_timeout = ${timeout_ms};
SELECT pid
FROM pg_stat_activity
WHERE backend_type = 'client backend'
ORDER BY pid;"

  if ! CLIENT_PIDS=$("$YSQLSH" -h "$HOST" -p "$PORT" -U "$USER_NAME" -d "$DB_NAME" \
    -A -t -F $'\n' -c "$SQL_PIDS"); then
    cat >&2 <<EOF
ERROR: Failed to run ysqlsh query.

Try:
  $YSQLSH -h $HOST -p $PORT -U $USER_NAME -d $DB_NAME -c "SELECT 1;"
EOF
    exit 1
  fi
fi

printf "%10s  %-24s  %12s  %12s  %12s  %s\n" \
  "PID" "TYPE" "VmHWM_MB" "VmRSS_MB" "DELTA_MB" "DETAIL"
printf "%10s  %-24s  %12s  %12s  %12s  %s\n" \
  "----------" "------------------------" "------------" "------------" "------------" "------------------------------"

# Choose sort key (3=VmHWM_MB, 5=DELTA_MB in tab output)
sort_key_col=3
if [[ "$SORT_MODE" == "delta" ]]; then
  sort_key_col=5
fi

# We'll compute totals over:
#  - all scanned rows (ALL_RSS_SUM_MB)
#  - displayed top-N rows (TOP_RSS_SUM_MB)
ALL_RSS_SUM_MB=0

# Collect rows (tab-separated), while also accumulating ALL_RSS_SUM_MB
rows_tmp="$(mktemp)"
trap 'rm -f "$rows_tmp"' EXIT

{
  if [[ "$ONLY_CLIENT" -eq 1 ]]; then
    while read -r pid; do
      [[ -n "${pid:-}" ]] || continue

      cmd=$(ps -p "$pid" -o args= 2>/dev/null || true)
      [[ -n "$cmd" ]] || continue

      status_file="/proc/$pid/status"
      [[ -r "$status_file" ]] || continue

      hwm_kb=$(awk '/^VmHWM:/ {print $2}' "$status_file")
      rss_kb=$(awk '/^VmRSS:/ {print $2}' "$status_file")
      [[ -n "${hwm_kb:-}" ]] || continue
      rss_kb="${rss_kb:-0}"

      hwm_mb=$(awk "BEGIN {printf \"%.1f\", $hwm_kb/1024}")
      rss_mb=$(awk "BEGIN {printf \"%.1f\", $rss_kb/1024}")
      delta_mb=$(awk "BEGIN {printf \"%.1f\", ($hwm_kb-$rss_kb)/1024}")

      # Accumulate ALL_RSS_SUM_MB
      ALL_RSS_SUM_MB=$(awk "BEGIN {printf \"%.1f\", ${ALL_RSS_SUM_MB}+${rss_mb}}")

      detail="$cmd"
      if [[ "$cmd" == *"postgres:"* ]]; then
        detail="${cmd#*postgres: }"
      fi

      printf "%s\t%s\t%s\t%s\t%s\t%s\n" \
        "$pid" "Client backend" "$hwm_mb" "$rss_mb" "$delta_mb" "${detail:0:140}"
    done <<< "$CLIENT_PIDS"
  else
    for pid in $(ps -eo pid,comm | awk '$2=="postgres"{print $1}'); do
      cmd=$(ps -p "$pid" -o args= 2>/dev/null || true)
      [[ -n "$cmd" ]] || continue

      role=""
      if [[ "$cmd" == *"postgres:"* ]]; then
        role="${cmd#*postgres: }"
      fi

      type="Other"
      detail="${cmd:0:140}"

      if [[ "$cmd" != *"postgres:"* ]] && [[ "$cmd" == *"/postgres/bin/postgres"* || "$cmd" == *"/postgres" || "$cmd" == postgres* ]]; then
        type="YSQL postmaster parent"
      elif [[ "$role" == YSQL\ webserver* ]]; then
        type="YSQL webserver"; detail="$role"
      elif [[ "$role" == yb_ash\ collector* ]]; then
        type="yb_ash collector"; detail="$role"
      elif [[ "$role" =~ ^(logger|checkpointer|background\ writer|walwriter|autovacuum\ launcher|stats\ collector|logical\ replication\ launcher) ]]; then
        type="Background worker"; detail="$role"
      elif [[ -n "$role" ]]; then
        type="Client backend"; detail="$role"
      fi

      status_file="/proc/$pid/status"
      [[ -r "$status_file" ]] || continue

      hwm_kb=$(awk '/^VmHWM:/ {print $2}' "$status_file")
      rss_kb=$(awk '/^VmRSS:/ {print $2}' "$status_file")
      [[ -n "${hwm_kb:-}" ]] || continue
      rss_kb="${rss_kb:-0}"

      hwm_mb=$(awk "BEGIN {printf \"%.1f\", $hwm_kb/1024}")
      rss_mb=$(awk "BEGIN {printf \"%.1f\", $rss_kb/1024}")
      delta_mb=$(awk "BEGIN {printf \"%.1f\", ($hwm_kb-$rss_kb)/1024}")


      ALL_RSS_SUM_MB=$(awk "BEGIN {printf \"%.1f\", ${ALL_RSS_SUM_MB}+${rss_mb}}")

      printf "%s\t%s\t%s\t%s\t%s\t%s\n" \
        "$pid" "$type" "$hwm_mb" "$rss_mb" "$delta_mb" "${detail:0:140}"
    done
  fi
} > "$rows_tmp"

# Select Top-N
top_tmp="$(mktemp)"
trap 'rm -f "$rows_tmp" "$top_tmp"' EXIT

sort -t $'\t' -k${sort_key_col},${sort_key_col}nr "$rows_tmp" | head -n "$TOP_N" > "$top_tmp"

# Compute TOP_RSS_SUM_MB
TOP_RSS_SUM_MB=$(awk -F $'\t' '{sum+=$4} END {printf "%.1f", sum+0}' "$top_tmp")

# Print Top-N with optional color highlighting based on DELTA_MB
awk -F $'\t' -v use_color="$use_color" -v warn="$DELTA_WARN_MB" -v crit="$DELTA_CRIT_MB" \
  -v RED="$RED" -v YELLOW="$YELLOW" -v RESET="$RESET" '
{
  pid=$1; type=$2; hwm=$3; rss=$4; delta=$5; detail=$6;

  delta_out=delta;
  if (use_color==1) {
    if (delta+0 >= crit+0) delta_out=RED delta RESET;
    else if (delta+0 >= warn+0) delta_out=YELLOW delta RESET;
  }

  printf "%10s  %-24s  %12s  %12s  %12s  %s\n", pid, type, hwm, rss, delta_out, detail;
}' "$top_tmp"


# Summary
printf "\n%s\n" "Summary"
printf '%s\n' "-------"

printf "  %-34s %12s\n" \
  "Sort mode:" "$( [[ "$SORT_MODE" == "delta" ]] && echo "DELTA_MB" || echo "VmHWM_MB" )"

printf "  %-34s %12.1f MB\n" \
  "Total VmRSS (all scanned rows):" "$ALL_RSS_SUM_MB"

printf "  %-34s %12.1f MB\n" \
  "Total VmRSS (displayed Top-${TOP_N}):" "$TOP_RSS_SUM_MB"

printf "  %-34s %12.1f MB\n" \
  "DELTA warn threshold:" "$DELTA_WARN_MB"

printf "  %-34s %12.1f MB\n" \
  "DELTA crit threshold:" "$DELTA_CRIT_MB"

Make it executable:

    chmod +x yb_top_backend_mem.sh

If you want the script to be OOM-focused, run:

    ./yb_top_backend_mem.sh --only-client -n 20

That removes parent/worker noise and highlights “which sessions are the biggest right now / historically.”

Example:

[root@localhost yb]# ./yb_top_backend_mem.sh --only-client -n 20
       PID  TYPE                          VmHWM_MB      VmRSS_MB      DELTA_MB  DETAIL
----------  ------------------------  ------------  ------------  ------------  ------------------------------
   2100666  Client backend                    53.4          53.4           0.0  yugabyte yugabyte 127.0.0.1(35762) idle
   2100649  Client backend                    45.1          45.1           0.0  yugabyte yugabyte 127.0.0.1(35760) idle
   2061552  Client backend                    45.0          45.0           0.0  yugabyte yugabyte 127.0.0.1(39086) idle

Summary
-------
  Sort mode:                             VmHWM_MB
  Total VmRSS (all scanned rows):           143.5 MB
  Total VmRSS (displayed Top-20):           143.5 MB
  DELTA warn threshold:                      50.0 MB
  DELTA crit threshold:                     200.0 MB

🔎 What DELTA Shows
  • 0.0 → Process currently at peak memory

  • > 0 → Process previously spiked and released memory

  • Large DELTA → “Spike-capable” backend (important for OOM analysis)

🎯 Conclusion

OOMs don’t happen because of steady memory usage.

They happen because of spikes.

  • PSS shows total backend memory pressure right now.

  • VmHWM shows which sessions have proven they can spike.

  • DELTA highlights backends that already did.

Use both metrics together and you move from reactive firefighting to proactive guardrails.

In YugabyteDB, every YSQL backend is a real OS process. If one spikes, the kernel notices.

If you want to stay ahead of OOM:

  1. Measure the present.
  2. Remember the past.
  3. Plan for the worst case.

Have Fun!

I’ve seen plenty of cars in Pittsburgh with the boot on them, but I’d never actually watched one get put on… until yesterday. Those guys were in and out in minutes... probably moving fast in case the driver came running back mid-install. 😅