In the previous tip, Validating Python SELinux Bindings on YugabyteDB Database Nodes (with Automation Support), we introduced a script to validate an easy-to-miss YugabyteDB prerequisite:
- Database nodes must have the Python SELinux package that corresponds to the Python version in use.
That check works great on a single node, but in real deployments, you almost always want to validate every node in the universe before installation or provisioning.
In this follow-up tip, we'll show how to:
- Run the same validation across many nodes from one server
- Support pre-install environments (no `yugabyte` user required)
- Scale cleanly from 3 nodes to 1000+ nodes
- Collect results in JSON for automation
- Print a concise, human-readable summary for tickets and reviews
Why this matters (especially before install)
Validating only one node is risky. In practice:
- One node might be running Python 3.11 while others use 3.9
- SELinux bindings might be installed everywhere except one node
- OS images can drift over time
These issues tend to surface during provisioning, when they're slow and painful to debug.
Running this check before YugabyteDB is installed gives you confidence that every node meets the requirement, and lets you fix issues early.
SSH access: flexible by design
This approach does not assume:
- that the `yugabyte` OS user exists
- that YugabyteDB is installed
- that YugabyteDB Anywhere requires SSH (newer versions don't, after provisioning)
Instead, it works with whatever SSH access you already have, for example:
- `ec2-user` (Amazon Linux)
- `ubuntu`
- `centos`, `rocky`, `almalinux`
- `root`
- a corporate admin account
The only requirement is that the SSH user can run `python3` on the node.
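You can confirm that requirement before running anything heavier. Below is a minimal preflight sketch, assuming key-based SSH and the `nodes.txt` format described in Step 1; `check_node` is a hypothetical helper for this sketch, not part of the runner script:

```shell
# check_node: succeed only if the SSH user on the node can run python3.
check_node() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" 'python3 --version' >/dev/null 2>&1
}

# Preflight every entry in nodes.txt, skipping comments and blank lines.
if [ -f nodes.txt ]; then
  while read -r node; do
    case "$node" in ''|'#'*) continue ;; esac
    if check_node "$node"; then echo "OK   $node"; else echo "FAIL $node"; fi
  done < nodes.txt
fi
```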
Step 1: Prepare a node list
Create a simple `nodes.txt` file with one entry per line:

```
10.0.1.10
ubuntu@10.0.1.11
root@db-node-03.example.com
```

- If an entry includes `user@host`, that user is used
- If it's just a hostname or IP, the default SSH user is applied (the `SSH_USER` environment variable, `yugabyte` if unset)
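That resolution rule is a one-liner; here it is sketched standalone (it mirrors the `make_target` helper inside the runner script in Step 2):

```shell
# Standalone sketch of the entry-resolution rule:
# keep an explicit user@host as-is, otherwise prepend the default SSH user.
SSH_USER="${SSH_USER:-yugabyte}"

make_target() {
  if [[ "$1" == *@* ]]; then
    printf '%s\n' "$1"
  else
    printf '%s@%s\n' "$SSH_USER" "$1"
  fi
}

make_target "10.0.1.10"          # yugabyte@10.0.1.10 when SSH_USER is unset
make_target "ubuntu@10.0.1.11"   # ubuntu@10.0.1.11 (explicit user kept)
```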
Step 2: Use the parallel remote runner
This runner script:
- SSHes to each node
- Streams `yb_python_selinux_check.sh --json` remotely (no copy needed)
- Runs checks in parallel (configurable)
- Writes:
  - a combined JSON file
  - a concise summary table
- Exits non-zero if any node fails (automation-friendly)
It also supports batch vs no-batch SSH, which is critical for different environments.
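The "no copy needed" streaming deserves a quick aside: it relies on `bash -s`, which reads the script body from stdin and treats everything after `--` as positional parameters. A local sketch of the mechanism:

```shell
# bash -s reads the script from stdin; args after '--' become $1, $2, ...
echo 'echo "first arg = $1"' | bash -s -- --json
# prints: first arg = --json

# Over SSH this is the same trick the runner uses, so the check script never
# needs to be copied to the node (host shown is from this article's examples):
#   ssh ubuntu@10.0.1.11 'bash -s -- --json' < ./yb_python_selinux_check.sh
```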
Save as: `run_yb_python_selinux_check_remote_parallel.sh`

```bash
#!/usr/bin/env bash
set -euo pipefail

NODES_FILE="${1:-nodes.txt}"
shift || true

# Tunables (override via environment)
SSH_USER="${SSH_USER:-yugabyte}"
CONNECT_TIMEOUT="${CONNECT_TIMEOUT:-6}"
OUTFILE_JSON="${OUTFILE_JSON:-yb_selinux_results.json}"
OUTFILE_SUMMARY="${OUTFILE_SUMMARY:-yb_selinux_summary.tsv}"

PARALLEL=10
STRICT_MODE=false
BATCH_MODE=true   # default = non-interactive (best for scale)

while [[ $# -gt 0 ]]; do
  case "$1" in
    --parallel) PARALLEL="${2:-10}"; shift 2 ;;
    --strict)   STRICT_MODE=true; shift ;;
    --batch)    BATCH_MODE=true; shift ;;
    --no-batch) BATCH_MODE=false; shift ;;
    *) echo "Unknown arg: $1" >&2; exit 2 ;;
  esac
done

if [[ ! -f "$NODES_FILE" ]]; then
  echo "ERROR: nodes file not found: $NODES_FILE" >&2
  exit 2
fi
if [[ ! -f "./yb_python_selinux_check.sh" ]]; then
  echo "ERROR: ./yb_python_selinux_check.sh not found in current directory" >&2
  exit 2
fi

# Load the node list, skipping blank lines and comments
mapfile -t NODES < <(grep -vE '^\s*(#|$)' "$NODES_FILE")
[[ "${#NODES[@]}" -eq 0 ]] && { echo "ERROR: no nodes found"; exit 2; }

TMPDIR="$(mktemp -d)"
trap 'rm -rf "$TMPDIR"' EXIT

json_escape() {
  local s="${1:-}"
  s="${s//\\/\\\\}"
  s="${s//\"/\\\"}"
  s="${s//$'\n'/\\n}"
  printf '%s' "$s"
}

# Prepend the default SSH user unless the entry already contains user@host
make_target() {
  [[ "$1" == *@* ]] && printf "%s" "$1" || printf "%s@%s" "$SSH_USER" "$1"
}

worker() {
  local entry="$1"
  local target ts raw rc payload
  local out_json out_tsv
  target="$(make_target "$entry")"
  ts="$(date -Iseconds)"
  out_json="$TMPDIR/$(echo "$target" | tr '/:@' '___').json"
  out_tsv="$TMPDIR/$(echo "$target" | tr '/:@' '___').tsv"

  local ssh_opts=(
    -o ConnectTimeout="$CONNECT_TIMEOUT"
    -o StrictHostKeyChecking=accept-new
  )
  $BATCH_MODE && ssh_opts+=(-o BatchMode=yes) || ssh_opts+=(-o BatchMode=no)

  local remote_args=(--json)
  $STRICT_MODE && remote_args+=(--strict)

  # Stream the check script to the remote bash over stdin (no copy needed)
  set +e
  raw="$(ssh "${ssh_opts[@]}" "$target" "bash -s -- ${remote_args[*]}" < ./yb_python_selinux_check.sh 2>&1)"
  rc=$?
  set -e

  if [[ $rc -eq 0 || $rc -eq 1 ]] && [[ "$raw" == \{* ]]; then
    payload="$raw"
  else
    payload="{\"status\":\"error\",\"exit_code\":$rc,\"reason\":\"ssh_failed\",\"raw\":\"$(json_escape "$raw")\"}"
  fi

  printf '{"target":"%s","timestamp":"%s","result":%s}\n' \
    "$(json_escape "$target")" "$(json_escape "$ts")" "$payload" > "$out_json"

  printf "%s\t%s\t%s\t%s\t%s\n" \
    "$target" \
    "$(echo "$payload" | sed -n 's/.*"id":"\([^"]*\)".*/\1/p')" \
    "$(echo "$payload" | sed -n 's/.*"version_id":"\([^"]*\)".*/\1/p')" \
    "$(echo "$payload" | sed -n 's/.*"version":"Python \([^"]*\)".*/\1/p')" \
    "$(echo "$payload" | sed -n 's/.*"status":"\([^"]*\)".*/\1/p')" \
    > "$out_tsv"
}

export -f worker make_target json_escape
export SSH_USER CONNECT_TIMEOUT STRICT_MODE BATCH_MODE TMPDIR

# Fan out across nodes, PARALLEL at a time
printf "%s\n" "${NODES[@]}" | xargs -P "$PARALLEL" -I{} bash -lc 'worker "$@"' _ {}

{
  echo -e "target\tos_id\tos_version_id\tpython_version\tstatus"
  cat "$TMPDIR"/*.tsv | sort
} > "$OUTFILE_SUMMARY"

{
  echo "["
  paste -sd, "$TMPDIR"/*.json
  echo "]"
} > "$OUTFILE_JSON"

echo
printf "%-36s %-18s %-10s %-18s %-6s\n" "TARGET" "OS" "VER" "PYTHON" "STATUS"
printf "%-36s %-18s %-10s %-18s %-6s\n" "------------------------------------" "------------------" "----------" "------------------" "------"
any_bad=0
while IFS=$'\t' read -r target os_id os_ver py_ver status; do
  [[ "$target" == "target" ]] && continue
  [[ "$status" != "pass" ]] && any_bad=1
  printf "%-36s %-18s %-10s %-18s %-6s\n" \
    "$target" "${os_id:-unknown}" "${os_ver:-unknown}" "${py_ver:-unknown}" "${status^^}"
done < "$OUTFILE_SUMMARY"

echo
echo "Wrote:"
echo " - $OUTFILE_JSON"
echo " - $OUTFILE_SUMMARY"

exit $any_bad
```
Make it executable:

```shell
chmod +x run_yb_python_selinux_check_remote_parallel.sh
```
Step 3: Run it
Run it from the same directory where `yb_python_selinux_check.sh` and `nodes.txt` live.
Default behavior (batch mode – recommended)

```shell
./run_yb_python_selinux_check_remote_parallel.sh nodes.txt --parallel 25 --batch
```

- Uses `BatchMode=yes`
- No password prompts
- Requires SSH keys
- Best for automation and large node counts
Password-prompt mode (no-batch)

```shell
./run_yb_python_selinux_check_remote_parallel.sh nodes.txt --parallel 5 --no-batch
```

- Allows interactive password authentication
- Useful for:
  - demos
  - one-off troubleshooting
  - environments without SSH keys yet
- Not recommended for large node counts
Choosing `--parallel`
A good starting guide:

| Node Count | Suggested Parallelism |
|---|---|
| 3–12 | `--parallel 5` or `10` |
| 27–100 | `--parallel 20`–`40` |
| 100–1000 | `--parallel 50`+ (tune carefully) |
You're balancing:
- SSH handshakes
- DNS latency
- jump host CPU
- firewall / bastion limits
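Under the hood the fan-out is plain `xargs -P`, as the runner script shows, so you can get a feel for the knob with a toy command before pointing it at real nodes:

```shell
# Six one-second sleeps finish in roughly 2s at -P 3 (vs ~6s sequentially).
time (printf '%s\n' 1 2 3 4 5 6 | xargs -P 3 -I{} sh -c 'sleep 1; echo "done {}"')
```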
Example summary output

```
TARGET                               OS                 VER        PYTHON             STATUS
------------------------------------ ------------------ ---------- ------------------ ------
ubuntu@10.0.1.10                     almalinux          9.6        3.9.21             PASS
ubuntu@10.0.1.11                     almalinux          9.6        3.11.11            FAIL
root@db-node-03.example.com          rocky              9.4        3.9.18             PASS
```
This gives you exactly what you need at a glance:
- which nodes fail
- OS version
- Python version
- PASS / FAIL
How to Read the Output

| Status | Meaning |
|---|---|
| PASS | Confirms that Python is installed and that the active Python interpreter can successfully import the SELinux bindings (`import selinux`). This is the exact runtime requirement that YugabyteDB tooling depends on. |
| FAIL | Indicates that the active Python interpreter cannot import the SELinux bindings. This is most commonly caused by a mismatch between the Python version in use and the version targeted by the installed SELinux packages. |
| PASS (Strict Mode) | When the check is run with `--strict`, a PASS additionally confirms that a recognized SELinux Python package (for example, `python3-libselinux` or `libselinux-python`) is installed. This mode is useful for compliance- or audit-driven environments. |
Generated artifacts
After the run completes, you'll have:
- `yb_selinux_results.json` – a full JSON array with detailed results per node (ideal for audits, CI, or support tickets)
- `yb_selinux_summary.tsv` – a simple, sortable summary file you can:
  - open in Excel
  - paste into Slack
  - convert to Markdown
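Because the TSV's columns match the header the runner writes (target, os_id, os_version_id, python_version, status), standard tools can slice it. For example, listing only the failing nodes (guarded so the snippet is a no-op until the runner has produced the file):

```shell
# Print the target (column 1) of every node whose status (column 5) isn't "pass".
if [ -f yb_selinux_summary.tsv ]; then
  awk -F'\t' 'NR > 1 && $5 != "pass" { print $1 }' yb_selinux_summary.tsv
fi
```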
Conclusion
When validating prerequisites for a distributed database, checking a single node isn't enough. Small configuration differences, especially around Python versions, can quietly derail provisioning later.
This remote, parallel validation approach lets you:
- verify every node before install
- scale from a handful of nodes to thousands
- collect machine-readable JSON
- and still produce a clean, human-friendly summary
It's a lightweight step that fits perfectly into preflight checks and can save hours of troubleshooting down the line.
Have Fun!
