If you are a software engineer, you probably use Git every day.
Need to test something risky?
Create a branch.
Need to preserve a known-good state?
Create a tag.
Need to throw away a bad idea?
Delete the branch and move on.
But databases have traditionally not worked that way.
Testing schema changes, debugging production data, or running QA usually means copying large datasets, waiting, and burning resources.
With YugabyteDB Instant Cloning, that changes.
You can now treat your database more like code: branch it, test it, reset it, and drop it… all in seconds.
YugabyteDB Instant Cloning enables zero-copy, writable database branchesβmaking it possible to implement Git-like workflows directly at the database layer. Instead of restoring full backups (slow, disruptive, and often losing recent data), you can clone your database at a precise moment in time, inspect or extract exactly what you need, and recover without impacting production.
The Idea: yb-branch
This tip walks through a simple tool concept: yb-branch.
The goal is to make database workflows feel like Git:
- β branch
- β tag
- β reset
- β drop
- β list
- β prune
- β audit everything
Example:
yb-branch branch prod_db dev_feature_z --owner my_dev_role --create-tag z-initial-state
Or time-travel:
yb-branch branch prod_db recovery_db \
--owner dba_admin \
--as-of "$(date -d '5 minutes ago' +'%Y-%m-%d %H:%M:%S')"
The full CLI implementation supports all major lifecycle operations and integrates audit logging and tagging.
yb-branch.py:
#!/usr/bin/env python3
import argparse
import psycopg2
from psycopg2 import sql
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
import subprocess
import json
import os
import sys
# ==========================================
# CONFIGURATION VIA ENVIRONMENT VARIABLES
# ==========================================
# Core Database Credentials
DB_HOST = os.getenv("YB_HOST", "127.0.0.1")
DB_PORT = os.getenv("YB_PORT", "5433")
DB_USER = os.getenv("YB_USER", "yugabyte")
DB_PASS = os.getenv("YB_PASS", "yugabyte")
ADMIN_DB = os.getenv("YB_ADMIN_DB", "yugabyte")
SEED_MASTER = os.getenv("YB_SEED_MASTER", "127.0.0.1:7100")
# SSL / TLS Configuration
SSL_MODE = os.getenv("YB_SSL_MODE", "prefer") # e.g., disable, require, verify-ca, verify-full
SSL_ROOT_CERT = os.getenv("YB_SSL_ROOT_CERT") # Path to root.crt
SSL_CERT = os.getenv("YB_SSL_CERT") # Path to client certificate
SSL_KEY = os.getenv("YB_SSL_KEY") # Path to client key
CERTS_DIR = os.getenv("YB_CERTS_DIR") # Used by yb-admin for TLS-enabled clusters
# ==========================================
# UTILITY FUNCTIONS
# ==========================================
def get_db_connection(dbname=ADMIN_DB, autocommit=False):
"""Establishes a secure database connection with optional SSL."""
try:
# Build connection arguments dynamically to only include provided SSL paths
conn_args = {
"host": DB_HOST,
"port": DB_PORT,
"user": DB_USER,
"password": DB_PASS,
"dbname": dbname,
"sslmode": SSL_MODE
}
if SSL_ROOT_CERT: conn_args["sslrootcert"] = SSL_ROOT_CERT
if SSL_CERT: conn_args["sslcert"] = SSL_CERT
if SSL_KEY: conn_args["sslkey"] = SSL_KEY
conn = psycopg2.connect(**conn_args)
if autocommit:
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
return conn
except Exception as e:
print(f"[-] Database connection failed: {e}")
sys.exit(1)
def get_all_masters(seed_master):
"""Dynamically fetches all master addresses for high availability."""
cmd = ["yb-admin", "--master_addresses", seed_master]
if CERTS_DIR:
cmd.extend(["--certs_dir_name", CERTS_DIR])
cmd.append("list_all_masters")
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
lines = result.stdout.strip().split('\n')
master_addresses = []
for line in lines[1:]:
if line.strip():
parts = line.split()
if len(parts) >= 2:
master_addresses.append(parts[1])
if master_addresses:
return ",".join(master_addresses)
return seed_master
except Exception:
return seed_master
def get_db_oid(cursor, db_name):
"""Fetches the PostgreSQL OID for a given database."""
cursor.execute("SELECT oid FROM pg_database WHERE datname = %s", (db_name,))
row = cursor.fetchone()
return row[0] if row else None
def check_clone_status_via_ybadmin(dynamic_masters, source_db_oid):
"""Calls yb-admin list_clones to verify the cluster-level status."""
namespace_id = f"00004000000030008000{source_db_oid:012x}"
cmd = ["yb-admin", "--master_addresses", dynamic_masters]
if CERTS_DIR:
cmd.extend(["--certs_dir_name", CERTS_DIR])
cmd.extend(["list_clones", namespace_id])
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
return json.loads(result.stdout)
except Exception:
return []
# ==========================================
# CORE WORKFLOWS
# ==========================================
def create_branch(user, source_db, branch_db, tag=None, as_of=None, owner_role=None, new_tag=None):
"""Creates a database clone supporting tags or raw timestamps, and optionally tags the new clone."""
if tag and as_of:
print("[-] Error: Please specify either --tag OR --as-of, not both.")
return False
dynamic_masters = get_all_masters(SEED_MASTER)
conn = get_db_connection()
cur = conn.cursor()
target_time_str = None
if tag:
cur.execute("SELECT point_in_time FROM db_version_tags WHERE tag_name = %s", (tag,))
row = cur.fetchone()
if not row:
print(f"[-] Error: Tag '{tag}' not found in db_version_tags.")
return False
target_time_str = row[0].strftime('%Y-%m-%d %H:%M:%S.%f')
elif as_of:
target_time_str = as_of
cur.execute("CALL request_db_branch(%s, %s, %s, %s, NULL);",
(user, source_db, branch_db, target_time_str))
log_id = cur.fetchone()[0]
conn.commit()
source_oid = get_db_oid(cur, source_db)
cur.close()
conn.close()
if not source_oid:
print(f"[-] Error: Source database '{source_db}' does not exist.")
return False
ddl_conn = get_db_connection(autocommit=True)
ddl_cur = ddl_conn.cursor()
try:
ddl_cur.execute("CALL finalize_db_branch(%s, 'CLONING');", (log_id,))
print(f"[*] Initiating instant clone from '{source_db}' to '{branch_db}'...")
if target_time_str:
print(f"[*] Time travel active: Cloning AS OF {target_time_str}")
query = sql.SQL("CREATE DATABASE {} OWNER {} TEMPLATE {} AS OF {};").format(
sql.Identifier(branch_db), sql.Identifier(owner_role), sql.Identifier(source_db),
sql.Literal(target_time_str)
)
else:
query = sql.SQL("CREATE DATABASE {} OWNER {} TEMPLATE {};").format(
sql.Identifier(branch_db), sql.Identifier(owner_role), sql.Identifier(source_db)
)
ddl_cur.execute(query)
clones = check_clone_status_via_ybadmin(dynamic_masters, source_oid)
our_clone = next((c for c in clones if c.get('target_namespace_name') == branch_db), None)
if our_clone and our_clone.get('aggregate_state') != 'COMPLETE':
print("[!] Warning: YB-Master reports clone is not yet COMPLETE.")
ddl_cur.execute("CALL finalize_db_branch(%s, 'SUCCESS');", (log_id,))
print(f"[+] SUCCESS! Database branch '{branch_db}' is ready. Owned by: {owner_role}")
if new_tag:
print(f"[*] Automatically applying tag '{new_tag}' to new branch '{branch_db}'...")
create_tag(new_tag, branch_db, user)
return True
except Exception as e:
print(f"[-] Clone failed: {e}")
ddl_cur.execute("CALL finalize_db_branch(%s, 'FAILED');", (log_id,))
return False
finally:
ddl_cur.close()
ddl_conn.close()
def drop_branch(branch_name, assume_yes=False, user="system"):
"""Safely drops a clone database and updates the audit log."""
conn = get_db_connection()
cur = conn.cursor()
cur.execute("""
SELECT log_id, source_db
FROM db_clone_audit_log
WHERE target_clone_db = %s AND status = 'SUCCESS'
ORDER BY created_at DESC LIMIT 1;
""", (branch_name,))
row = cur.fetchone()
if not row:
print(f"[-] ERROR: '{branch_name}' is not recognized as an active clone.")
cur.close()
conn.close()
return False
log_id, source_db = row
cur.close()
conn.close()
if not assume_yes:
print(f"\n[!] WARNING: You are about to permanently drop '{branch_name}'.")
choice = input(" Are you sure you want to proceed? [y/N]: ").strip().lower()
if choice != 'y':
print("[*] Drop operation cancelled.")
return False
print(f"[*] Initiating drop for database '{branch_name}'...")
ddl_conn = get_db_connection(autocommit=True)
ddl_cur = ddl_conn.cursor()
try:
# 1. Lock the door: Prevent any NEW connections from being established
print(f"[*] Blocking new connections to '{branch_name}'...")
ddl_cur.execute(sql.SQL("ALTER DATABASE {} ALLOW_CONNECTIONS false;").format(sql.Identifier(branch_name)))
# 2. Kick everyone out: Terminate existing connections
print(f"[*] Terminating active connections to '{branch_name}'...")
ddl_cur.execute("""
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = %s AND pid <> pg_backend_pid();
""", (branch_name,))
# 3. Drop the database
ddl_cur.execute(sql.SQL("DROP DATABASE {};").format(sql.Identifier(branch_name)))
# 4. Update the audit log
log_conn = get_db_connection()
log_cur = log_conn.cursor()
log_cur.execute("UPDATE db_clone_audit_log SET status = 'DROPPED', dropped_at = NOW() WHERE log_id = %s;", (log_id,))
log_conn.commit()
log_conn.close()
print(f"[+] SUCCESS! Clone '{branch_name}' has been dropped.")
return True
except psycopg2.errors.ObjectInUse:
print(f"[-] ERROR: Cannot drop '{branch_name}'. Clients are currently connected.")
# If it fails, unlock the door so the database isn't left in a permanently locked state
try:
ddl_cur.execute(sql.SQL("ALTER DATABASE {} ALLOW_CONNECTIONS true;").format(sql.Identifier(branch_name)))
except:
pass
return False
except Exception as e:
print(f"[-] Failed to drop database: {e}")
# Attempt to unlock just in case it failed for another reason
try:
ddl_cur.execute(sql.SQL("ALTER DATABASE {} ALLOW_CONNECTIONS true;").format(sql.Identifier(branch_name)))
except:
pass
return False
finally:
ddl_conn.close()
def reset_branch(branch_name, assume_yes, user):
"""Drops an active branch and instantly recreates it from its original source."""
conn = get_db_connection()
cur = conn.cursor()
cur.execute("""
SELECT a.source_db, r.rolname as owner_role
FROM db_clone_audit_log a
INNER JOIN pg_database d ON a.target_clone_db = d.datname
INNER JOIN pg_roles r ON d.datdba = r.oid
WHERE a.target_clone_db = %s AND a.status = 'SUCCESS'
ORDER BY a.created_at DESC LIMIT 1;
""", (branch_name,))
row = cur.fetchone()
cur.close()
conn.close()
if not row:
print(f"[-] ERROR: Could not find active branch '{branch_name}' to reset.")
return
source_db, owner_role = row
print(f"[*] Resetting branch '{branch_name}' (Source: {source_db}, Owner: {owner_role})")
if drop_branch(branch_name, assume_yes, user):
print(f"[*] Recreating branch '{branch_name}'...")
create_branch(user, source_db, branch_name, None, None, owner_role, None)
def list_branches(owner_filter=None):
"""Lists all active database branches, when they were cloned, what they are based on, and their tags."""
conn = get_db_connection()
cur = conn.cursor()
query = """
SELECT
a.target_clone_db,
a.source_db,
r.rolname,
TO_CHAR(a.completed_at, 'YYYY-MM-DD HH24:MI'),
COALESCE(
(SELECT tag_name
FROM db_version_tags
WHERE source_db = a.source_db AND point_in_time = a.point_in_time
LIMIT 1),
CASE
WHEN a.point_in_time < a.created_at - INTERVAL '2 seconds'
THEN 'AS OF ' || TO_CHAR(a.point_in_time, 'YYYY-MM-DD HH24:MI')
ELSE 'latest'
END
) as based_on_tag,
COALESCE((
SELECT string_agg(tag_name, ', ')
FROM db_version_tags
WHERE source_db = a.target_clone_db
), '') as branch_tags
FROM db_clone_audit_log a
INNER JOIN pg_database d ON a.target_clone_db = d.datname
INNER JOIN pg_roles r ON d.datdba = r.oid
WHERE a.status = 'SUCCESS'
"""
params = []
if owner_filter:
query += " AND r.rolname = %s"
params.append(owner_filter)
query += " ORDER BY a.completed_at DESC;"
try:
cur.execute(query, params)
rows = cur.fetchall()
if not rows:
print("[*] No active branches found.")
return
print(f"{'BRANCH NAME':<20} | {'SOURCE DB':<15} | {'OWNER':<15} | {'CLONED AT':<16} | {'BASED ON':<22} | {'TAGS ON BRANCH'}")
print("-" * 117)
for r in rows:
print(f"{r[0]:<20} | {r[1]:<15} | {r[2]:<15} | {r[3]:<16} | {r[4]:<22} | {r[5]}")
except Exception as e:
print(f"[-] Failed to fetch branches: {e}")
finally:
cur.close()
conn.close()
def prune_branches(days, dry_run, assume_yes, user):
"""Garbage collects branches older than X days."""
conn = get_db_connection()
cur = conn.cursor()
cur.execute("""
SELECT a.target_clone_db, TO_CHAR(a.completed_at, 'YYYY-MM-DD')
FROM db_clone_audit_log a
INNER JOIN pg_database d ON a.target_clone_db = d.datname
WHERE a.status = 'SUCCESS' AND a.completed_at < NOW() - %s::interval;
""", (f"{days} days",))
rows = cur.fetchall()
cur.close()
conn.close()
if not rows:
print(f"[*] No active branches older than {days} days found.")
return
print(f"[*] Found {len(rows)} branch(es) older than {days} days.")
for branch, date in rows:
if dry_run:
print(f" [DRY RUN] Would drop: {branch} (Created: {date})")
else:
drop_branch(branch, assume_yes, user)
def create_tag(tag_name, source_db, user):
"""Creates a snapshot tag."""
conn = get_db_connection()
cur = conn.cursor()
try:
cur.execute("INSERT INTO db_version_tags (tag_name, source_db, point_in_time, created_by) VALUES (%s, %s, NOW(), %s)",
(tag_name, source_db, user))
conn.commit()
print(f"[+] Tag '{tag_name}' created successfully for '{source_db}'.")
except psycopg2.IntegrityError:
print(f"[-] Failed: Tag '{tag_name}' already exists.")
except Exception as e:
print(f"[-] Failed to create tag: {e}")
finally:
conn.close()
def list_tags():
"""Lists all release tags."""
conn = get_db_connection()
cur = conn.cursor()
cur.execute("SELECT tag_name, source_db, TO_CHAR(point_in_time, 'YYYY-MM-DD HH24:MI:SS'), created_by FROM db_version_tags ORDER BY point_in_time DESC;")
rows = cur.fetchall()
if not rows:
print("[*] No tags found.")
return
print(f"{'TAG NAME':<25} | {'DATABASE':<20} | {'CUT AT':<20} | {'CREATOR'}")
print("-" * 85)
for r in rows:
print(f"{r[0]:<25} | {r[1]:<20} | {r[2]:<20} | {r[3]}")
def show_log(limit):
"""Displays the branching history by reconstructing events from timestamps."""
conn = get_db_connection()
cur = conn.cursor()
# We fetch both the creation time and the dropped time
cur.execute("""
SELECT TO_CHAR(COALESCE(completed_at, created_at), 'YYYY-MM-DD HH24:MI:SS'),
TO_CHAR(dropped_at, 'YYYY-MM-DD HH24:MI:SS'),
requested_by, status, target_clone_db, source_db
FROM db_clone_audit_log
ORDER BY created_at DESC;
""")
rows = cur.fetchall()
events = []
for r in rows:
created_str, dropped_str, req, stat, branch, source = r
# 1. Everyone gets a creation event
create_stat = "SUCCESS" if stat == "DROPPED" else stat
events.append({
'time_str': created_str,
'msg': f"[{created_str}] {req} CREATED branch '{branch}' (Source: {source}) [Status: {create_stat}]"
})
# 2. If it was dropped, it gets a SECOND event in the timeline
if dropped_str:
events.append({
'time_str': dropped_str,
'msg': f"[{dropped_str}] {req} DROPPED branch '{branch}' (Source: {source}) [Status: DROPPED]"
})
# Sort all events chronologically (newest first) and apply the limit
events.sort(key=lambda x: x['time_str'], reverse=True)
for e in events[:limit]:
print(e['msg'])
cur.close()
conn.close()
# ==========================================
# CLI ENTRY POINT WITH EXTENSIVE HELP
# ==========================================
if __name__ == "__main__":
# --- SYS.ARGV INTERCEPTOR ---
# This automatically swaps "--help branch" into "branch --help"
# so argparse correctly triggers the subcommand help menu.
if len(sys.argv) == 3 and sys.argv[1] in ['-h', '--help', 'help']:
sys.argv[1], sys.argv[2] = sys.argv[2], '--help'
# ----------------------------
main_desc = """
yb-branch: Git-like Versioning for YugabyteDB
-------------------------------------------------
Instantly clone, version, reset, and prune YugabyteDB databases using zero-copy storage.
All operations are securely audited and natively integrate with YugabyteDB's
Instant Cloning and Point-In-Time-Recovery (PITR) features.
"""
main_epilog = """
ENVIRONMENT VARIABLES REQUIRED:
YB_HOST, YB_PORT, YB_USER, YB_PASS, YB_ADMIN_DB, YB_SEED_MASTER
SSL/TLS CONFIGURATION (Optional):
YB_SSL_MODE (Default: 'prefer'. Options: 'require', 'verify-ca', 'verify-full')
YB_SSL_ROOT_CERT (Path to root.crt)
YB_SSL_CERT (Path to client.crt)
YB_SSL_KEY (Path to client.key)
YB_CERTS_DIR (Path to the certs directory required by yb-admin)
"""
parser = argparse.ArgumentParser(
description=main_desc,
epilog=main_epilog,
formatter_class=argparse.RawDescriptionHelpFormatter
)
subparsers = parser.add_subparsers(dest="command", required=True, title="Available Commands")
# 1. Branch Command
branch_help = """
Create a new database branch (clone).
Examples:
yb-branch branch prod_db dev_db --owner user
yb-branch branch prod_db qa_db --owner user --tag v1.0
yb-branch branch prod_db inv_db --owner user --as-of '2026-05-04 13:00:00'
"""
branch_p = subparsers.add_parser("branch", help="Create a new database branch", description=branch_help, formatter_class=argparse.RawDescriptionHelpFormatter)
branch_p.add_argument("source", help="Source database to clone (e.g., 'prod_db')")
branch_p.add_argument("branch_name", help="Name of the new branch")
branch_p.add_argument("--owner", required=True, help="REQUIRED. Postgres Role that will own the branch")
branch_p.add_argument("--tag", help="Optional. Specific tag to clone from (time travel)")
branch_p.add_argument("--as-of", help="Optional. Specific timestamp to clone from (e.g., '2026-05-04 13:13:00' or Unix microseconds)")
branch_p.add_argument("--create-tag", help="Optional. Immediately tag the newly created branch with this name")
branch_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)
# 2. Drop Command
drop_p = subparsers.add_parser("drop", help="Safely drop a database branch")
drop_p.add_argument("branch_name", help="Branch to drop")
drop_p.add_argument("-y", "--yes", action="store_true", help="Skip confirmation prompt")
drop_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)
# 3. Reset Command
reset_p = subparsers.add_parser("reset", help="Drop and recreate a branch from its original source")
reset_p.add_argument("branch_name", help="Branch to reset")
reset_p.add_argument("-y", "--yes", action="store_true", help="Skip confirmation prompt")
reset_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)
# 4. List Command
list_p = subparsers.add_parser("list", help="List all active branches and their tags")
list_p.add_argument("--owner", help="Filter by specific Postgres role")
# 5. Prune Command
prune_p = subparsers.add_parser("prune", help="Garbage collect old branches")
prune_p.add_argument("--days", type=int, default=7, help="Drop branches older than X days (default: 7)")
prune_p.add_argument("--dry-run", action="store_true", help="Show what would be dropped without deleting")
prune_p.add_argument("-y", "--yes", action="store_true", help="Skip confirmation prompts")
prune_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)
# 6. Tag Command
tag_p = subparsers.add_parser("tag", help="Create a release tag")
tag_p.add_argument("tag_name", help="Name of tag (e.g. 'v1.0')")
tag_p.add_argument("source", help="Database to tag (can be a source db or a branch)")
tag_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)
# 7. Tags Command
subparsers.add_parser("tags", help="List all release tags")
# 8. Log Command
log_p = subparsers.add_parser("log", help="Show branch activity history")
log_p.add_argument("--limit", type=int, default=10, help="Number of history records to show")
args = parser.parse_args()
# Route commands
if args.command == "branch":
create_branch(args.user, args.source, args.branch_name, args.tag, args.as_of, args.owner, args.create_tag)
elif args.command == "drop":
drop_branch(args.branch_name, args.yes, args.user)
elif args.command == "reset":
reset_branch(args.branch_name, args.yes, args.user)
elif args.command == "list":
list_branches(args.owner)
elif args.command == "prune":
prune_branches(args.days, args.dry_run, args.yes, args.user)
elif args.command == "tag":
create_tag(args.tag_name, args.source, args.user)
elif args.command == "tags":
list_tags()
elif args.command == "log":
show_log(args.limit)
To make yb-branch act like a native system command (so you can run it from any directory without typing ./ or python3), you need to move it into a directory that is included in your system’s $PATH.
The standard location for custom administrative scripts on Linux is /usr/local/bin.
Run these commands to create an executable from the python script and move it:
cp yb-branch.py yb-branch
chmod +x yb-branch
mv yb-branch /usr/local/bin/
Verify it works
yb-branch --help
If it prints the help menu, you are fully set up!
usage: yb-branch [-h] {branch,drop,reset,list,prune,tag,tags,log} ...
yb-branch: Git-like Versioning for YugabyteDB
-------------------------------------------------
Instantly clone, version, reset, and prune YugabyteDB databases using zero-copy storage.
All operations are securely audited and natively integrate with YugabyteDB's
Instant Cloning and Point-In-Time-Recovery (PITR) features.
optional arguments:
-h, --help show this help message and exit
Available Commands:
{branch,drop,reset,list,prune,tag,tags,log}
branch Create a new database branch
drop Safely drop a database branch
reset Drop and recreate a branch from its original source
list List all active branches and their tags
prune Garbage collect old branches
tag Create a release tag
tags List all release tags
log Show branch activity history
ENVIRONMENT VARIABLES REQUIRED:
YB_HOST, YB_PORT, YB_USER, YB_PASS, YB_ADMIN_DB, YB_SEED_MASTER
SSL/TLS CONFIGURATION (Optional):
YB_SSL_MODE (Default: 'prefer'. Options: 'require', 'verify-ca', 'verify-full')
YB_SSL_ROOT_CERT (Path to root.crt)
YB_SSL_CERT (Path to client.crt)
YB_SSL_KEY (Path to client.key)
YB_CERTS_DIR (Path to the certs directory required by yb-admin)
From now on, any user on that server (with the correct environment variables set) can just type yb-branch from anywhere.
How Instant Cloning Works
YugabyteDB cloning is fast because it does not copy data upfront.
Instead:
- β The clone shares underlying data files initially
- β Changes are written as new delta files
- β Source and clone are fully isolated logically
Example:
CREATE DATABASE dev_clone TEMPLATE prod_db;
Point-in-time clone:
CREATE DATABASE recovery_clone
TEMPLATE prod_db
AS OF '2026-05-04 13:13:00';
Prerequisites
Database
Cloning must be enabled:
./bin/yugabyted start \
--advertise_address=127.0.0.1 \
--master_flags "enable_db_clone=true"
Enable PITR (required for time-travel clones):
./bin/yb-admin \
--master_addresses ip1:7100,ip2:7100,ip3:7100 \
create_snapshot_schedule 1440 10080 ysql.prod_db
CLI Tool
To run the yb-branch CLI tool, you only need one Python library and one system binary. Thatβs it.
This was intentionally designed to be lightweight and easy to deploy anywhere (laptop, server, CI/CD runner).
1. Python Version
You need Python 3.6 or higher.
The script uses modern features like f-strings, so older versions will not work.
2. PostgreSQL Driver (psycopg2)
The script uses psycopg2 to:
- β Connect to YugabyteDB (YSQL)
- β Execute
CREATE DATABASE ... TEMPLATE ... AS OF - β Manage audit tables and metadata
Install it using the precompiled binary:
pip3 install psycopg2-binary
3. System Binary (yb-admin)
The script also depends on the yb-admin CLI.
It uses it to:
- β Discover all master nodes dynamically
- β Verify clone status (ensure it reaches
COMPLETE)
yb-admin binary must be installed and available in your system $PATH.If you are running this on a YugabyteDB node, youβre already covered… itβs included by default.
Top Use Cases
| Use Case | What It Enables |
| Ephemeral CI/CD Environments | Dedicated database per Pull Request |
| Safe Schema Migrations | Test risky changes safely |
| Surgical Incident Recovery | Recover data without full restore |
1. Ephemeral CI/CD Environments
Instead of maintaining shared, constantly breaking staging databases, your CI/CD pipeline can spin up a fresh database for every pull request.
yb-branch branch prod_db pr-1234 --owner ci_runner --create-tag fix-pr-1234
Example output:
[*] Initiating instant clone from 'prod_db' to 'pr-1234'...
[+] SUCCESS! Database branch 'pr-1234' is ready. Owned by: ci_runner
List the clones:
yb-branch list
Example output:
BRANCH NAME | SOURCE DB | OWNER | CLONED AT | BASED ON | TAGS ON BRANCH
---------------------------------------------------------------------------------------------------------------------
pr-1234 | prod_db | ci_runner | 2026-05-04 17:28 | latest | fix-pr-1234
When tests complete:
yb-branch drop pr-1234 -y
Example output:
[*] Initiating drop for database 'pr-1234'...
[+] SUCCESS! Clone 'pr-1234' has been dropped.
No drift. No collisions. No cleanup headaches.
2. Safe Schema Migrations
Want to test a large or risky schema change?
yb-branch branch prod_db migration_test --owner dba_team
Example output:
[*] Initiating instant clone from 'prod_db' to 'migration_test'...
[+] SUCCESS! Database branch 'migration_test' is ready. Owned by: dba_team
Run your migration safely.
If it fails:
yb-branch reset migration_test -y
Example output:
[*] Resetting branch 'migration_test' (Source: prod_db, Owner: dba_team)
[*] Initiating drop for database 'migration_test'...
[+] SUCCESS! Clone 'migration_test' has been dropped.
[*] Recreating branch 'migration_test'...
[*] Initiating instant clone from 'prod_db' to 'migration_test'...
[+] SUCCESS! Database branch 'migration_test' is ready. Owned by: dba_team
Instant rollback. No restore needed.
3. Surgical Incident Recovery
A record was deleted at 1:12 PM.
Instead of restoring everything:
yb-branch branch prod_db recovery_db --as-of '2026-05-04 13:10:00'
Example output:
[*] Initiating instant clone from 'prod_db' to 'recovery_db'...
[*] Time travel active: Cloning AS OF 2026-05-04 13:10:00
[+] SUCCESS! Database branch 'recovery_db' is ready. Owned by: dba_team
Then:
- β Query the clone
- β Extract the data
- β Restore only whatβs needed
Tracking and Governance Tables
To avoid chaos, we track everything.
Audit Log
-- Audit Log for all branching and dropping activity
CREATE TABLE IF NOT EXISTS db_clone_audit_log (
log_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
requested_by VARCHAR(255) NOT NULL,
-- Updated to allow alphanumeric characters, underscores, and hyphens
source_db text NOT NULL CHECK (source_db ~ '^[a-zA-Z0-9_-]+$'),
target_clone_db text NOT NULL CHECK (target_clone_db ~ '^[a-zA-Z0-9_-]+$'),
point_in_time TIMESTAMP NOT NULL,
status VARCHAR(50) NOT NULL DEFAULT 'PENDING'
CHECK (status IN ('PENDING', 'CLONING', 'SUCCESS', 'FAILED', 'DROPPED')),
created_at TIMESTAMP DEFAULT NOW(),
completed_at TIMESTAMP,
dropped_at TIMESTAMP
);
Version Tags (Git-style)
-- Version Tags (Like Git Tags)
CREATE TABLE IF NOT EXISTS db_version_tags (
tag_name text PRIMARY KEY CHECK (tag_name ~ '^[a-zA-Z0-9_.-]+$'),
source_db text NOT NULL,
point_in_time TIMESTAMP NOT NULL,
created_by VARCHAR(255),
created_at TIMESTAMP DEFAULT NOW()
);
Stored Procedures (Audit Hooks)
We log before and after the clone operation.
Request a Branch
CREATE OR REPLACE PROCEDURE request_db_branch(
p_requester IN VARCHAR,
p_source_db IN VARCHAR,
p_branch_name IN VARCHAR,
p_point_in_time IN TIMESTAMP,
p_log_id INOUT UUID
)
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO db_clone_audit_log (
requested_by,
source_db,
target_clone_db,
point_in_time,
status
)
VALUES (
p_requester,
p_source_db,
p_branch_name,
COALESCE(p_point_in_time, NOW()),
'PENDING'
)
RETURNING log_id INTO p_log_id;
END;
$$;
Finalize Status
CREATE OR REPLACE PROCEDURE finalize_db_branch(
p_log_id IN UUID,
p_final_status IN VARCHAR
)
LANGUAGE plpgsql
AS $$
BEGIN
UPDATE db_clone_audit_log
SET status = p_final_status,
completed_at = NOW()
WHERE log_id = p_log_id;
END;
$$;
Enterprise-Ready: Security and CI/CD Integration
yb-branch is designed to be secure by default.
All configuration is driven by environment variables:
export YB_HOST="192.168.1.50"
export YB_PORT="5433"
export YB_USER="yugabyte"
export YB_PASS="SuperSecretPass123"
export YB_ADMIN_DB="yugabyte"
export YB_SEED_MASTER="192.168.1.50:7100"
TLS / SSL Support
If your YugabyteDB cluster requires encrypted connections, yb-branch now supports full SSL configuration via environment variables.
Simply export the required values before running the tool:
export YB_SSL_MODE="verify-full"
export YB_SSL_ROOT_CERT="/etc/yugabyte/certs/root.crt"
export YB_CERTS_DIR="/etc/yugabyte/certs"
verify-full in production to enforce full certificate validation (hostname + CA). This prevents man-in-the-middle attacks and ensures you’re connecting to the correct cluster. Why CI/CD Pipelines Love This
- β Credentials stored securely in secrets vaults
- β Injected at runtime
- β Automatically masked in logs
- β Destroyed after execution
SSL support:
- β Secure connections end-to-end
- β No plaintext credentials or traffic
- β Fully compliant with production security requirements
This means:
- β Same script works everywhere
- β Security posture is preserved
- β No friction for developers
Here is a rapid-fire, start-to-finish demo of what a standard developer workflow looks like using yb-branch.
Let’s imagine you are tasked with running a risky schema migration against production data. You want to test it in total isolation, mess up, start over, and clean up after yourself.
Here is exactly how that plays out in the terminal, complete with the Git analogies!
1. Anchor the current production state (The git tag)
Before doing anything, the Release Manager tags the current state of production.
$ yb-branch tag v2.4-stable prod_db
[+] Tag 'v2.4-stable' created successfully for 'prod_db'.
2. Create your isolated test environment (The git checkout -b)
You spin up your own dedicated database clone to test the risky migration. You decide to immediately tag its starting state just in case.
$ yb-branch branch prod_db dev_migration_test --owner my_dev_role --create-tag pre-migration-start
[*] Initiating instant clone from 'prod_db' to 'dev_migration_test'...
[+] SUCCESS! Database branch 'dev_migration_test' is ready. Owned by: my_dev_role
[*] Automatically applying tag 'pre-migration-start' to new branch 'dev_migration_test'...
[+] Tag 'pre-migration-start' created successfully for 'dev_migration_test'.
3. Verify your workspace (The git branch & git status)
Take a quick look to see exactly what you are working with.
$ yb-branch list
BRANCH NAME | SOURCE DB | OWNER | CLONED AT | BASED ON | TAGS ON BRANCH
---------------------------------------------------------------------------------------------------------------------
dev_migration_test | prod_db | my_dev_role | 2026-05-04 20:29 | latest | pre-migration-start
4. Oops… Mistakes were made (The git reset --hard)
You run your SQL script against dev_migration_test, but you accidentally dropped the wrong table! The data is ruined. Instead of running a bunch of manual restore commands, you just instantly reset the branch back to a clean slate.
$ yb-branch reset dev_migration_test -y
[*] Resetting branch 'dev_migration_test' (Source: prod_db, Owner: my_dev_role)
[*] Initiating drop for database 'dev_migration_test'...
[+] SUCCESS! Clone 'dev_migration_test' has been dropped.
[*] Recreating branch 'dev_migration_test'...
[*] Initiating instant clone from 'prod_db' to 'dev_migration_test'...
[+] SUCCESS! Database branch 'dev_migration_test' is ready. Owned by: my_dev_role
5. Review the Audit Log (The git log)
Curious what the DBA team sees? The tool automatically tracked all of your rapid-fire creation, dropping, and resetting.
[root@localhost ~]# yb-branch log --limit 4
[2026-05-04 20:34:48] root CREATED branch 'dev_migration_test' (Source: prod_db) [Status: SUCCESS]
[2026-05-04 20:34:38] root DROPPED branch 'dev_migration_test' (Source: prod_db) [Status: DROPPED]
[2026-05-04 20:29:20] root CREATED branch 'dev_migration_test' (Source: prod_db) [Status: SUCCESS]
6. Clean up your mess (The git branch -D)
Your test was finally successful. You merge your code, and you throw the database away so it stops consuming cluster resources.
$ yb-branch drop dev_migration_test -y
[*] Initiating drop for database 'dev_migration_test'...
[*] Blocking new connections to 'dev_migration_test'...
[*] Terminating active connections to 'dev_migration_test'...
[+] SUCCESS! Clone 'dev_migration_test' has been dropped.
And just like that, you managed a massive database environment with the exact same speed and workflow you use for source code!
β οΈ Guardrails (Do Not Skip This)
Cloning is fast… but not free.
Each clone:
- β Creates new logical tablets
- β Increases CPU + memory usage
- β Can increase disk usage over time (compaction behavior)
- β Counts against cluster limits
Best Practices
- β Require an owner for every clone
- β Use naming conventions (pr_, dev_, recovery_)
- β Track all clones in audit tables
- β Prune aggressively
- β Limit clone lifespan
- β Monitor tablet counts and resource usage
- β Include clone load in cluster sizing
Conclusion
Instant cloning fundamentally changes how teams work with data.
Instead of slow, heavy database copies, you get:
- β Fast, isolated environments
- β Safe testing against real data
- β Surgical recovery workflows
- β Cleaner CI/CD pipelines
The yb-branch pattern shows how little glue code is needed to unlock a powerful developer experience.
Once you try this workflow, it is hard to go back.
Have Fun!
