Building a Git-like Database Versioning Tool with Instant Cloning

If you are a software engineer, you probably use Git every day.

Need to test something risky?
Create a branch.

Need to preserve a known-good state?
Create a tag.

Need to throw away a bad idea?
Delete the branch and move on.

But databases have traditionally not worked that way.

Testing schema changes, debugging production data, or running QA usually means copying large datasets, waiting, and burning resources.

With YugabyteDB Instant Cloning, that changes.

You can now treat your database more like code: branch it, test it, reset it, and drop it… all in seconds.

πŸ’‘ Key Insight

YugabyteDB Instant Cloning enables zero-copy, writable database branchesβ€”making it possible to implement Git-like workflows directly at the database layer. Instead of restoring full backups (slow, disruptive, and often losing recent data), you can clone your database at a precise moment in time, inspect or extract exactly what you need, and recover without impacting production.

πŸ‘‰ Learn more about Instant Cloning

The Idea: yb-branch

This tip walks through a simple tool concept: yb-branch.

The goal is to make database workflows feel like Git:

  • ● branch
  • ● tag
  • ● reset
  • ● drop
  • ● list
  • ● prune
  • ● audit everything

Example:

				
					yb-branch branch prod_db dev_feature_z --owner my_dev_role --create-tag z-initial-state
				
			

Or time-travel:

				
					yb-branch branch prod_db recovery_db \
  --owner dba_admin \
  --as-of "$(date -d '5 minutes ago' +'%Y-%m-%d %H:%M:%S')"
				
			

The full CLI implementation supports all major lifecycle operations and integrates audit logging and tagging.

yb-branch.py:
				
					#!/usr/bin/env python3
import argparse
import psycopg2
from psycopg2 import sql
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
import subprocess
import json
import os
import sys

# ==========================================
# CONFIGURATION VIA ENVIRONMENT VARIABLES
# ==========================================
# Core Database Credentials
DB_HOST = os.getenv("YB_HOST", "127.0.0.1")
DB_PORT = os.getenv("YB_PORT", "5433")
DB_USER = os.getenv("YB_USER", "yugabyte")
DB_PASS = os.getenv("YB_PASS", "yugabyte")
ADMIN_DB = os.getenv("YB_ADMIN_DB", "yugabyte")
SEED_MASTER = os.getenv("YB_SEED_MASTER", "127.0.0.1:7100")

# SSL / TLS Configuration
SSL_MODE = os.getenv("YB_SSL_MODE", "prefer")  # e.g., disable, require, verify-ca, verify-full
SSL_ROOT_CERT = os.getenv("YB_SSL_ROOT_CERT")  # Path to root.crt
SSL_CERT = os.getenv("YB_SSL_CERT")            # Path to client certificate
SSL_KEY = os.getenv("YB_SSL_KEY")              # Path to client key
CERTS_DIR = os.getenv("YB_CERTS_DIR")          # Used by yb-admin for TLS-enabled clusters

# ==========================================
# UTILITY FUNCTIONS
# ==========================================

def get_db_connection(dbname=ADMIN_DB, autocommit=False):
    """Establishes a secure database connection with optional SSL."""
    try:
        # Build connection arguments dynamically to only include provided SSL paths
        conn_args = {
            "host": DB_HOST,
            "port": DB_PORT,
            "user": DB_USER,
            "password": DB_PASS,
            "dbname": dbname,
            "sslmode": SSL_MODE
        }
        if SSL_ROOT_CERT: conn_args["sslrootcert"] = SSL_ROOT_CERT
        if SSL_CERT: conn_args["sslcert"] = SSL_CERT
        if SSL_KEY: conn_args["sslkey"] = SSL_KEY

        conn = psycopg2.connect(**conn_args)

        if autocommit:
            conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
        return conn
    except Exception as e:
        print(f"[-] Database connection failed: {e}")
        sys.exit(1)

def get_all_masters(seed_master):
    """Dynamically fetches all master addresses for high availability."""
    cmd = ["yb-admin", "--master_addresses", seed_master]
    if CERTS_DIR:
        cmd.extend(["--certs_dir_name", CERTS_DIR])
    cmd.append("list_all_masters")

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        lines = result.stdout.strip().split('\n')
        master_addresses = []
        for line in lines[1:]:
            if line.strip():
                parts = line.split()
                if len(parts) >= 2:
                    master_addresses.append(parts[1])
        if master_addresses:
            return ",".join(master_addresses)
        return seed_master
    except Exception:
        return seed_master

def get_db_oid(cursor, db_name):
    """Fetches the PostgreSQL OID for a given database."""
    cursor.execute("SELECT oid FROM pg_database WHERE datname = %s", (db_name,))
    row = cursor.fetchone()
    return row[0] if row else None

def check_clone_status_via_ybadmin(dynamic_masters, source_db_oid):
    """Calls yb-admin list_clones to verify the cluster-level status."""
    namespace_id = f"00004000000030008000{source_db_oid:012x}"

    cmd = ["yb-admin", "--master_addresses", dynamic_masters]
    if CERTS_DIR:
        cmd.extend(["--certs_dir_name", CERTS_DIR])
    cmd.extend(["list_clones", namespace_id])

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return json.loads(result.stdout)
    except Exception:
        return []

# ==========================================
# CORE WORKFLOWS
# ==========================================

def create_branch(user, source_db, branch_db, tag=None, as_of=None, owner_role=None, new_tag=None):
    """Creates a database clone supporting tags or raw timestamps, and optionally tags the new clone."""
    if tag and as_of:
        print("[-] Error: Please specify either --tag OR --as-of, not both.")
        return False

    dynamic_masters = get_all_masters(SEED_MASTER)
    conn = get_db_connection()
    cur = conn.cursor()

    target_time_str = None

    if tag:
        cur.execute("SELECT point_in_time FROM db_version_tags WHERE tag_name = %s", (tag,))
        row = cur.fetchone()
        if not row:
            print(f"[-] Error: Tag '{tag}' not found in db_version_tags.")
            return False
        target_time_str = row[0].strftime('%Y-%m-%d %H:%M:%S.%f')
    elif as_of:
        target_time_str = as_of

    cur.execute("CALL request_db_branch(%s, %s, %s, %s, NULL);",
                (user, source_db, branch_db, target_time_str))
    log_id = cur.fetchone()[0]
    conn.commit()

    source_oid = get_db_oid(cur, source_db)
    cur.close()
    conn.close()

    if not source_oid:
        print(f"[-] Error: Source database '{source_db}' does not exist.")
        return False

    ddl_conn = get_db_connection(autocommit=True)
    ddl_cur = ddl_conn.cursor()

    try:
        ddl_cur.execute("CALL finalize_db_branch(%s, 'CLONING');", (log_id,))
        print(f"[*] Initiating instant clone from '{source_db}' to '{branch_db}'...")

        if target_time_str:
            print(f"[*] Time travel active: Cloning AS OF {target_time_str}")
            query = sql.SQL("CREATE DATABASE {} OWNER {} TEMPLATE {} AS OF {};").format(
                sql.Identifier(branch_db), sql.Identifier(owner_role), sql.Identifier(source_db),
                sql.Literal(target_time_str)
            )
        else:
            query = sql.SQL("CREATE DATABASE {} OWNER {} TEMPLATE {};").format(
                sql.Identifier(branch_db), sql.Identifier(owner_role), sql.Identifier(source_db)
            )

        ddl_cur.execute(query)

        clones = check_clone_status_via_ybadmin(dynamic_masters, source_oid)
        our_clone = next((c for c in clones if c.get('target_namespace_name') == branch_db), None)

        if our_clone and our_clone.get('aggregate_state') != 'COMPLETE':
            print("[!] Warning: YB-Master reports clone is not yet COMPLETE.")

        ddl_cur.execute("CALL finalize_db_branch(%s, 'SUCCESS');", (log_id,))
        print(f"[+] SUCCESS! Database branch '{branch_db}' is ready. Owned by: {owner_role}")

        if new_tag:
            print(f"[*] Automatically applying tag '{new_tag}' to new branch '{branch_db}'...")
            create_tag(new_tag, branch_db, user)

        return True

    except Exception as e:
        print(f"[-] Clone failed: {e}")
        ddl_cur.execute("CALL finalize_db_branch(%s, 'FAILED');", (log_id,))
        return False
    finally:
        ddl_cur.close()
        ddl_conn.close()

def drop_branch(branch_name, assume_yes=False, user="system"):
    """Safely drops a clone database and updates the audit log."""
    conn = get_db_connection()
    cur = conn.cursor()

    cur.execute("""
        SELECT log_id, source_db
        FROM db_clone_audit_log
        WHERE target_clone_db = %s AND status = 'SUCCESS'
        ORDER BY created_at DESC LIMIT 1;
    """, (branch_name,))
    row = cur.fetchone()

    if not row:
        print(f"[-] ERROR: '{branch_name}' is not recognized as an active clone.")
        cur.close()
        conn.close()
        return False

    log_id, source_db = row
    cur.close()
    conn.close()

    if not assume_yes:
        print(f"\n[!] WARNING: You are about to permanently drop '{branch_name}'.")
        choice = input("    Are you sure you want to proceed? [y/N]: ").strip().lower()
        if choice != 'y':
            print("[*] Drop operation cancelled.")
            return False

    print(f"[*] Initiating drop for database '{branch_name}'...")
    ddl_conn = get_db_connection(autocommit=True)
    ddl_cur = ddl_conn.cursor()

    try:
        # 1. Lock the door: Prevent any NEW connections from being established
        print(f"[*] Blocking new connections to '{branch_name}'...")
        ddl_cur.execute(sql.SQL("ALTER DATABASE {} ALLOW_CONNECTIONS false;").format(sql.Identifier(branch_name)))

        # 2. Kick everyone out: Terminate existing connections
        print(f"[*] Terminating active connections to '{branch_name}'...")
        ddl_cur.execute("""
            SELECT pg_terminate_backend(pid)
            FROM pg_stat_activity
            WHERE datname = %s AND pid <> pg_backend_pid();
        """, (branch_name,))

        # 3. Drop the database
        ddl_cur.execute(sql.SQL("DROP DATABASE {};").format(sql.Identifier(branch_name)))

        # 4. Update the audit log
        log_conn = get_db_connection()
        log_cur = log_conn.cursor()
        log_cur.execute("UPDATE db_clone_audit_log SET status = 'DROPPED', dropped_at = NOW() WHERE log_id = %s;", (log_id,))
        log_conn.commit()
        log_conn.close()

        print(f"[+] SUCCESS! Clone '{branch_name}' has been dropped.")
        return True

    except psycopg2.errors.ObjectInUse:
        print(f"[-] ERROR: Cannot drop '{branch_name}'. Clients are currently connected.")
        # If it fails, unlock the door so the database isn't left in a permanently locked state
        try:
            ddl_cur.execute(sql.SQL("ALTER DATABASE {} ALLOW_CONNECTIONS true;").format(sql.Identifier(branch_name)))
        except:
            pass
        return False

    except Exception as e:
        print(f"[-] Failed to drop database: {e}")
        # Attempt to unlock just in case it failed for another reason
        try:
            ddl_cur.execute(sql.SQL("ALTER DATABASE {} ALLOW_CONNECTIONS true;").format(sql.Identifier(branch_name)))
        except:
            pass
        return False

    finally:
        ddl_conn.close()

def reset_branch(branch_name, assume_yes, user):
    """Drops an active branch and instantly recreates it from its original source."""
    conn = get_db_connection()
    cur = conn.cursor()

    cur.execute("""
        SELECT a.source_db, r.rolname as owner_role
        FROM db_clone_audit_log a
        INNER JOIN pg_database d ON a.target_clone_db = d.datname
        INNER JOIN pg_roles r ON d.datdba = r.oid
        WHERE a.target_clone_db = %s AND a.status = 'SUCCESS'
        ORDER BY a.created_at DESC LIMIT 1;
    """, (branch_name,))
    row = cur.fetchone()
    cur.close()
    conn.close()

    if not row:
        print(f"[-] ERROR: Could not find active branch '{branch_name}' to reset.")
        return

    source_db, owner_role = row
    print(f"[*] Resetting branch '{branch_name}' (Source: {source_db}, Owner: {owner_role})")

    if drop_branch(branch_name, assume_yes, user):
        print(f"[*] Recreating branch '{branch_name}'...")
        create_branch(user, source_db, branch_name, None, None, owner_role, None)

def list_branches(owner_filter=None):
    """Lists all active database branches, when they were cloned, what they are based on, and their tags."""
    conn = get_db_connection()
    cur = conn.cursor()

    query = """
        SELECT
            a.target_clone_db,
            a.source_db,
            r.rolname,
            TO_CHAR(a.completed_at, 'YYYY-MM-DD HH24:MI'),
            COALESCE(
                (SELECT tag_name
                 FROM db_version_tags
                 WHERE source_db = a.source_db AND point_in_time = a.point_in_time
                 LIMIT 1),
                CASE
                    WHEN a.point_in_time < a.created_at - INTERVAL '2 seconds'
                    THEN 'AS OF ' || TO_CHAR(a.point_in_time, 'YYYY-MM-DD HH24:MI')
                    ELSE 'latest'
                END
            ) as based_on_tag,
            COALESCE((
                SELECT string_agg(tag_name, ', ')
                FROM db_version_tags
                WHERE source_db = a.target_clone_db
            ), '') as branch_tags
        FROM db_clone_audit_log a
        INNER JOIN pg_database d ON a.target_clone_db = d.datname
        INNER JOIN pg_roles r ON d.datdba = r.oid
        WHERE a.status = 'SUCCESS'
    """

    params = []
    if owner_filter:
        query += " AND r.rolname = %s"
        params.append(owner_filter)
    query += " ORDER BY a.completed_at DESC;"

    try:
        cur.execute(query, params)
        rows = cur.fetchall()

        if not rows:
            print("[*] No active branches found.")
            return

        print(f"{'BRANCH NAME':<20} | {'SOURCE DB':<15} | {'OWNER':<15} | {'CLONED AT':<16} | {'BASED ON':<22} | {'TAGS ON BRANCH'}")
        print("-" * 117)
        for r in rows:
            print(f"{r[0]:<20} | {r[1]:<15} | {r[2]:<15} | {r[3]:<16} | {r[4]:<22} | {r[5]}")

    except Exception as e:
        print(f"[-] Failed to fetch branches: {e}")
    finally:
        cur.close()
        conn.close()

def prune_branches(days, dry_run, assume_yes, user):
    """Garbage collects branches older than X days."""
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute("""
        SELECT a.target_clone_db, TO_CHAR(a.completed_at, 'YYYY-MM-DD')
        FROM db_clone_audit_log a
        INNER JOIN pg_database d ON a.target_clone_db = d.datname
        WHERE a.status = 'SUCCESS' AND a.completed_at < NOW() - %s::interval;
    """, (f"{days} days",))
    rows = cur.fetchall()
    cur.close()
    conn.close()

    if not rows:
        print(f"[*] No active branches older than {days} days found.")
        return

    print(f"[*] Found {len(rows)} branch(es) older than {days} days.")
    for branch, date in rows:
        if dry_run:
            print(f"    [DRY RUN] Would drop: {branch} (Created: {date})")
        else:
            drop_branch(branch, assume_yes, user)

def create_tag(tag_name, source_db, user):
    """Creates a snapshot tag."""
    conn = get_db_connection()
    cur = conn.cursor()
    try:
        cur.execute("INSERT INTO db_version_tags (tag_name, source_db, point_in_time, created_by) VALUES (%s, %s, NOW(), %s)",
                   (tag_name, source_db, user))
        conn.commit()
        print(f"[+] Tag '{tag_name}' created successfully for '{source_db}'.")
    except psycopg2.IntegrityError:
        print(f"[-] Failed: Tag '{tag_name}' already exists.")
    except Exception as e:
        print(f"[-] Failed to create tag: {e}")
    finally:
        conn.close()

def list_tags():
    """Lists all release tags."""
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute("SELECT tag_name, source_db, TO_CHAR(point_in_time, 'YYYY-MM-DD HH24:MI:SS'), created_by FROM db_version_tags ORDER BY point_in_time DESC;")
    rows = cur.fetchall()
    if not rows:
        print("[*] No tags found.")
        return
    print(f"{'TAG NAME':<25} | {'DATABASE':<20} | {'CUT AT':<20} | {'CREATOR'}")
    print("-" * 85)
    for r in rows:
        print(f"{r[0]:<25} | {r[1]:<20} | {r[2]:<20} | {r[3]}")

def show_log(limit):
    """Displays the branching history by reconstructing events from timestamps."""
    conn = get_db_connection()
    cur = conn.cursor()

    # We fetch both the creation time and the dropped time
    cur.execute("""
        SELECT TO_CHAR(COALESCE(completed_at, created_at), 'YYYY-MM-DD HH24:MI:SS'),
               TO_CHAR(dropped_at, 'YYYY-MM-DD HH24:MI:SS'),
               requested_by, status, target_clone_db, source_db
        FROM db_clone_audit_log
        ORDER BY created_at DESC;
    """)
    rows = cur.fetchall()

    events = []
    for r in rows:
        created_str, dropped_str, req, stat, branch, source = r

        # 1. Everyone gets a creation event
        create_stat = "SUCCESS" if stat == "DROPPED" else stat
        events.append({
            'time_str': created_str,
            'msg': f"[{created_str}] {req} CREATED branch '{branch}' (Source: {source}) [Status: {create_stat}]"
        })

        # 2. If it was dropped, it gets a SECOND event in the timeline
        if dropped_str:
            events.append({
                'time_str': dropped_str,
                'msg': f"[{dropped_str}] {req} DROPPED branch '{branch}' (Source: {source}) [Status: DROPPED]"
            })

    # Sort all events chronologically (newest first) and apply the limit
    events.sort(key=lambda x: x['time_str'], reverse=True)

    for e in events[:limit]:
        print(e['msg'])

    cur.close()
    conn.close()

# ==========================================
# CLI ENTRY POINT WITH EXTENSIVE HELP
# ==========================================

if __name__ == "__main__":
    # --- SYS.ARGV INTERCEPTOR ---
    # This automatically swaps "--help branch" into "branch --help"
    # so argparse correctly triggers the subcommand help menu.
    if len(sys.argv) == 3 and sys.argv[1] in ['-h', '--help', 'help']:
        sys.argv[1], sys.argv[2] = sys.argv[2], '--help'
    # ----------------------------

    main_desc = """
    yb-branch: Git-like Versioning for YugabyteDB
    -------------------------------------------------
    Instantly clone, version, reset, and prune YugabyteDB databases using zero-copy storage.

    All operations are securely audited and natively integrate with YugabyteDB's
    Instant Cloning and Point-In-Time-Recovery (PITR) features.
    """

    main_epilog = """
    ENVIRONMENT VARIABLES REQUIRED:
      YB_HOST, YB_PORT, YB_USER, YB_PASS, YB_ADMIN_DB, YB_SEED_MASTER

    SSL/TLS CONFIGURATION (Optional):
      YB_SSL_MODE        (Default: 'prefer'. Options: 'require', 'verify-ca', 'verify-full')
      YB_SSL_ROOT_CERT   (Path to root.crt)
      YB_SSL_CERT        (Path to client.crt)
      YB_SSL_KEY         (Path to client.key)
      YB_CERTS_DIR       (Path to the certs directory required by yb-admin)
    """

    parser = argparse.ArgumentParser(
        description=main_desc,
        epilog=main_epilog,
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    subparsers = parser.add_subparsers(dest="command", required=True, title="Available Commands")

    # 1. Branch Command
    branch_help = """
    Create a new database branch (clone).
    Examples:
      yb-branch branch prod_db dev_db --owner user
      yb-branch branch prod_db qa_db --owner user --tag v1.0
      yb-branch branch prod_db inv_db --owner user --as-of '2026-05-04 13:00:00'
    """
    branch_p = subparsers.add_parser("branch", help="Create a new database branch", description=branch_help, formatter_class=argparse.RawDescriptionHelpFormatter)
    branch_p.add_argument("source", help="Source database to clone (e.g., 'prod_db')")
    branch_p.add_argument("branch_name", help="Name of the new branch")
    branch_p.add_argument("--owner", required=True, help="REQUIRED. Postgres Role that will own the branch")
    branch_p.add_argument("--tag", help="Optional. Specific tag to clone from (time travel)")
    branch_p.add_argument("--as-of", help="Optional. Specific timestamp to clone from (e.g., '2026-05-04 13:13:00' or Unix microseconds)")
    branch_p.add_argument("--create-tag", help="Optional. Immediately tag the newly created branch with this name")
    branch_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)

    # 2. Drop Command
    drop_p = subparsers.add_parser("drop", help="Safely drop a database branch")
    drop_p.add_argument("branch_name", help="Branch to drop")
    drop_p.add_argument("-y", "--yes", action="store_true", help="Skip confirmation prompt")
    drop_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)

    # 3. Reset Command
    reset_p = subparsers.add_parser("reset", help="Drop and recreate a branch from its original source")
    reset_p.add_argument("branch_name", help="Branch to reset")
    reset_p.add_argument("-y", "--yes", action="store_true", help="Skip confirmation prompt")
    reset_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)

    # 4. List Command
    list_p = subparsers.add_parser("list", help="List all active branches and their tags")
    list_p.add_argument("--owner", help="Filter by specific Postgres role")

    # 5. Prune Command
    prune_p = subparsers.add_parser("prune", help="Garbage collect old branches")
    prune_p.add_argument("--days", type=int, default=7, help="Drop branches older than X days (default: 7)")
    prune_p.add_argument("--dry-run", action="store_true", help="Show what would be dropped without deleting")
    prune_p.add_argument("-y", "--yes", action="store_true", help="Skip confirmation prompts")
    prune_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)

    # 6. Tag Command
    tag_p = subparsers.add_parser("tag", help="Create a release tag")
    tag_p.add_argument("tag_name", help="Name of tag (e.g. 'v1.0')")
    tag_p.add_argument("source", help="Database to tag (can be a source db or a branch)")
    tag_p.add_argument("--user", default=os.getenv('USER', 'system'), help=argparse.SUPPRESS)

    # 7. Tags Command
    subparsers.add_parser("tags", help="List all release tags")

    # 8. Log Command
    log_p = subparsers.add_parser("log", help="Show branch activity history")
    log_p.add_argument("--limit", type=int, default=10, help="Number of history records to show")

    args = parser.parse_args()

    # Route commands
    if args.command == "branch":
        create_branch(args.user, args.source, args.branch_name, args.tag, args.as_of, args.owner, args.create_tag)
    elif args.command == "drop":
        drop_branch(args.branch_name, args.yes, args.user)
    elif args.command == "reset":
        reset_branch(args.branch_name, args.yes, args.user)
    elif args.command == "list":
        list_branches(args.owner)
    elif args.command == "prune":
        prune_branches(args.days, args.dry_run, args.yes, args.user)
    elif args.command == "tag":
        create_tag(args.tag_name, args.source, args.user)
    elif args.command == "tags":
        list_tags()
    elif args.command == "log":
        show_log(args.limit)
				
			

To make yb-branch act like a native system command (so you can run it from any directory without typing ./ or python3), you need to move it into a directory that is included in your system’s $PATH.

The standard location for custom administrative scripts on Linux is /usr/local/bin.

Run these commands to create an executable from the python script and move it:

				
					cp yb-branch.py yb-branch
chmod +x yb-branch
mv yb-branch /usr/local/bin/
				
			

Verify it works

				
					yb-branch --help
				
			

If it prints the help menu, you are fully set up!

				
					usage: yb-branch [-h] {branch,drop,reset,list,prune,tag,tags,log} ...

    yb-branch: Git-like Versioning for YugabyteDB
    -------------------------------------------------
    Instantly clone, version, reset, and prune YugabyteDB databases using zero-copy storage.

    All operations are securely audited and natively integrate with YugabyteDB's
    Instant Cloning and Point-In-Time-Recovery (PITR) features.


optional arguments:
  -h, --help            show this help message and exit

Available Commands:
  {branch,drop,reset,list,prune,tag,tags,log}
    branch              Create a new database branch
    drop                Safely drop a database branch
    reset               Drop and recreate a branch from its original source
    list                List all active branches and their tags
    prune               Garbage collect old branches
    tag                 Create a release tag
    tags                List all release tags
    log                 Show branch activity history

    ENVIRONMENT VARIABLES REQUIRED:
      YB_HOST, YB_PORT, YB_USER, YB_PASS, YB_ADMIN_DB, YB_SEED_MASTER

    SSL/TLS CONFIGURATION (Optional):
      YB_SSL_MODE        (Default: 'prefer'. Options: 'require', 'verify-ca', 'verify-full')
      YB_SSL_ROOT_CERT   (Path to root.crt)
      YB_SSL_CERT        (Path to client.crt)
      YB_SSL_KEY         (Path to client.key)
      YB_CERTS_DIR       (Path to the certs directory required by yb-admin)
				
			

From now on, any user on that server (with the correct environment variables set) can just type yb-branch from anywhere.

How Instant Cloning Works

YugabyteDB cloning is fast because it does not copy data upfront.

Instead:

  • ● The clone shares underlying data files initially
  • ● Changes are written as new delta files
  • ● Source and clone are fully isolated logically

Example:

				
					CREATE DATABASE dev_clone TEMPLATE prod_db;
				
			

Point-in-time clone:

				
					CREATE DATABASE recovery_clone
TEMPLATE prod_db
AS OF '2026-05-04 13:13:00';
				
			

Prerequisites

Database

Cloning must be enabled:

				
					./bin/yugabyted start \
  --advertise_address=127.0.0.1 \
  --master_flags "enable_db_clone=true"
				
			

Enable PITR (required for time-travel clones):

				
					./bin/yb-admin \
  --master_addresses ip1:7100,ip2:7100,ip3:7100 \
  create_snapshot_schedule 1440 10080 ysql.prod_db
				
			

CLI Tool

To run the yb-branch CLI tool, you only need one Python library and one system binary. That’s it.

This was intentionally designed to be lightweight and easy to deploy anywhere (laptop, server, CI/CD runner).

1. Python Version

You need Python 3.6 or higher.

The script uses modern features like f-strings, so older versions will not work.

2. PostgreSQL Driver (psycopg2)

The script uses psycopg2 to:

  • ● Connect to YugabyteDB (YSQL)
  • ● Execute CREATE DATABASE ... TEMPLATE ... AS OF
  • ● Manage audit tables and metadata

Install it using the precompiled binary:

				
					pip3 install psycopg2-binary
				
			
3. System Binary (yb-admin)

The script also depends on the yb-admin CLI.

It uses it to:

  • ● Discover all master nodes dynamically
  • ● Verify clone status (ensure it reaches COMPLETE)
πŸ’‘ Requirement
The yb-admin binary must be installed and available in your system $PATH.

If you are running this on a YugabyteDB node, you’re already covered… it’s included by default.

Top Use Cases

Use Case What It Enables
Ephemeral CI/CD Environments Dedicated database per Pull Request
Safe Schema Migrations Test risky changes safely
Surgical Incident Recovery Recover data without full restore
1. Ephemeral CI/CD Environments

Instead of maintaining shared, constantly breaking staging databases, your CI/CD pipeline can spin up a fresh database for every pull request.

				
					yb-branch branch prod_db pr-1234 --owner ci_runner --create-tag fix-pr-1234
				
			

Example output:

				
					[*] Initiating instant clone from 'prod_db' to 'pr-1234'...
[+] SUCCESS! Database branch 'pr-1234' is ready. Owned by: ci_runner
				
			

List the clones:

				
					yb-branch list
				
			

Example output:

				
					BRANCH NAME          | SOURCE DB       | OWNER           | CLONED AT        | BASED ON               | TAGS ON BRANCH
---------------------------------------------------------------------------------------------------------------------
pr-1234              | prod_db         | ci_runner       | 2026-05-04 17:28 | latest                 | fix-pr-1234
				
			

When tests complete:

				
					yb-branch drop pr-1234 -y
				
			

Example output:

				
					[*] Initiating drop for database 'pr-1234'...
[+] SUCCESS! Clone 'pr-1234' has been dropped.
				
			

No drift. No collisions. No cleanup headaches.

2. Safe Schema Migrations

Want to test a large or risky schema change?

				
					yb-branch branch prod_db migration_test --owner dba_team
				
			

Example output:

				
					[*] Initiating instant clone from 'prod_db' to 'migration_test'...
[+] SUCCESS! Database branch 'migration_test' is ready. Owned by: dba_team
				
			

Run your migration safely.

If it fails:

				
					yb-branch reset migration_test -y
				
			

Example output:

				
					[*] Resetting branch 'migration_test' (Source: prod_db, Owner: dba_team)
[*] Initiating drop for database 'migration_test'...
[+] SUCCESS! Clone 'migration_test' has been dropped.
[*] Recreating branch 'migration_test'...
[*] Initiating instant clone from 'prod_db' to 'migration_test'...
[+] SUCCESS! Database branch 'migration_test' is ready. Owned by: dba_team
				
			

Instant rollback. No restore needed.

3. Surgical Incident Recovery

A record was deleted at 1:12 PM.

Instead of restoring everything:

				
					yb-branch branch prod_db recovery_db --as-of '2026-05-04 13:10:00'
				
			

Example output:

				
					[*] Initiating instant clone from 'prod_db' to 'recovery_db'...
[*] Time travel active: Cloning AS OF 2026-05-04 13:10:00
[+] SUCCESS! Database branch 'recovery_db' is ready. Owned by: dba_team
				
			

Then:

  • ● Query the clone
  • ● Extract the data
  • ● Restore only what’s needed

Tracking and Governance Tables

To avoid chaos, we track everything.

Audit Log
				
					-- Audit Log for all branching and dropping activity
CREATE TABLE IF NOT EXISTS db_clone_audit_log (
    log_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    requested_by VARCHAR(255) NOT NULL,
    -- Updated to allow alphanumeric characters, underscores, and hyphens
    source_db text NOT NULL CHECK (source_db ~ '^[a-zA-Z0-9_-]+$'),
    target_clone_db text NOT NULL CHECK (target_clone_db ~ '^[a-zA-Z0-9_-]+$'),
    point_in_time TIMESTAMP NOT NULL,
    status VARCHAR(50) NOT NULL DEFAULT 'PENDING' 
        CHECK (status IN ('PENDING', 'CLONING', 'SUCCESS', 'FAILED', 'DROPPED')),
    created_at TIMESTAMP DEFAULT NOW(),
    completed_at TIMESTAMP,
    dropped_at TIMESTAMP
);
				
			
Version Tags (Git-style)
				
					-- Version Tags (Like Git Tags)
CREATE TABLE IF NOT EXISTS db_version_tags (
    tag_name text PRIMARY KEY CHECK (tag_name ~ '^[a-zA-Z0-9_.-]+$'),
    source_db text NOT NULL,
    point_in_time TIMESTAMP NOT NULL,
    created_by VARCHAR(255),
    created_at TIMESTAMP DEFAULT NOW()
);
				
			
πŸ’‘ Why This Matters
Clones are easy to create. Without tracking, they become impossible to manage. Audit tables give you visibility, accountability, and control.

Stored Procedures (Audit Hooks)

We log before and after the clone operation.

Request a Branch
				
					CREATE OR REPLACE PROCEDURE request_db_branch(
    p_requester IN VARCHAR,
    p_source_db IN VARCHAR,
    p_branch_name IN VARCHAR,
    p_point_in_time IN TIMESTAMP,
    p_log_id INOUT UUID
)
LANGUAGE plpgsql
AS $$
BEGIN
    INSERT INTO db_clone_audit_log (
        requested_by,
        source_db,
        target_clone_db,
        point_in_time,
        status
    )
    VALUES (
        p_requester,
        p_source_db,
        p_branch_name,
        COALESCE(p_point_in_time, NOW()),
        'PENDING'
    )
    RETURNING log_id INTO p_log_id;
END;
$$;
				
			
Finalize Status
				
					CREATE OR REPLACE PROCEDURE finalize_db_branch(
    p_log_id IN UUID,
    p_final_status IN VARCHAR
)
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE db_clone_audit_log
    SET status = p_final_status,
        completed_at = NOW()
    WHERE log_id = p_log_id;
END;
$$;
				
			

Enterprise-Ready: Security and CI/CD Integration

yb-branch is designed to be secure by default.

All configuration is driven by environment variables:

				
					export YB_HOST="192.168.1.50"
export YB_PORT="5433"
export YB_USER="yugabyte"
export YB_PASS="SuperSecretPass123"
export YB_ADMIN_DB="yugabyte"
export YB_SEED_MASTER="192.168.1.50:7100"
				
			
πŸ’‘ Why This Matters
No credentials in code. No secrets in Git. The same script works across Dev, Staging, and Prod by simply swapping environment variables.
TLS / SSL Support

If your YugabyteDB cluster requires encrypted connections, yb-branch now supports full SSL configuration via environment variables.

Simply export the required values before running the tool:

				
					export YB_SSL_MODE="verify-full"
export YB_SSL_ROOT_CERT="/etc/yugabyte/certs/root.crt"
export YB_CERTS_DIR="/etc/yugabyte/certs"
				
			
πŸ’‘ Pro Tip
Use verify-full in production to enforce full certificate validation (hostname + CA). This prevents man-in-the-middle attacks and ensures you’re connecting to the correct cluster.
Why CI/CD Pipelines Love This
  • ● Credentials stored securely in secrets vaults
  • ● Injected at runtime
  • ● Automatically masked in logs
  • ● Destroyed after execution

SSL support:

  • ● Secure connections end-to-end
  • ● No plaintext credentials or traffic
  • ● Fully compliant with production security requirements

This means:

  • ● Same script works everywhere
  • ● Security posture is preserved
  • ● No friction for developers

Here is a rapid-fire, start-to-finish demo of what a standard developer workflow looks like using yb-branch.

Let’s imagine you are tasked with running a risky schema migration against production data. You want to test it in total isolation, mess up, start over, and clean up after yourself.

Here is exactly how that plays out in the terminal, complete with the Git analogies!

1. Anchor the current production state (The git tag)

Before doing anything, the Release Manager tags the current state of production.

				
					$ yb-branch tag v2.4-stable prod_db
[+] Tag 'v2.4-stable' created successfully for 'prod_db'.
				
			

2. Create your isolated test environment (The git checkout -b)

You spin up your own dedicated database clone to test the risky migration. You decide to immediately tag its starting state just in case.

				
					$ yb-branch branch prod_db dev_migration_test --owner my_dev_role --create-tag pre-migration-start
[*] Initiating instant clone from 'prod_db' to 'dev_migration_test'...
[+] SUCCESS! Database branch 'dev_migration_test' is ready. Owned by: my_dev_role
[*] Automatically applying tag 'pre-migration-start' to new branch 'dev_migration_test'...
[+] Tag 'pre-migration-start' created successfully for 'dev_migration_test'.
				
			

3. Verify your workspace (The git branch & git status)

Take a quick look to see exactly what you are working with.

				
					$ yb-branch list
BRANCH NAME          | SOURCE DB       | OWNER           | CLONED AT        | BASED ON               | TAGS ON BRANCH
---------------------------------------------------------------------------------------------------------------------
dev_migration_test   | prod_db         | my_dev_role     | 2026-05-04 20:29 | latest                 | pre-migration-start
				
			

4. Oops… Mistakes were made (The git reset --hard)

You run your SQL script against dev_migration_test, but you accidentally dropped the wrong table! The data is ruined. Instead of running a bunch of manual restore commands, you just instantly reset the branch back to a clean slate.

				
					$ yb-branch reset dev_migration_test -y
[*] Resetting branch 'dev_migration_test' (Source: prod_db, Owner: my_dev_role)
[*] Initiating drop for database 'dev_migration_test'...
[+] SUCCESS! Clone 'dev_migration_test' has been dropped.
[*] Recreating branch 'dev_migration_test'...
[*] Initiating instant clone from 'prod_db' to 'dev_migration_test'...
[+] SUCCESS! Database branch 'dev_migration_test' is ready. Owned by: my_dev_role
				
			

5. Review the Audit Log (The git log)

Curious what the DBA team sees? The tool automatically tracked all of your rapid-fire creation, dropping, and resetting.

				
					[root@localhost ~]# yb-branch log --limit 4
[2026-05-04 20:34:48] root CREATED branch 'dev_migration_test' (Source: prod_db) [Status: SUCCESS]
[2026-05-04 20:34:38] root DROPPED branch 'dev_migration_test' (Source: prod_db) [Status: DROPPED]
[2026-05-04 20:29:20] root CREATED branch 'dev_migration_test' (Source: prod_db) [Status: SUCCESS]
				
			

6. Clean up your mess (The git branch -D)

Your test was finally successful. You merge your code, and you throw the database away so it stops consuming cluster resources.

				
					$ yb-branch drop dev_migration_test -y
[*] Initiating drop for database 'dev_migration_test'...
[*] Blocking new connections to 'dev_migration_test'...
[*] Terminating active connections to 'dev_migration_test'...
[+] SUCCESS! Clone 'dev_migration_test' has been dropped.
				
			

And just like that, you managed a massive database environment with the exact same speed and workflow you use for source code!

⚠️ Guardrails (Do Not Skip This)

Cloning is fast… but not free.

Each clone:

  • ● Creates new logical tablets
  • ● Increases CPU + memory usage
  • ● Can increase disk usage over time (compaction behavior)
  • ● Counts against cluster limits
⚠️ Important
Do NOT treat clones as unlimited. Without cleanup, you will hit tablet, memory, and disk limits.

Best Practices

  • ● Require an owner for every clone
  • ● Use naming conventions (pr_, dev_, recovery_)
  • ● Track all clones in audit tables
  • ● Prune aggressively
  • ● Limit clone lifespan
  • ● Monitor tablet counts and resource usage
  • ● Include clone load in cluster sizing

Conclusion

Instant cloning fundamentally changes how teams work with data.

Instead of slow, heavy database copies, you get:

  • ● Fast, isolated environments
  • ● Safe testing against real data
  • ● Surgical recovery workflows
  • ● Cleaner CI/CD pipelines

The yb-branch pattern shows how little glue code is needed to unlock a powerful developer experience.

Once you try this workflow, it is hard to go back.

πŸ’‘ Final Takeaway
YugabyteDB Instant Cloning brings Git-style workflows to databases… branch, test, reset, and recover instantly. Just remember: fast cloning shifts the bottleneck from time to resource management. Use it wisely.

Have Fun!