Safely Archiving or Deleting Old YugabyteDB Log Files

YugabyteDB logs are incredibly useful for troubleshooting, performance analysis, auditing, and support investigations.

But like all logs, they need a little housekeeping.

YugabyteDB already rotates many of its log files automatically, but rotation is not the same thing as long-term retention management. Over time, older rotated logs can still consume disk space unless you archive or delete them.

So what is the best way to clean up old YugabyteDB logs?

The short answer: use the operating system, not pg_cron.

🧠 Key Insight

Use pg_cron for scheduled database work. Use Linux cron, systemd timers, logrotate, or your server automation tooling for operating system work like copying, compressing, archiving, or deleting log files.

Log Rotation vs. Log Retention

YugabyteDB nodes can produce several types of logs, including YB-Master logs, YB-TServer logs, YSQL logs, and other diagnostic files.

These logs are important when debugging issues, reviewing cluster behavior, or working with support. However, if old logs are never cleaned up, they can slowly consume disk space.

That makes log retention an operational task worth planning.

A good log retention strategy usually answers three questions:

  1. How long should logs stay on the database nodes?
  2. Should older logs be archived somewhere else?
  3. When is it safe to delete them locally?

Where YugabyteDB Stores Logs

The exact log location depends on how YugabyteDB is deployed.

For YugabyteDB Anywhere-managed universes, logs are commonly available here:

    /home/yugabyte/master/logs
    /home/yugabyte/tserver/logs

These directories are typically symlinks to the first directory listed in --fs_data_dirs.
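
To confirm where the logs actually live on a given node, you can inspect the symlink targets, for example:

    ls -ld /home/yugabyte/master/logs /home/yugabyte/tserver/logs
    readlink -f /home/yugabyte/tserver/logs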

For other deployments, logs are usually under the YugabyteDB data directory. For example:

    <yugabyte-data-directory>/master/logs
    <yugabyte-data-directory>/tserver/logs

YugabyteDB also has WAL directories, tablet data directories, RocksDB files, and other internal storage paths.

Those are not regular log files.

⚠️ Important

This tip is about cleaning up diagnostic log files only. Do not delete anything under YugabyteDB WAL directories, tablet data directories, RocksDB directories, or DocDB data directories.

Why Not Use pg_cron?

pg_cron is designed for scheduling SQL commands inside the database.

That makes it useful for scheduled database tasks like this:

    SELECT cron.schedule(
      'daily-table-cleanup',
      '0 2 * * *',
      $$DELETE FROM app_events WHERE created_at < now() - interval '90 days'$$
    );

But archiving or deleting log files is not really a database task.

This is an operating system task:

    find /home/yugabyte/tserver/logs -type f -mtime +14 -delete

That kind of work belongs in Linux cron, a systemd timer, logrotate, Ansible, Chef, Puppet, Kubernetes jobs, or whatever automation tool you already use to manage your servers.

Recommended Approach

The safest pattern is:

  1. Let YugabyteDB write and rotate its own logs.
  2. Use an operating system-level scheduled job to manage retention.
  3. Archive older logs if needed.
  4. Delete only files from known log directories.
  5. Never run cleanup commands against broad YugabyteDB data paths.

For many environments, a simple daily cron job on each database node is enough.

Run the Job on Every DB Node

YugabyteDB log files are stored locally on each YB-Master and YB-TServer node.

That means a log cleanup or archive job needs to run on every database server in the cluster, not just one node.

For example, in a 3-node universe, you would typically install the cron job on all three nodes:

    node-1: install /usr/local/bin/cleanup-yb-logs.sh and add the cron entry
    node-2: install /usr/local/bin/cleanup-yb-logs.sh and add the cron entry
    node-3: install /usr/local/bin/cleanup-yb-logs.sh and add the cron entry

The cron entry would look the same on each node:

    15 2 * * * /usr/local/bin/cleanup-yb-logs.sh >> /var/log/cleanup-yb-logs.log 2>&1

In production, this is usually handled through configuration management or automation tooling such as Ansible, Puppet, Chef, Terraform, cloud-init, or a standard operations runbook.
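
If you do not have such tooling in place, even a small SSH loop can push the job out to each node. Here is a rough sketch, assuming passwordless SSH and that node-1 through node-3 are placeholder host names:

    # Sketch: install the script and cron entry on each node (host names are placeholders)
    for node in node-1 node-2 node-3; do
      scp /usr/local/bin/cleanup-yb-logs.sh "${node}:/tmp/cleanup-yb-logs.sh"
      ssh "$node" 'sudo install -m 0755 /tmp/cleanup-yb-logs.sh /usr/local/bin/cleanup-yb-logs.sh'
      ssh "$node" '(sudo crontab -l 2>/dev/null; echo "15 2 * * * /usr/local/bin/cleanup-yb-logs.sh >> /var/log/cleanup-yb-logs.log 2>&1") | sudo crontab -'
    done

Note that appending to the crontab like this is not idempotent; run it once per node, or add a guard if you expect to rerun it.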

🧠 Key Insight

YugabyteDB logs are node-local. If you manage retention with cron, install the same cleanup or archive job on every DB node, or use centralized logging so retention is handled outside the database servers.

Option 1: Delete Logs Older Than X Days

Here is a simple example that deletes YugabyteDB log files older than 14 days.

Create a script:

    sudo vi /usr/local/bin/cleanup-yb-logs.sh

Add the following:

    #!/usr/bin/env bash
    set -euo pipefail

    # How many days of logs to keep on the local node
    RETENTION_DAYS=14

    # Only diagnostic log directories -- never point this at data or WAL paths
    LOG_DIRS=(
      "/home/yugabyte/master/logs"
      "/home/yugabyte/tserver/logs"
    )

    for LOG_DIR in "${LOG_DIRS[@]}"; do
      if [ ! -d "$LOG_DIR" ]; then
        echo "Skipping missing directory: $LOG_DIR"
        continue
      fi

      echo "Cleaning logs older than ${RETENTION_DAYS} days in ${LOG_DIR}"

      find "$LOG_DIR" \
        -type f \
        -mtime +"$RETENTION_DAYS" \
        -print \
        -delete
    done

Make the script executable:

    sudo chmod +x /usr/local/bin/cleanup-yb-logs.sh

Schedule it with cron:

    sudo crontab -e

Add:

    15 2 * * * /usr/local/bin/cleanup-yb-logs.sh >> /var/log/cleanup-yb-logs.log 2>&1

This runs the cleanup every day at 2:15 AM.
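
Before trusting the scheduled job, consider a one-off dry run that only prints what would be removed; drop -delete and keep -print:

    # Dry run: list candidate files without deleting anything
    find /home/yugabyte/master/logs -type f -mtime +14 -print
    find /home/yugabyte/tserver/logs -type f -mtime +14 -print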

Remember to install this cron entry on each DB node, unless your server automation tooling handles that for you.
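
If your environment standardizes on systemd timers instead of cron, the equivalent is a small oneshot service plus a timer unit. A minimal sketch, with unit names of my own choosing, looks like this. Create /etc/systemd/system/cleanup-yb-logs.service:

    [Unit]
    Description=Clean up old YugabyteDB log files

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/cleanup-yb-logs.sh

And /etc/systemd/system/cleanup-yb-logs.timer:

    [Unit]
    Description=Daily YugabyteDB log cleanup

    [Timer]
    OnCalendar=*-*-* 02:15:00
    Persistent=true

    [Install]
    WantedBy=timers.target

Then enable the timer:

    sudo systemctl daemon-reload
    sudo systemctl enable --now cleanup-yb-logs.timer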

Option 2: Archive Logs Before Deleting Them

In some environments, you may want to retain logs for troubleshooting, compliance, or support purposes.

In that case, archive the logs first, then delete older local copies later.

For example:

    sudo vi /usr/local/bin/archive-yb-logs.sh

Add:

    #!/usr/bin/env bash
    set -euo pipefail

    # Archive logs after 2 days; delete local copies after 30 days
    ARCHIVE_AFTER_DAYS=2
    DELETE_AFTER_DAYS=30

    HOSTNAME="$(hostname -s)"
    ARCHIVE_ROOT="/mnt/log-archive/yugabyte/${HOSTNAME}"

    # Only diagnostic log directories -- never point this at data or WAL paths
    LOG_DIRS=(
      "/home/yugabyte/master/logs"
      "/home/yugabyte/tserver/logs"
    )

    mkdir -p "$ARCHIVE_ROOT"

    for LOG_DIR in "${LOG_DIRS[@]}"; do
      if [ ! -d "$LOG_DIR" ]; then
        echo "Skipping missing directory: $LOG_DIR"
        continue
      fi

      # "master" or "tserver", derived from the parent directory name
      COMPONENT="$(basename "$(dirname "$LOG_DIR")")"
      DEST_DIR="${ARCHIVE_ROOT}/${COMPONENT}"

      mkdir -p "$DEST_DIR"

      echo "Archiving logs older than ${ARCHIVE_AFTER_DAYS} days from ${LOG_DIR} to ${DEST_DIR}"

      find "$LOG_DIR" \
        -type f \
        -mtime +"$ARCHIVE_AFTER_DAYS" \
        -print0 | while IFS= read -r -d '' file; do

          # Skip files a process still has open (the active log)
          if command -v lsof >/dev/null 2>&1 && lsof "$file" >/dev/null 2>&1; then
            echo "Skipping open file: $file"
            continue
          fi

          # Skip files already archived on a previous run
          dest="${DEST_DIR}/$(basename "$file").gz"
          if [ -e "$dest" ]; then
            continue
          fi

          gzip -c "$file" > "$dest"
        done

      echo "Deleting local logs older than ${DELETE_AFTER_DAYS} days in ${LOG_DIR}"

      find "$LOG_DIR" \
        -type f \
        -mtime +"$DELETE_AFTER_DAYS" \
        -print \
        -delete
    done

Make it executable:

    sudo chmod +x /usr/local/bin/archive-yb-logs.sh

Schedule it:

    sudo crontab -e

Add:

    30 2 * * * /usr/local/bin/archive-yb-logs.sh >> /var/log/archive-yb-logs.log 2>&1

This gives you two levels of retention:

    ARCHIVE_AFTER_DAYS=2
    DELETE_AFTER_DAYS=30

That means:

  • After 2 days: copy and compress logs to the archive location
  • After 30 days: delete old local log files

Again, this job should run on each DB node because each node has its own local YugabyteDB logs.

Option 3: Archive Logs to Object Storage

You can also archive logs to object storage such as S3, assuming the required CLI tool and credentials are already configured on the server.

For example:

    aws s3 cp "$file" "s3://my-yb-log-archive/${HOSTNAME}/${COMPONENT}/$(basename "$file")"

A simple archive loop might look like this (it assumes LOG_DIR, HOSTNAME, and COMPONENT are set as in the earlier archive script):

    find "$LOG_DIR" \
      -type f \
      -mtime +2 \
      -print0 | while IFS= read -r -d '' file; do

        # Skip files a process still has open (the active log)
        if command -v lsof >/dev/null 2>&1 && lsof "$file" >/dev/null 2>&1; then
          echo "Skipping open file: $file"
          continue
        fi

        aws s3 cp "$file" "s3://my-yb-log-archive/${HOSTNAME}/${COMPONENT}/$(basename "$file")"
    done

From there, an object storage lifecycle policy can move older logs to cheaper storage or expire them automatically.

This job also needs to run on each DB node unless logs are already being collected centrally.
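
On the bucket side, here is a sketch of a lifecycle rule that expires archived objects after a year, reusing the bucket name from the example above (adjust the rule to your own retention policy):

    aws s3api put-bucket-lifecycle-configuration \
      --bucket my-yb-log-archive \
      --lifecycle-configuration '{
        "Rules": [{
          "ID": "expire-old-yb-logs",
          "Filter": {"Prefix": ""},
          "Status": "Enabled",
          "Expiration": {"Days": 365}
        }]
      }'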

What About logrotate?

logrotate can also be useful, especially if your operations team already uses it.

However, YugabyteDB already performs its own log rotation. For example, the --max_log_size flag controls the maximum size of YB-Master and YB-TServer log files.
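
For reference, that is a server gflag, so it is set wherever the node's other gflags are configured; a hypothetical sketch (the value is in megabytes):

    # Sketch: rotate YB-TServer log files once they reach roughly 256 MB
    --max_log_size=256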

Because of that, I usually prefer not to make logrotate responsible for rotating YugabyteDB logs again.

Instead, use a cleanup or archive job that works on files YugabyteDB has already rotated.

In other words:

  • Use YugabyteDB for log creation and rotation.
  • Use operating system tooling for retention and archival.

Be Careful With find

The find command is powerful, but it can also be dangerous if the path is too broad.

Good:

    find /home/yugabyte/master/logs -type f -mtime +14 -delete
    find /home/yugabyte/tserver/logs -type f -mtime +14 -delete

Risky:

    find /home/yugabyte -type f -mtime +14 -delete
    find /mnt/d0/yb-data -type f -mtime +14 -delete

The second pair of commands is too broad: it can match files that are not diagnostic logs.

⚠️ Safety Tip

Always target explicit log directories. Avoid broad cleanup commands against parent YugabyteDB data directories.

Do Not Delete These

Avoid deleting anything under paths like:

    */wals/*
    */tablet-*
    */rocksdb/*
    */yb-data/master/*
    */yb-data/tserver/*

Only target known log directories.

Good examples:

    /home/yugabyte/master/logs
    /home/yugabyte/tserver/logs

Bad examples:

    /home/yugabyte/master
    /home/yugabyte/tserver
    /mnt/d0/yb-data
    /mnt/d1/yb-data

Bonus: Viewing YugabyteDB Logs from SQL

In a separate tip, I covered how to view YugabyteDB log files directly from SQL. That tip is useful for reading log contents; this one is focused on managing log retention.

Final Takeaway

YugabyteDB already rotates logs, but you may still want an explicit retention policy.

For simple cleanup, use Linux cron or a systemd timer on each DB node to delete logs older than a set number of days.

For longer retention, archive logs to a shared filesystem, object storage, or centralized logging platform before deleting them locally.

Just keep the boundary clear:

  • pg_cron is for scheduled SQL.
  • The operating system is for scheduled file management.

And because YugabyteDB logs are node-local, OS-level retention jobs need to run on every DB node unless log collection is centralized.

✅ Final Takeaway

Use pg_cron for scheduled database work. Use operating system tools for log retention. Since YugabyteDB logs are local to each node, run your cleanup or archive job on every DB node, or use centralized logging to manage retention outside the cluster.

Have Fun!