Recover Recently Deleted Data Using Follower Reads

YugabyteDB is a transactional database that supports distributed transactions. A transaction is a sequence of operations performed as a single logical unit of work. YugabyteDB provides ACID guarantees for all transactions.

If you are in a middle of a transaction and acidentally delete some data, you can undo that delete by issuing a ROLLBACK statement.

Example:

				
					yugabyte=# SELECT * FROM test ORDER BY c1;
 c1 | c2
----+----
  1 | A
  2 | B
  3 | C
(3 rows)

yugabyte=# BEGIN TRANSACTION;
BEGIN

yugabyte=# DELETE FROM test WHERE c1 = 2; -- I meant to say c1 = 3!
DELETE 1

yugabyte=# SELECT * FROM test ORDER BY c1;
 c1 | c2
----+----
  1 | A
  3 | C
(2 rows)

yugabyte=# ROLLBACK;
ROLLBACK

yugabyte=# SELECT * FROM test ORDER BY c1; -- Phew!
 c1 | c2
----+----
  1 | A
  2 | B
  3 | C
(3 rows)
				
			

But what if you’ve already commited the delete only to suddenly realize that you didn’t mean to do that? 

Well, if you act fast, there is a trick using Follower Reads that may help undo that delete.

Example:

				
					yugabyte=# CREATE TABLE some_data (id INT PRIMARY KEY, the_data TEXT, created_ts TIMESTAMP);
CREATE TABLE

yugabyte=# INSERT INTO some_data VALUES (1, 'Old Data1', current_timestamp);
INSERT 0 1

yugabyte=# INSERT INTO some_data VALUES (2, 'Old Data2', current_timestamp);
INSERT 0 1

yugabyte=# SELECT pg_sleep_for('10 SECONDS');
 pg_sleep_for
--------------

(1 row)
				
			

Now I do a few more DML operations…

				
					yugabyte=# DELETE FROM some_data WHERE id = 2;
DELETE 1

yugabyte=# INSERT INTO some_data VALUES (3, 'New Data3', current_timestamp);
INSERT 0 1

yugabyte=# INSERT INTO some_data VALUES (4, 'New Data4', current_timestamp);
INSERT 0 1

yugabyte=# SELECT * FROM some_data ORDER BY created_ts;
 id | the_data  |         created_ts
----+-----------+----------------------------
  1 | Old Data1 | 2024-05-24 12:07:30.073068
  3 | New Data3 | 2024-05-24 12:07:40.113604
  4 | New Data4 | 2024-05-24 12:07:40.114377
(3 rows)
				
			

Oops! I meant to run a delete on id 3, not id 2. A ROLLBACK here won’t help as in ysqlsh, the default behavior is that every statement is auto committed.

				
					yugabyte=# rollback;
WARNING:  there is no transaction in progress
ROLLBACK
				
			

Here’s where Follower Reads can come to the rescue!

Enabling Follower Reads and setting the read staleness to 5 seconds, I can see the data as of about 5 seconds ago, just prior to when I ran the incorrect delete statement.

				
					yugabyte=# SET yb_read_from_followers = TRUE;
SET

yugabyte=# SET yb_follower_read_staleness_ms = 5000;
SET

yugabyte=# SET default_transaction_read_only = TRUE;
SET

yugabyte=# SELECT * FROM some_data;
 id | the_data  |         created_ts
----+-----------+----------------------------
  1 | Old Data1 | 2024-05-24 12:07:30.073068
  2 | Old Data2 | 2024-05-24 12:07:30.078777
(2 rows)
				
			

I will need to temporarily store the data I deleted somewhere to reload later. I can’t insert the data into another table, or create a table (CTAS) because enabling Follower Reads requires that transactions are read only. 

So I will have to dump the data into a temporary file, then COPY it back into the table. 

				
					yugabyte=# \a
Output format is unaligned.

yugabyte=# \t
Tuples only is on.

yugabyte=# \o /tmp/some_data_save.bak

yugabyte=# SELECT * FROM some_data WHERE id = 2;

yugabyte=# \o

yugabyte=# \a
Output format is aligned.

yugabyte=# \t
Tuples only is off.

yugabyte=# SET default_transaction_read_only = FALSE;
SET

yugabyte=# COPY some_data FROM '/tmp/some_data_save.bak' DELIMITER '|';
COPY 1

yugabyte=# SELECT * FROM some_data ORDER BY created_ts;
 id | the_data  |         created_ts
----+-----------+----------------------------
  1 | Old Data1 | 2024-05-24 12:43:08.795391
  2 | Old Data2 | 2024-05-24 12:43:08.799605
  3 | New Data3 | 2024-05-24 12:43:18.814375
  4 | New Data4 | 2024-05-24 12:43:18.815146
(4 rows)
				
			

Phew! I got my data back. Now I can delete the row I intended to delete.

				
					yugabyte=# DELETE FROM some_data WHERE id = 3;
DELETE 1

yugabyte=# SELECT * FROM some_data ORDER BY created_ts;
 id | the_data  |         created_ts
----+-----------+----------------------------
  1 | Old Data1 | 2024-05-24 12:43:08.795391
  2 | Old Data2 | 2024-05-24 12:43:08.799605
  4 | New Data4 | 2024-05-24 12:43:18.815146
(3 rows)
				
			

Have Fun!