Organizations handling sensitive data must ensure that personally identifiable information (PII) is anonymized or pseudonymized to protect user privacy and avoid legal penalties.
The PostgreSQL Anonymizer extension is a tool that masks, obfuscates, or pseudonymizes sensitive data to help enforce privacy and compliance with regulations like GDPR and HIPAA.
Version v1.3.1 of the extension is now bundled with YugabyteDB 2.25.1.0 in Tech Preview.
Let’s look at an example where we anonymize employee last names and partially mask their social security numbers.
To enable the Anonymizer extension, add anon
to shared_preload_libraries
in the PostgreSQL server configuration using the YB-TServer --ysql_pg_conf_csv
flag.
[root@cloud-server-0 ~]# yugabyted version
----------------------------------------------------------------------
| Version |
----------------------------------------------------------------------
| Version : 2.25.1.0-b381 |
| Build Time : 16 Mar 2025 20:59:55 UTC |
| Build Hash : 98f8845f7828449f31bd4de043d827e8819ef7ac |
----------------------------------------------------------------------
[root@cloud-server-0 ~]# yugabyted start --tserver_flags="ysql_pg_conf_csv={shared_preload_libraries=anon}" > start.log
Next, we need to install the extension and initialize a set of “fake” data covering various categories (e.g., first and last names, email addresses, addresses, etc.).
yugabyte=# CREATE EXTENSION anon;
CREATE EXTENSION
yugabyte=# BEGIN;
BEGIN
yugabyte=*# SET LOCAL yb_default_copy_from_rows_per_transaction = 0;
SET
yugabyte=*# SET LOCAL yb_non_ddl_txn_for_sys_tables_allowed TO TRUE;
SET
yugabyte=*# SELECT anon.init(); -- Loads fake data
init
------
t
(1 row)
yugabyte=*# COMMIT;
COMMIT
One feature of the anon
extension is that it provides various functions for generating realistic fake data.
yugabyte=# \df anon.fake*
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+-----------------+------------------+---------------------+------
anon | fake_address | text | | func
anon | fake_city | text | | func
anon | fake_company | text | | func
anon | fake_country | text | | func
anon | fake_email | text | | func
anon | fake_first_name | text | | func
anon | fake_iban | text | | func
anon | fake_last_name | text | | func
anon | fake_postcode | text | | func
anon | fake_siret | text | | func
(10 rows)
yugabyte=# SELECT anon.fake_first_name() || ' ' || anon.fake_last_name() AS "Full Name"
yugabyte-# FROM generate_series(1, 10);
Full Name
------------------
Jackie Whitney
Bruce Short
Nancy Lyons
Karla Hobbs
Joann Little
Ronnie Santana
Seth Ramirez
Alejandro Bolton
Donna Lewis
Jack King
(10 rows)
Now let’s create an employee
table and poulate it with a few records:
yugabyte=# CREATE TABLE employee (id BIGINT PRIMARY KEY, first_name TEXT, last_name TEXT, ssn TEXT);
CREATE TABLE
yugabyte=# INSERT INTO employee VALUES (1,'Joshua', 'Baskin','123-45-6789'), (2,'Susan', 'Lawrence','111-22-3333')
, (3,'Billy', 'Kopecki','222-33-4444');
INSERT 0 3
yugabyte=# SELECT * FROM employee ORDER BY id;
id | first_name | last_name | ssn
----+------------+-----------+-------------
1 | Joshua | Baskin | 123-45-6789
2 | Susan | Lawrence | 111-22-3333
3 | Billy | Kopecki | 222-33-4444
(3 rows)
last_name
amd ssn
columns for particular users. We need to declare masking rules for the columns and then apply the new rules to dynamic masking:
yugabyte=# SECURITY LABEL FOR anon ON COLUMN employee.last_name IS 'MASKED WITH FUNCTION anon.fake_last_name()';
SECURITY LABEL
yugabyte=# SECURITY LABEL FOR anon ON COLUMN employee.ssn IS 'MASKED WITH FUNCTION anon.partial(ssn, 0, $$***-**-$$, 4)';
SECURITY LABEL
yugabyte=# SELECT anon.start_dynamic_masking(false);
start_dynamic_masking
-----------------------
t
(1 row)
employee
table:
yugabyte=# CREATE ROLE zoltar WITH LOGIN;
CREATE ROLE
yugabyte=# SECURITY LABEL FOR anon ON ROLE zoltar IS 'MASKED';
SECURITY LABEL
yugabyte=# GRANT SELECT ON employee TO zoltar;
GRANT
Next, let’s turn on dynamic masking for the yugabyte
database:
yugabyte=# ALTER DATABASE yugabyte SET anon.transparent_dynamic_masking TO true;
ALTER DATABASE
employee
table, the last_name
and ssn
columns will have the masking rules applied:
yugabyte=# \c - zoltar
You are now connected to database "yugabyte" as user "zoltar".
yugabyte=> SELECT * FROM employee ORDER BY id;
id | first_name | last_name | ssn
----+------------+-----------+-------------
1 | Joshua | Long | ***-**-6789
2 | Susan | Bentley | ***-**-3333
3 | Billy | Zamora | ***-**-4444
(3 rows)
Here are the SQL commands from today’s YugabyteDB Tip, so you can try this new feature out for yourself:
CREATE EXTENSION anon;
BEGIN;
SET LOCAL yb_default_copy_from_rows_per_transaction = 0;
SET LOCAL yb_non_ddl_txn_for_sys_tables_allowed TO TRUE;
SELECT anon.init(); -- Loads fake data
COMMIT;
\df anon.fake*
SELECT anon.fake_first_name() || ' ' || anon.fake_last_name() AS "Full Name" FROM generate_series(1, 10);
CREATE TABLE employee (id BIGINT PRIMARY KEY, first_name TEXT, last_name TEXT, ssn TEXT);
INSERT INTO employee VALUES (1,'Joshua', 'Baskin','123-45-6789'), (2,'Susan', 'Lawrence','111-22-3333')
, (3,'Billy', 'Kopecki','222-33-4444');
SELECT * FROM employee ORDER BY id;
SECURITY LABEL FOR anon ON COLUMN employee.last_name IS 'MASKED WITH FUNCTION anon.fake_last_name()';
SECURITY LABEL FOR anon ON COLUMN employee.ssn IS 'MASKED WITH FUNCTION anon.partial(ssn, 0, $$***-**-$$, 4)';
SELECT anon.start_dynamic_masking(false);
CREATE ROLE zoltar WITH LOGIN;
SECURITY LABEL FOR anon ON ROLE zoltar IS 'MASKED';
GRANT SELECT ON employee TO zoltar;
ALTER DATABASE yugabyte SET anon.transparent_dynamic_masking TO true;
\c - zoltar
SELECT * FROM employee ORDER BY id;
Have Fun!
