Cleaner Conditional Aggregates in YugabyteDB with PostgreSQL’s FILTER Clause

Sometimes a simple SQL feature can make reporting queries much easier to read.

One of my favorite examples is PostgreSQL’s FILTER clause for aggregate functions. It also works in YugabyteDB’s YSQL API and is a nice pattern for dashboard queries, summary reports, and operational checks.

The idea is simple: instead of putting a CASE WHEN expression inside an aggregate, attach the condition directly to the aggregate.

For example, instead of this:

				
					sum(CASE WHEN status = 'paid' THEN 1 ELSE 0 END)
				
			

you can write this:

				
					count(*) FILTER (WHERE status = 'paid')
				
			

That reads much more naturally:

  • Count the rows where the status is paid.
Key Insight

The FILTER clause makes conditional aggregates easier to read and maintain. In YugabyteDB, it also encourages a good distributed SQL pattern: scan the relevant row set once, then calculate several related metrics from that same stream of rows.

Demo Setup

Let’s create a simple orders table.

				
					DROP TABLE IF EXISTS orders;

CREATE TABLE orders (
    order_id    int PRIMARY KEY,
    customer_id int NOT NULL,
    status      text NOT NULL,
    amount      numeric(10,2) NOT NULL,
    created_at  timestamptz DEFAULT now()
);
				
			

Insert a few sample rows.

				
					INSERT INTO orders (order_id, customer_id, status, amount, created_at) VALUES
(1, 101, 'paid',     25.00, now() - interval '5 days'),
(2, 101, 'pending',  10.00, now() - interval '4 days'),
(3, 102, 'paid',     40.00, now() - interval '3 days'),
(4, 103, 'failed',   15.00, now() - interval '2 days'),
(5, 103, 'paid',     30.00, now() - interval '1 day'),
(6, 104, 'pending',  50.00, now());
				
			

The Traditional CASE WHEN Pattern

A common way to write conditional aggregates is to use CASE WHEN inside the aggregate.

				
					SELECT
    count(*) AS total_recent_orders,

    sum(CASE WHEN status = 'paid' THEN 1 ELSE 0 END) AS paid_orders,
    sum(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) AS pending_orders,
    sum(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS failed_orders,

    coalesce(sum(CASE WHEN status = 'paid' THEN amount ELSE 0 END), 0) AS paid_revenue
FROM orders
WHERE created_at >= now() - interval '7 days';
				
			

Example result:

				
					.total_recent_orders | paid_orders | pending_orders | failed_orders | paid_revenue
---------------------+-------------+----------------+---------------+--------------
                   6 |           3 |              2 |             1 |        95.00
				
			

This works, but it gets noisy fast.

Each aggregate has to repeat the conditional logic, and the real business question gets buried inside the CASE expressions.

That is fine for one or two metrics, but it becomes harder to read as the query grows.

The Cleaner FILTER Pattern

Here is the same query using PostgreSQL’s FILTER clause.

				
					SELECT
    count(*) AS total_recent_orders,

    count(*) FILTER (WHERE status = 'paid') AS paid_orders,
    count(*) FILTER (WHERE status = 'pending') AS pending_orders,
    count(*) FILTER (WHERE status = 'failed') AS failed_orders,

    coalesce(sum(amount) FILTER (WHERE status = 'paid'), 0) AS paid_revenue
FROM orders
WHERE created_at >= now() - interval '7 days';
				
			

Example result:

				
					 .total_recent_orders | paid_orders | pending_orders | failed_orders | paid_revenue
---------------------+-------------+----------------+---------------+--------------
                   6 |           3 |              2 |             1 |        95.00
				
			

The result is the same, but the intent is clearer.

This:

				
					count(*) FILTER (WHERE status = 'paid')
				
			

is easier to understand than this:

				
					sum(CASE WHEN status = 'paid' THEN 1 ELSE 0 END)
				
			

The FILTER version says exactly what the aggregate is doing.

Why FILTER Is Better

The big advantage is readability.

With CASE WHEN, the reader has to mentally translate this:

				
					sum(CASE WHEN status = 'paid' THEN 1 ELSE 0 END)
				
			

into:

  • Count paid orders.

With FILTER, the SQL already says that:

				
					count(*) FILTER (WHERE status = 'paid')
				
			

That becomes even more valuable when a query has several related metrics:

				
					SELECT
    count(*) AS total_recent_orders,

    count(*) FILTER (WHERE status = 'paid') AS paid_orders,
    count(*) FILTER (WHERE status = 'pending') AS pending_orders,
    count(*) FILTER (WHERE status = 'failed') AS failed_orders,

    coalesce(sum(amount) FILTER (WHERE status = 'paid'), 0) AS paid_revenue,
    coalesce(avg(amount) FILTER (WHERE status = 'paid'), 0) AS average_paid_order_amount,

    min(created_at) AS first_order_time,
    max(created_at) AS last_order_time
FROM orders
WHERE created_at >= now() - interval '7 days';
				
			

This type of query is common in dashboards and summary reports. You want several related metrics from the same filtered row set.

FILTER keeps those metrics readable.

Why This Is a Good Pattern in YugabyteDB

The FILTER clause does not magically make aggregation free.

If a query needs to scan a large table across many tablets, YugabyteDB still has to read the relevant rows and aggregate the results.

The value is that FILTER encourages a better query shape.

Instead of running several separate queries like this:

				
					SELECT count(*)
FROM orders
WHERE created_at >= now() - interval '7 days'
  AND status = 'paid';

SELECT count(*)
FROM orders
WHERE created_at >= now() - interval '7 days'
  AND status = 'pending';

SELECT count(*)
FROM orders
WHERE created_at >= now() - interval '7 days'
  AND status = 'failed';

SELECT coalesce(sum(amount), 0)
FROM orders
WHERE created_at >= now() - interval '7 days'
  AND status = 'paid';
				
			

you can summarize the same filtered row set in one query:

				
					SELECT
    count(*) AS total_recent_orders,

    count(*) FILTER (WHERE status = 'paid') AS paid_orders,
    count(*) FILTER (WHERE status = 'pending') AS pending_orders,
    count(*) FILTER (WHERE status = 'failed') AS failed_orders,

    coalesce(sum(amount) FILTER (WHERE status = 'paid'), 0) AS paid_revenue
FROM orders
WHERE created_at >= now() - interval '7 days';
				
			

That is usually a better distributed SQL pattern.

Use the regular WHERE clause to reduce the base row set first:

				
					WHERE created_at >= now() - interval '7 days'
				
			

Then use FILTER to calculate multiple metrics from that scoped set of rows:

				
					count(*) FILTER (WHERE status = 'paid')
count(*) FILTER (WHERE status = 'pending')
count(*) FILTER (WHERE status = 'failed')
sum(amount) FILTER (WHERE status = 'paid')
				
			

The WHERE clause controls which rows are read by the query.

The FILTER clause controls which of those rows are included in each aggregate.

Distributed SQL Reminder

Use the regular WHERE clause to reduce the base row set first. Then use FILTER to calculate several conditional aggregates from that scoped set of rows. This avoids turning one reporting question into several separate scans.

What About Indexes?

FILTER is not a replacement for indexing.

If your query frequently filters on created_at, customer_id, tenant_id, or another access-path column, you still need to index and model the table for that pattern.

For example:

				
					CREATE INDEX orders_customer_created_at_idx
ON orders (customer_id, created_at);
				
			

The simple rule is:

  • ● Use WHERE and indexes to find the right rows efficiently.
  • ● Use FILTER to calculate several conditional aggregates from those rows.

Final Takeaway

PostgreSQL’s FILTER clause is a simple but useful SQL feature that also works in YugabyteDB.

It makes conditional aggregates cleaner:

				
					count(*) FILTER (WHERE status = 'paid')
				
			

instead of:

				
					sum(CASE WHEN status = 'paid' THEN 1 ELSE 0 END)
				
			

The result is the same, but the query is easier to read and easier to maintain.

In YugabyteDB, the bigger lesson is query shape. Use a regular WHERE clause to reduce the base row set first, then use FILTER to calculate several related metrics from that same stream of rows.

Bottom Line

Use FILTER when you need multiple conditional aggregates over the same scoped row set. It is cleaner than repeating CASE WHEN, and in YugabyteDB it helps express several related metrics as one well-scoped aggregate query instead of several separate scans.

Have Fun!

Finding old photos while packing is always an emotional experience, and a move is the perfect catalyst for reaching out! My wife has been gently encouraging me to downsize our things, but this picture is definitely one to save. It brought back so many memories of the beach. It’s been years, but I’d love to reconnect and hear how her life has been.