Presented by:

Alena Rybakina

Postgres Professional

I am a PostgreSQL database developer in the field of optimization, namely, I write code for the optimizer and statistics. I'm active contributor in PostgreSQL Community: several patches were committed (OR-to-ANY transformation and Values-to-ANY transformation). I have participated and gave talks in conferences: PGConf Canada 2024, 2025, PG Conf Ivory SQL 2025.

No video of the event yet, sorry!

Pull-up is an optimization that already exists in PostgreSQL’s planner. It rewrites eligible subqueries into the main query level so that the planner can work with a unified join tree rather than treating subqueries as independent nodes.

Currently, PostgreSQL applies pull-up to several common subquery forms, including: - IN subqueries - EXISTS subqueries - ANY expressions (where applicable)

For example, without pull-up, PostgreSQL might generate a plan like this:

``` create table ta as select id, id%5 as val from generate_series(1,10000) as id; create table tb as select id, id%5 as aval from generate_series(10001,100100) as id;; create table tc as select id, id%5 as aid from generate_series(100101,101101) as id;

EXPLAIN SELECT ta.id FROM ta WHERE EXISTS ( SELECT 1 FROM tb JOIN tc ON tc.id = ta.id );

Seq Scan on ta (actual time=0.295..3184.348 rows=1000 loops=1) Filter: EXISTS(SubPlan 1) Rows Removed by Filter: 9000 SubPlan 1 -> Nested Loop (actual time=0.317..0.317 rows=0.1 loops=10000) -> Seq Scan on tc (actual time=0.315..0.315 rows=0.1 loops=10000) Filter: (id = ta.id) Rows Removed by Filter: 950 -> Seq Scan on tb (actual time=0.006..0.006 rows=1.0 loops=1000) Planning Time: 1.376 ms
Execution Time: 3184.632 ms ```

Pull-up optimization, however, lifts the subquery into the main query, giving the planner freedom to reorder joins and apply transformations - often leading to far better execution paths:

Hash Join (actual time=64.144..67.705 rows=1000 loops=1) Hash Cond: (ta.id = tc.id) -> Seq Scan on ta (actual time=0.023..1.344 rows=10000 loops=1) -> Hash (actual time=64.112..64.114 rows=1000 loops=1) -> HashAggregate (actual time=63.486..63.801 rows=1000 loops=1) Group Key: tc.id -> Nested Loop (actual time=0.029..22.195 rows=100000 loops=1) -> Seq Scan on tc (actual time=0.012..0.228 rows=1000 loops=1) -> Materialize (actual time=0.000..0.009 rows=100 loops=1000) -> Seq Scan on tb (actual time=0.010..0.026 rows=100 loops=1) Planning Time: 0.225 ms Execution Time: 67.869 ms

This enables: - Join reordering - joins from the subquery can be reordered with others to minimize cost - Condition propagation - WHERE clauses can be applied earlier, improving selectivity and index usage - Expression deduplication - redundant computations can be eliminated once all expressions are visible at the same level - Fewer repeated executions - correlated subqueries avoid per-row re-evaluation

We consider how pull-up is implemented in PostgreSQL today for IN and EXISTS; summarize the conditions that allow or block the transformation; and consider whether the current limitations are optimal or could be expanded [0].

[0] https://www.postgresql.org/message-id/flat/e172de23-7acb-4330-a40a-58ead7a47944%40postgrespro.ru#93f17d065b5702a0185d6379dd1d76c8

Date:
2025 October 17 10:50 +11
Duration:
40 min
Room:
Oxford I + II
Conference:
PG Down Under 2025
Language:
Track:
Development
Difficulty:
Medium