Modern databases must juggle performance and scalability, especially when handling large datasets. PostgreSQL offers several knobs to help fine-tune performance, one of which is enable_async_append. Introduced in PostgreSQL 14, this setting can unlock substantial speed-ups in specific parallel query scenarios — but only when used wisely.
Let’s explore what enable_async_append does, how it affects parallel query execution, when you should use it, and what caveats to keep in mind.
Understanding Append and Parallel Append in PostgreSQL
Before diving into async behavior, it helps to understand what Append and Parallel Append nodes are in PostgreSQL's execution plans:
- Append is an execution node used when querying multiple child relations, such as partitions or subqueries in a UNION ALL. It processes them one at a time.
- Parallel Append, enabled when parallel query plans are in play, distributes these child scans across multiple workers — improving performance by utilizing more CPU cores.
Example:
SELECT * FROM sales_2023
UNION ALL
SELECT * FROM sales_2024;
OR
SELECT * FROM sales WHERE sale_date BETWEEN '2023-01-01' AND '2024-12-31';
If sales is partitioned by year, PostgreSQL will append all relevant partitions into one result set.
How async_append Improves Parallel Query Performance
The traditional Parallel Append node assigns one child relation per worker, which means a worker might have to wait for slower relations to finish before it can start new ones. That’s fine for uniform workloads, but problematic when:
- Partitions are unevenly sized
- Some child tables are slower due to I/O bottlenecks
enable_async_append changes the game by allowing workers to fetch and scan child relations asynchronously — essentially overlapping I/O and CPU work. It transforms a "one-at-a-time" model into a more dynamic "grab-what-you-can" strategy.
This is particularly useful when child tables are stored on different disks or if there’s significant variation in access speed.
When to Use It: Partitioned Tables and UNION ALL
Here are some concrete use cases:
- Large Partitioned Tables: For example, a time-partitioned fact table in a data warehouse. Asynchronous append helps efficiently fetch data across multiple partitions in parallel.
- UNION ALL queries: Multiple subqueries hitting different tables or indexes can benefit from asynchronous coordination.
Tip: Always analyze your query plan with EXPLAIN (ANALYZE, VERBOSE) to see if async append is being used.
Benchmark: Before and After Enabling
Let’s say we have a 12-partition table with uneven data sizes. A simple query like:
SELECT COUNT(*) FROM sales WHERE sale_date BETWEEN '2023-01-01' AND '2024-12-31';
Might show a significant drop in total query time when switching from traditional parallel append to async append.
To test:
SET enable_async_append = off;
-- Run and time the query
SET enable_async_append = on;
-- Run and time the same query
Typical improvements range from 10% to 40%, depending on disk I/O and partition balance.
It’s Not Always Faster
Despite its benefits, async_append is not a silver bullet:
- Disk I/O Concurrency Limits: If your disks or storage engine don’t handle parallel reads well, async behavior may cause contention.
- CPU-bound Queries: If your query is already CPU-heavy, adding async I/O won’t help much.
- Too Many Partitions: PostgreSQL still needs to plan all child scans. With thousands of partitions, planning overhead might overshadow runtime gains.
Best Practices
- Use with partitioned tables and UNION ALL queries that scan many sources.
- Monitor performance using pg_stat_statements and EXPLAIN ANALYZE.
- Pair it with good partition pruning, proper indexes, and sufficient I/O capacity.
- Benchmark in your environment — async append shines when the workload and hardware align.
enable_async_append is a powerful feature in PostgreSQL’s parallel query arsenal. It optimizes how workers fetch and process data from multiple sources, particularly useful for partitioned workloads and large-scale analytics. But like any performance tool, its value depends on your data layout and hardware. Benchmark thoughtfully, and you might find a hidden gem of performance waiting to be unlocked.