PostgreSQL 13 introduced a game-changing optimization feature called incremental sorting that can dramatically improve query performance when dealing with partially sorted data. In this comprehensive guide, we'll explore what incremental sorting is, how it works under the hood, and demonstrate its performance benefits with real examples.
What is Incremental Sorting?
Incremental sorting is an intelligent optimization technique that leverages existing partial order in your data to reduce sorting overhead. Instead of performing a full sort on the entire dataset, PostgreSQL identifies sections of data that are already sorted by some prefix of the required sort keys and only sorts within those groups.
Think of it like organizing a library where books are already grouped by genre. Instead of sorting all books from scratch, you only need to sort within each genre group – a much more efficient approach.
How Incremental Sorting Works?
The incremental sorting algorithm follows these key steps:
1. Presorted Prefix Detection
PostgreSQL's query planner analyzes the input data stream and identifies which columns are already sorted. This often happens when:
- An index scan provides naturally ordered data
- Previous operations in the query plan have established a partial order
- Data is clustered by certain columns
2. Group-wise Processing
Once presorted columns are identified, the algorithm:
- Reads data sequentially
- Identifies boundaries where presorted column values change
- Sorts only the remaining columns within each group
- Outputs results incrementally
3. Memory Management
Unlike traditional sorting that requires loading the entire dataset into memory, incremental sorting processes one group at a time, significantly reducing memory requirements.
How to Enable Incremental Sorting
Incremental sorting is enabled by default in PostgreSQL 13 and later versions. However, you can control it using the following configuration parameter:
-- Check current setting
SHOW enable_incremental_sort;
-- Enable incremental sorting (default)
SET enable_incremental_sort = on;
-- Disable incremental sorting (for testing purposes)
SET enable_incremental_sort = off;
The query planner automatically detects opportunities to use incremental sorting – no manual intervention required!
Performance Demonstration: Before and After
Let's see incremental sorting in action with a practical example.
1. Setting Up Test Data
-- Create a simple table
CREATE TABLE test_data (
id SERIAL PRIMARY KEY,
category TEXT,
value INT
);
-- Create an index so category is presorted
CREATE INDEX idx_test_data_category ON test_data(category);
-- Insert 50,000 rows in 5 categories
INSERT INTO test_data (category, value)
SELECT
'Cat_' || (g % 5), -- Cat_0 to Cat_4
(random() * 1000)::int -- random values
FROM generate_series(1, 50000) g;
-- Gather stats for the planner
ANALYZE test_data;
2. Query to Test Incremental Sort
SELECT category, value
FROM test_data
ORDER BY category, value;
3. With Incremental Sorting Enabled
To get the incremental sorting to work, the planner should not use a sequential scan. Since the sequential scan doesn’t produce any ordering, there’s no presorted prefix, which means no incremental sort. So, for this demo purpose, let’s disable sequential scan also.
SET enable_incremental_sort = ON;
SET enable_seqscan = OFF; -- force index usage for testing
EXPLAIN (ANALYZE, BUFFERS, COSTS OFF, TIMING OFF)
SELECT category, value
FROM test_data
ORDER BY category, value;
Output:
QUERY PLAN
-------------------------------------------------------------------------------------------
Incremental Sort (actual rows=50000 loops=1)
Sort Key: category, value
Presorted Key: category
Full-sort Groups: 5 Sort Method: quicksort Average Memory: 27kB Peak Memory: 27kB
Pre-sorted Groups: 5 Sort Method: quicksort Average Memory: 697kB Peak Memory: 697kB
Buffers: shared hit=1357 read=42
-> Index Scan using idx_test_data_category on test_data (actual rows=50000 loops=1)
Buffers: shared hit=1355 read=41
Planning Time: 0.065 ms
Execution Time: 25.404 ms
4. With Incremental Sorting Disabled
SET enable_incremental_sort = OFF;
EXPLAIN (ANALYZE, BUFFERS, COSTS OFF, TIMING OFF)
SELECT category, value
FROM test_data
ORDER BY category, value;
Output:
QUERY PLAN
----------------------------------------------------------------------------------------
Sort (actual rows=50000 loops=1)
Sort Key: category, value
Sort Method: quicksort Memory: 3099kB
Buffers: shared hit=1396
-> Index Scan using idx_test_data_category on test_data (actual rows=50000 loops=1)
Buffers: shared hit=1396
Planning Time: 0.085 ms
Execution Time: 47.947 ms
Performance Analysis
Metric | Incremental Sort (Index Scan) | Traditional Sort (Index Scan) | Improvement |
Execution Time | ~25.4 ms | ~47.9 ms | ~47% faster |
Memory Usage | ~724 kB total (27 kB + 697 kB) | ~3,099 kB | ~76% less |
Buffers Read | 1,398 total | 1,438 total | Slightly fewer |
Why Incremental Sort Wins Here
- Sorting Smaller Chunks: The index already produces rows grouped by category. Incremental sort sorts only within each of these 5 groups instead of all 50,000 rows at once.
- Reduced Memory Footprint: Since each group is smaller (~10,000 rows per category), the in-memory quicksort uses far less memory, avoiding large allocations.
- Better CPU Cache Locality: Sorting smaller arrays improves cache efficiency, further reducing execution time.
Incremental sorting represents a significant advancement in PostgreSQL's query optimization capabilities. By intelligently leveraging existing data order, it can deliver substantial performance improvements while reducing memory usage and eliminating expensive disk-based sorting operations.