Jan 22, 2026

How PostgreSQL’s Query Planner Chooses (or Rejects) Your Index

“I created the index, but PostgreSQL still does a sequential scan.”

If you’ve worked with PostgreSQL, you’ve almost certainly faced this situation.

This behavior is not a bug. It’s a deliberate decision made by PostgreSQL’s cost-based query planner.

In this article, we’ll explore how PostgreSQL decides whether to use an index, why sequential scans are often chosen even when indexes exist, and how to correctly analyze and fix these situations.

How the PostgreSQL Query Planner Works

PostgreSQL uses a cost-based optimizer.

This means:

It generates multiple possible execution plans
Estimates the cost of each plan
Chooses the plan with the lowest estimated cost

PostgreSQL does not choose the fastest plan in reality—it chooses the cheapest plan based on available statistics.

Cost-Based Optimization: The Core Idea

Each execution plan is evaluated using a simplified cost model:

total_cost = startup_cost + (cpu_cost + io_cost) × estimated_rows

Some important cost parameters include:

Parameter	Purpose
seq_page_cost	Cost of sequential page reads
random_page_cost	Cost of random page reads
cpu_tuple_cost	CPU cost per row
cpu_operator_cost	Cost of evaluating expressions
effective_cache_size	Planner’s assumption about cache size

By default:

seq_page_cost    = 1.0
random_page_cost = 4.0

This means PostgreSQL assumes random I/O is 4× more expensive than sequential I/O.

Sequential Scan vs Index Scan

Let’s look at a simple query:

SELECT * FROM orders WHERE status = 'completed';

PostgreSQL evaluates two main options.

Option 1: Sequential Scan

Read table pages sequentially
Apply filter on each row
Very cache-friendly

Option 2: Index Scan

Traverse B-tree index
Follow index pointers to heap rows

Causes random heap access

If PostgreSQL estimates that many rows match the condition, a sequential scan is cheaper — even if an index exists.

The Role of Statistics (ANALYZE)

Planner decisions depend heavily on statistics collected by ANALYZE.

PostgreSQL tracks:

Total row count
Number of distinct values
Most Common Values (MCV)
Histograms
Column correlation

If statistics are outdated or inaccurate, the planner will make poor decisions.

ANALYZE orders;

For skewed data distributions, you may need higher statistics targets:

ALTER TABLE orders
ALTER COLUMN status SET STATISTICS 1000;
ANALYZE orders;

Correlation and Physical Data Order

Indexes store logical order, not physical row order.

PostgreSQL measures how well heap rows are ordered relative to an index using correlation.

SELECT correlation
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'id';

~1.0
Heap is ordered like the index

~0.0
Heap is randomly ordered

~-1.0
Reverse order

Low correlation means index scans cause random heap reads, which are expensive.

This is why even ORDER BY id LIMIT 50 can result in a sequential scan.

Common Myths About Index Usage

“If an index exists, PostgreSQL must use it”

Indexes are optional access paths, not instructions.

“Sequential scans are always bad”

Sequential scans are:

Fast
Cache-efficient
Often optimal for large result sets

“Disable sequential scans to force index usage”

SET enable_seqscan = off;

This breaks planner logic and causes unpredictable performance.

Never use this in production.

Real EXPLAIN ANALYZE Examples

Example 1: Highly Selective Predicate

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'user@example.com';
Index Scan using users_email_idx
Actual Rows: 1

Index scan is optimal.

Example 2: Low Selectivity Predicate

EXPLAIN ANALYZE
SELECT * FROM users WHERE is_active = true;
Seq Scan on users
Actual Rows: 950000

Sequential scan is cheaper.

When PostgreSQL Should Use Your Index (But Doesn’t)

This usually indicates bad statistics or poor index design.

Possible fixes:

Run ANALYZE
Increase statistics target
Use partial indexes

Example:

CREATE INDEX idx_orders_completed
ON orders(id)
WHERE status = 'completed';

Partial indexes reduce index size and improve selectivity.

Best Practices for Reliable Index Usage

Always use EXPLAIN ANALYZE, not just EXPLAIN
Compare estimated rows vs actual rows
Monitor correlation and data layout
Prefer index-only scans when possible
Design indexes for access patterns, not assumptions
Avoid forcing planner decisions

Final Takeaways

PostgreSQL does not “ignore” your index.

It simply decides:

“This index is more expensive than a sequential scan.”

Indexes alone do not guarantee performance.

Correct statistics, good data locality, and realistic cost assumptions do.

Indexes don’t make queries fast — accurate estimates do