Most PostgreSQL tuning guides tell you to increase shared_buffers first, set effective_cache_size to three-quarters of your RAM, and call it done. That advice is not wrong exactly, but it skips over the part that actually explains what these numbers mean and why PostgreSQL needs them in the first place. Once you understand what each setting does internally, you stop guessing at values and start making deliberate choices.
work_mem: The Per-Operation Memory Budget
The first thing to understand about work_mem is what the word "per" actually refers to. It is not per-query. It is per operation. One complex query can have multiple sort operations and multiple hash joins, and each one gets its own work_mem budget independently. A query with three sort steps and two hash joins running on four parallel workers could be using ten or more work_mem chunks simultaneously. On a system with fifty concurrent connections, the real memory consumption from work_mem alone can be many times larger than the setting implies.
The default memory budget is 4 megabytes, which is relatively less space by modern standards, and it was chosen conservatively for exactly this reason to prevent a single query from consuming all available memory on a busy server.
What work_mem controls in practice is the decision between doing work in memory or spilling to disk. Every sort operation starts by loading rows into a memory array. The executor tracks how much memory is in use and compares it against work_mem continuously. The moment the available memory drops below zero, then it means the data exceeds the budget. So, PostgreSQL sorts what it has in memory, writes that sorted chunk to a temporary file on disk, and starts loading the next batch. At the end, it merges all the chunks. This is an external merge sort, and it is significantly slower than an in-memory sort because it involves real disk I/O and multiple passes over the data.
Raising work_mem for a session before a heavy analytical query allows more data to stay in memory, which can turn a multi-second sort into a sub-second one. The practical pattern is to set a low global default and raise it selectively with SET work_mem = '256MB' for specific reporting queries or ETL sessions, rather than raising it globally and multiplying that memory usage across every connection on the server.
effective_cache_size: The Hint That Costs Nothing But Matters a Lot
Here is the most misunderstood thing about effective_cache_size: it does not allocate any memory. It does not reserve anything. It does not tell PostgreSQL how much memory to use. It is a number you tell the planner about your system so the planner can make better decisions, and if you get it wrong, the planner makes worse decisions.
Specifically, effective_cache_size is used in the planner's model for estimating how many disk pages a query will actually have to read from a physical disk versus finding it already cached in memory. PostgreSQL uses a mathematical model called the Mackert-Lohman formula that, given a table size and the fraction of it being scanned, estimates how many unique pages will need to be fetched. The effective cache size is an input to that formula which represents how many pages PostgreSQL plus the operating system's disk cache can hold together.
The default is around 4 gigabytes expressed in 8-kilobyte pages. If your server has 32 gigabytes of RAM and most of it is available for caching, you should set this to something like 24 gigabytes worth of pages. If you leave it at the default on a machine with much more memory, the planner thinks the cache is small, overestimates how many index pages it will have to read from disk, and sometimes prefers a sequential scan over an index scan by choosing the slower path because it believes the faster path will cost more I/O than it actually will.
Set it too high on a machine that genuinely has limited memory and the opposite problem occurs. The planner assumes pages will be cached that will not be, underestimates the real I/O cost of index scans, and chooses index paths that then hammer the disk harder than a sequential scan would have.
The right value is roughly your total RAM minus what the operating system, PostgreSQL processes, and other applications. Then, you can use the portion of memory that realistically stays warm with database pages.
hash_mem_multiplier: Why Hash Joins Need More Room Than Sorts
Sorts and hash joins both use work_mem, but they use memory differently. A sort can incrementally spill to disk one chunk at a time and then merge. A hash join cannot. It has to build a complete in-memory hash table from the smaller side of the join, and if that hash table overflows, it has to partition the entire dataset into batches and process each batch separately. Each batch requires its own in-memory table and its own temporary files. If the partitioning happens to be uneven ,like one partition much larger than the others, then the large partition must be re-partitioned recursively.
This means hash operations are more sensitive to memory pressure than sorts. A sort that gets 80 percent of what it needs performs almost as well as one that gets 100 percent. A hash join that gets 80 percent of what it needs may need to triple the number of passes over the data to compensate.
hash_mem_multiplier exists to address this difference without inflating work_mem for everything. The actual memory budget for a hash operation is work_mem multiplied by hash_mem_multiplier. With the default multiplier of 2.0 and a work_mem of 4 megabytes, each hash join gets 8 megabytes rather than 4. This lets you keep work_mem low for sorts and other operations while giving hash joins the extra headroom they need to avoid excessive batching.
The multiplier also covers something that pure tuple storage does not. When a hash join spills to disk, it maintains read and write buffers for each batch file. On queries with many batches, these buffers add up on top of the hash table itself. The extra memory from the multiplier provides room for this overhead without triggering additional spills.
How They Work Together
These three settings do not overlap, but they interact. work_mem sets the baseline for everything. hash_mem_multiplier raises that baseline specifically for hash operations. effective_cache_size operates in a completely separate domain , it influences what kind of plan the planner chooses before any execution memory is allocated.
A query that the planner expects to run as a hash join followed by a sort will, at execution time, use work_mem * hash_mem_multiplier for the hash phase and work_mem for the sort phase. The planner made its choice based in part on effective_cache_size. The executor then runs what the planner chose using the actual memory the GUC settings allow.
Getting these three numbers right is not about maximising them. It is about giving the planner and executor an accurate picture of the machine they are running on.