Apr 28, 2026

How to Understand Internal Background Workers in PostgreSQL Source Code

PostgreSQL is not powered by a single process. Behind every running database server, multiple internal processes work continuously to maintain performance, durability, replication, storage health, and query efficiency. These internal processes are known as background workers.

If you are exploring PostgreSQL source code, one of the most valuable files to study is src/backend/postmaster/bgworker.c. It shows how PostgreSQL registers workers, finds their entry-point functions, starts them, and lets them run independent tasks.

What Are Background Workers?

A background worker is a PostgreSQL server process created to perform internal work without depending on a user session.

These workers run in parallel with client connections and help the database continue operating efficiently while users execute queries.

Common responsibilities include:

Automatic cleanup
WAL writing
Checkpoint creation
Buffer flushing
Replication management
Parallel query support
Internal maintenance

Instead of assigning every responsibility to one process, PostgreSQL distributes work across multiple specialized processes.

Why PostgreSQL Uses Background Workers

Consider a busy production database:

Users are inserting orders
Reports are running
Tables are receiving updates
Dead tuples need cleanup
WAL must be flushed
Dirty buffers must be written
Replication must stay synchronized

If all of this happened in one process, performance would degrade quickly. Background workers solve this by dividing responsibilities into dedicated processes.

Benefits

Better concurrency
Improved performance
Cleaner architecture
Faster recovery
Reduced latency spikes
Easier maintenance
Extensible design

The Internal Worker Registry in bgworker.c

The file bgworker.c in the PostgreSQL source tree defines an array named InternalBGWorkers[]:

InternalBGWorkers[] =
{
    { .fn_name = "ApplyLauncherMain", .fn_addr = ApplyLauncherMain },
    { .fn_name = "ApplyWorkerMain", .fn_addr = ApplyWorkerMain },
    { .fn_name = "ParallelApplyWorkerMain", .fn_addr = ParallelApplyWorkerMain },
    { .fn_name = "ParallelWorkerMain", .fn_addr = ParallelWorkerMain },
    { .fn_name = "RepackWorkerMain", .fn_addr = RepackWorkerMain },
    { .fn_name = "SequenceSyncWorkerMain", .fn_addr = SequenceSyncWorkerMain },
    { .fn_name = "TableSyncWorkerMain", .fn_addr = TableSyncWorkerMain },
    { .fn_name = "DataChecksumsWorkerLauncherMain", .fn_addr = DataChecksumsWorkerLauncherMain },
    { .fn_name = "DataChecksumsWorkerMain", .fn_addr = DataChecksumsWorkerMain },
};

This array acts like an internal registry.

Each entry stores:

Worker function name as text
Actual C function address

When PostgreSQL needs to start a worker, it can locate the correct function using this mapping.

Note: This example is from PostgreSQL 19, which introduces the REPACK command. That is why the list includes a new background worker named RepackWorkerMain.

In PostgreSQL 18, the array has only five entries:

InternalBGWorkers[] =
{
	{
		"ParallelWorkerMain", ParallelWorkerMain
	},
	{
		"ApplyLauncherMain", ApplyLauncherMain
	},
	{
		"ApplyWorkerMain", ApplyWorkerMain
	},
	{
		"ParallelApplyWorkerMain", ParallelApplyWorkerMain
	},
	{
		"TablesyncWorkerMain", TablesyncWorkerMain
	}
};

bgworker.c also contains a function named LookupBackgroundWorkerFunction(). It finds and returns the correct entry-point function for a background worker so that PostgreSQL can run it in the worker process.

How LookupBackgroundWorkerFunction() Works

Flow

Receive the library name and the function name
If the library name is postgres, search InternalBGWorkers[]
Return matching internal function
Otherwise, load the function from an external library
The worker process calls the returned function as its entry point

This supports both:

Core PostgreSQL workers
Extension-provided workers
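
The lookup flow above can be sketched in plain C. This is a simplified model, not the code from bgworker.c: the demo_ names, the int-based entry-point signature, and the always-failing external loader are assumptions made so the example compiles on its own. It also returns NULL for an unknown internal name, where a real server would be expected to report an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified entry-point signature (an assumption for this sketch). */
typedef int (*worker_main_fn)(int arg);

/* Hypothetical entry points standing in for real worker main functions. */
static int demo_parallel_worker_main(int arg) { return arg + 1; }
static int demo_apply_launcher_main(int arg)  { return arg + 2; }

/* Name-to-address table, modeled on InternalBGWorkers[]. */
static const struct
{
    const char    *fn_name;
    worker_main_fn fn_addr;
} demo_internal_workers[] = {
    { .fn_name = "DemoParallelWorkerMain", .fn_addr = demo_parallel_worker_main },
    { .fn_name = "DemoApplyLauncherMain",  .fn_addr = demo_apply_launcher_main },
};

/* Hypothetical stand-in for loading a symbol from a shared library.
 * It always fails here so the sketch needs no real library. */
static worker_main_fn demo_load_external(const char *library, const char *function)
{
    (void) library;
    (void) function;
    return NULL;
}

/* Resolve (library, function) to an entry point, or NULL if not found. */
static worker_main_fn demo_lookup_worker_function(const char *library,
                                                  const char *function)
{
    assert(library != NULL && function != NULL);

    /* Core workers: the library name "postgres" selects the internal table. */
    if (strcmp(library, "postgres") == 0)
    {
        size_t n = sizeof(demo_internal_workers) / sizeof(demo_internal_workers[0]);

        for (size_t i = 0; i < n; i++)
        {
            if (strcmp(demo_internal_workers[i].fn_name, function) == 0)
                return demo_internal_workers[i].fn_addr;
        }
        return NULL;
    }

    /* Extension workers: fall back to the external loader. */
    return demo_load_external(library, function);
}
```

Storing the name alongside the address means a worker can be described by two strings, which remain meaningful in a freshly started process even if raw function addresses would not.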

Real Workers Seen in pg_stat_activity

Query:

SELECT * FROM pg_stat_activity WHERE backend_type <> 'client backend';

Result:

datid | datname |  pid  | leader_pid | usesysid | usename  | application_name | client_addr | client_hostname | client_port |          backend_start           | xact_start | query_start | state_change | wait_event_type |     wait_event      | state | backend_xid | backend_xmin | query_id | query |         backend_type
-------+---------+-------+------------+----------+----------+------------------+-------------+-----------------+-------------+----------------------------------+------------+-------------+--------------+-----------------+---------------------+-------+-------------+--------------+----------+-------+------------------------------
       |         | 21480 |            |       10 | cybrosys |                  |             |                 |             | 2026-04-23 20:13:45.167556+05:30 |            |             |              | Activity        | LogicalLauncherMain |       |             |              |          |       | logical replication launcher
       |         | 21478 |            |          |          |                  |             |                 |             | 2026-04-23 20:13:45.167142+05:30 |            |             |              | Activity        | AutovacuumMain      |       |             |              |          |       | autovacuum launcher
       |         | 21472 |            |          |          |                  |             |                 |             | 2026-04-23 20:13:44.858805+05:30 |            |             |              | Activity        | IoWorkerMain        |       |             |              |          |       | io worker
       |         | 21473 |            |          |          |                  |             |                 |             | 2026-04-23 20:13:44.858993+05:30 |            |             |              | Activity        | IoWorkerMain        |       |             |              |          |       | io worker
       |         | 21474 |            |          |          |                  |             |                 |             | 2026-04-23 20:13:44.859189+05:30 |            |             |              | Activity        | CheckpointerMain    |       |             |              |          |       | checkpointer
       |         | 21477 |            |          |          |                  |             |                 |             | 2026-04-23 20:13:45.166859+05:30 |            |             |              | Activity        | WalWriterMain       |       |             |              |          |       | walwriter
       |         | 21475 |            |          |          |                  |             |                 |             | 2026-04-23 20:13:44.859392+05:30 |            |             |              | Activity        | BgwriterMain        |       |             |              |          |       | background writer
(7 rows)

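These are real background processes currently running inside PostgreSQL. The names used as headings below come from the wait_event column. Strictly speaking, only the logical replication launcher in this output is registered through the bgworker.c machinery. The checkpointer, background writer, WAL writer, autovacuum launcher, and I/O workers are started directly by the postmaster, but they serve the same purpose of doing internal work outside any user session.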

Now, let us understand each one in detail.

1. LogicalLauncherMain

backend_type = logical replication launcher

This process manages logical replication workers.

It usually does not apply row changes directly. Instead, it watches subscriptions and starts the required workers.

What It Monitors

Enabled subscriptions
Worker failures
Databases needing replication
Initial sync requirements
Restart needs after errors

Typical Flow

Wake up
Read subscription metadata
Launch apply workers
Monitor status
Sleep and repeat
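
One pass of this loop can be modeled in a few lines of C. This is only a sketch of the decision the launcher makes. The demo_subscription struct and demo_launcher_pass() are hypothetical names, and setting a flag stands in for actually starting an apply worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one subscription. */
typedef struct
{
    const char *name;
    bool        enabled;         /* is the subscription enabled? */
    bool        worker_running;  /* does it already have an apply worker? */
} demo_subscription;

/* One launcher pass: start a worker for every enabled subscription
 * that does not have one yet. Returns how many workers were started. */
static int demo_launcher_pass(demo_subscription *subs, size_t count)
{
    int started = 0;

    assert(subs != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
    {
        if (!subs[i].enabled || subs[i].worker_running)
            continue;                    /* nothing to do for this one */

        subs[i].worker_running = true;   /* stand-in for launching a worker */
        started++;
    }
    return started;
}
```

Calling the function a second time on the same array starts nothing, which mirrors why the launcher can safely wake up and repeat the check on a timer.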

Example

If one PostgreSQL server subscribes to another:

Launcher detects subscription
Starts an apply worker
Apply worker receives changes
Subscriber stays updated

Without this launcher, replication would require manual worker control.

2. AutovacuumMain

backend_type = autovacuum launcher

This process manages automatic vacuum and analyze workers.

Why Vacuum Is Needed

When rows are updated or deleted, old row versions remain.

Without cleanup:

Tables grow larger
Indexes bloat
Queries slow down
Storage increases
Wraparound danger rises

AutovacuumMain checks statistics and launches workers for tables needing maintenance.

Worker Jobs

Remove dead tuples
Freeze old transaction IDs
Update visibility maps
Refresh planner statistics

Example

A frequently updated orders table creates many dead rows. Autovacuum detects thresholds and cleans them automatically.

Autovacuum is essential for long-term PostgreSQL health.
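
The threshold check can be illustrated with the documented rule that a table becomes a vacuum candidate once its dead tuples exceed autovacuum_vacuum_threshold plus autovacuum_vacuum_scale_factor times the table's row count. The sketch below assumes only that rule. The demo_ names are hypothetical, and other triggers (insert thresholds, freeze age, the maximum threshold cap) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* The two settings this sketch models. */
typedef struct
{
    double base_threshold;  /* autovacuum_vacuum_threshold */
    double scale_factor;    /* autovacuum_vacuum_scale_factor */
} demo_vacuum_settings;

/* Dead-tuple count a table must exceed before it is vacuumed. */
static double demo_vacuum_threshold(demo_vacuum_settings s, double reltuples)
{
    assert(reltuples >= 0);
    return s.base_threshold + s.scale_factor * reltuples;
}

/* True when the table has accumulated enough dead tuples. */
static bool demo_needs_vacuum(demo_vacuum_settings s,
                              double reltuples, double dead_tuples)
{
    return dead_tuples > demo_vacuum_threshold(s, reltuples);
}
```

With the default values of 50 and 0.2, an orders table of 1,000,000 rows would cross the threshold at roughly 200,050 dead tuples, which is why large, busy tables often get a lower per-table scale factor.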

To list the autovacuum-related configuration parameters, type show autovacuum in psql and press Tab to trigger completion:

postgres=# show autovacuum
autovacuum                                autovacuum_max_parallel_workers           autovacuum_vacuum_cost_limit              autovacuum_vacuum_score_weight
autovacuum_analyze_scale_factor           autovacuum_max_workers                    autovacuum_vacuum_insert_scale_factor     autovacuum_vacuum_threshold
autovacuum_analyze_score_weight           autovacuum_multixact_freeze_max_age       autovacuum_vacuum_insert_score_weight     autovacuum_worker_slots
autovacuum_analyze_threshold              autovacuum_multixact_freeze_score_weight  autovacuum_vacuum_insert_threshold        autovacuum_work_mem
autovacuum_freeze_max_age                 autovacuum_naptime                        autovacuum_vacuum_max_threshold           
autovacuum_freeze_score_weight            autovacuum_vacuum_cost_delay              autovacuum_vacuum_scale_factor

3. IoWorkerMain

backend_type = io worker

I/O workers, introduced in PostgreSQL 18, perform asynchronous I/O on behalf of backends when io_method is set to worker.

Why It Exists

Disk operations are slower than memory operations. Better I/O coordination improves performance.

Possible Tasks

Prefetch support
Asynchronous reads
Background storage operations
Reduced backend waiting
Improved page access patterns

Example

A large scan may need many blocks. Internal I/O workers can help prepare data access more efficiently.

Benefit

Lower latency
Better throughput
Smoother performance

4. CheckpointerMain

backend_type = checkpointer

Writes modified pages from shared buffers to disk and creates checkpoints.

What Is a Checkpoint?

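A checkpoint is a point in the WAL sequence at which PostgreSQL guarantees that all data pages modified before it have been flushed to disk. Crash recovery can then start replaying WAL from the most recent checkpoint instead of from the beginning.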

Without checkpoints:

Crash recovery takes longer
More WAL replay is needed
Too many dirty pages accumulate

What It Does

Writes dirty buffers
Updates checkpoint records
Controls write bursts
Maintains recovery positions

Benefit

Faster restart after crash
Controlled write behavior
Better durability management

5. WalWriterMain

backend_type = walwriter

Flushes Write-Ahead Log data to disk.

What is WAL? (Write Ahead Log)

Before table data is permanently updated, PostgreSQL records the change in WAL.

If a crash occurs, WAL is replayed to recover committed work.

Why a Dedicated WAL Writer Exists

If every backend flushed WAL independently:

More contention occurs
More fsync pressure occurs
Commits may slow down

What It Does

Writes WAL buffers
Flushes WAL files
Supports commit durability
Reduces backend overhead

Benefit

Faster commits
Strong crash safety
Better throughput

6. BgwriterMain

backend_type = background writer

Writes dirty shared buffers gradually before client backends are forced to do it.

Why It Exists

When a backend needs a free buffer, writing a dirty page itself can slow queries.

The background writer reduces that pressure.

What It Does

Scans buffers
Writes dirty pages
Makes clean buffers available for reuse
Smooths memory pressure

Benefit

Lower latency spikes
Better cache reuse
Smoother runtime performance

Do Background Workers Use Separate Transactions?

A common question is whether background workers have their own transactions like normal sessions, using:

BEGIN;

COMMIT;

Yes. Many background workers can create and use their own transactions when needed.

However, they usually do this through PostgreSQL's internal C APIs rather than typing SQL statements directly.

Internal Transaction Functions

In source code, workers often use functions such as:

StartTransactionCommand();
CommitTransactionCommand();
AbortCurrentTransaction();

These are the internal equivalents of:

BEGIN;
COMMIT;
ROLLBACK;

Why Separate Transactions Matter

Each background worker is its own PostgreSQL process, so it has its own separate:

Transaction state
Snapshots
Locks
Memory contexts
Error handling
Resource ownership

That means a worker transaction is independent from your client session transaction.

Example: Autovacuum Worker

An autovacuum worker may:

Start transaction
Scan a table
Remove dead tuples
Commit
Move to the next task

Important Note

Not every worker stays inside a transaction all the time.

Many workers spend time:

Sleeping
Waiting for signals
Monitoring system state
Starting a transaction only when work begins

So transactions are used when needed.

How Workers Start Inside PostgreSQL

General lifecycle:

Postmaster starts
Registers workers
Needs a worker
Forks new process
New process looks up the function in bgworker.c
Runs worker main function
Performs task
Repeats or exits
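
The fork-and-run part of this lifecycle can be sketched with plain POSIX calls. In the PostgreSQL sources the entry-point lookup happens inside the newly started worker process, so this sketch has the child run the function. Everything named demo_ is hypothetical, and the parent blocks in waitpid() only to keep the example short. A real postmaster keeps running and is notified when children exit.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Simplified entry-point signature (an assumption for this sketch). */
typedef int (*worker_main_fn)(int arg);

/* Hypothetical worker body; its return value becomes the exit code. */
static int demo_worker_main(int arg)
{
    return (arg == 7) ? 0 : 1;
}

/* Fork a child, run the entry point there, and return the child's
 * exit code to the caller (or -1 on failure). */
static int demo_launch_worker(worker_main_fn entry, int arg)
{
    pid_t pid;
    int   status = 0;

    assert(entry != NULL);

    pid = fork();
    if (pid < 0)
        return -1;               /* fork failed */

    if (pid == 0)
    {
        /* Child: this process is now the worker. */
        _exit(entry(arg));
    }

    /* Parent: wait for the worker to finish (simplification). */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the worker is a separate process, a crash or error inside demo_worker_main() cannot corrupt the parent's memory. The parent only sees an exit status.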

Background workers are the silent engine behind PostgreSQL reliability and speed. They clean tables, flush WAL, write buffers, coordinate replication, and perform internal maintenance.

The InternalBGWorkers[] array is not just a list of functions. It is a map of PostgreSQL responsibilities.

When you also understand that many workers run with their own separate transaction contexts, you begin to see PostgreSQL as a highly organized multi-process database engine rather than a simple query server.
