PostgreSQL does not include a built-in job scheduler like some other database systems. This is where pgAgent becomes useful. pgAgent is a PostgreSQL extension that allows you to automate tasks such as data imports, backups, maintenance queries, or periodic updates.
In this guide, we will build a real working example step-by-step. The goal is to create a scheduled job that runs every two minutes and loads data from a CSV file into a PostgreSQL table.
Understanding the Workflow
Before jumping into queries, it is important to understand how pgAgent works internally.
A pgAgent workflow contains several components:
- Job – the main container that represents a scheduled task.
- Job Step – the actual SQL or script executed by the job.
- Schedule – defines when the job runs.
- pgAgent Daemon – an external process that continuously checks for jobs to execute.
- Logs – tables where pgAgent records execution details.
In this example, the flow will be:
- Create a table.
- Prepare a CSV file.
- Create a pgAgent job.
- Add a SQL step to import CSV data.
- Attach a schedule to run every two minutes.
- Start the pgAgent daemon.
- Monitor logs and verify execution.
Before diving in, create the pgagent extension:
CREATE EXTENSION pgagent;
Check the available and installed versions:
select * from pg_available_extensions where name = 'pgagent';
Result :
name | default_version | installed_version | comment
---------+-----------------+-------------------+----------------------------
pgagent | 4.2 | 4.2 | A PostgreSQL job scheduler
(1 row)
Explore the functions and tables provided by the extension:
\dx+ pgagent
Result :
Objects in extension "pgagent"
Object description
-------------------------------------------------------------------------------------------------------------------------------------------------
function pgagent.pga_exception_trigger()
function pgagent.pgagent_schema_version()
function pgagent.pga_is_leap_year(smallint)
function pgagent.pga_job_trigger()
function pgagent.pga_next_schedule(integer,timestamp with time zone,timestamp with time zone,boolean[],boolean[],boolean[],boolean[],boolean[])
function pgagent.pga_schedule_trigger()
table pgagent.pga_exception
table pgagent.pga_job
table pgagent.pga_jobagent
table pgagent.pga_jobclass
table pgagent.pga_joblog
table pgagent.pga_jobstep
table pgagent.pga_jobsteplog
table pgagent.pga_schedule
Step 1: Creating the Target Table
First, we create a table named employees. This table will store records loaded from the CSV file.
CREATE TABLE IF NOT EXISTS public.employees (
employee_id INTEGER PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
department VARCHAR(50),
salary NUMERIC(10,2),
hire_date DATE,
is_active BOOLEAN DEFAULT true,
loaded_at TIMESTAMPTZ DEFAULT now()
);
Purpose
- Defines a structured destination for CSV data.
- employee_id is a primary key to avoid duplicates.
- loaded_at helps track when the data was last updated.
Step 2: Preparing Initial Data
We insert a sample row manually to verify that the table structure works correctly.
INSERT INTO public.employees
(employee_id, first_name, last_name, email, department, salary, hire_date, is_active)
VALUES
(1,'John','Smith','john.smith@company.com','Engineering',85000.00,'2021-03-15',true)
ON CONFLICT (employee_id) DO NOTHING;
Purpose
- Ensures constraints and data types are correct.
- Validates that the table accepts inserts.
Step 3: Creating the CSV File
From the Linux terminal, create a CSV file under /tmp.
sudo -u postgres bash -c 'cat > /tmp/employees.csv << EOF
employee_id,first_name,last_name,email,department,salary,hire_date,is_active
1,John,Smith,john.smith@company.com,Engineering,85000.00,2021-03-15,true
2,Sarah,Johnson,sarah.j@company.com,Marketing,72000.00,2020-07-22,true
3,Michael,Brown,m.brown@company.com,Engineering,91000.00,2019-11-01,true
4,Emily,Davis,emily.d@company.com,HR,65000.00,2022-01-10,true
5,James,Wilson,james.w@company.com,Finance,78000.00,2021-09-05,true
6,Linda,Taylor,linda.t@company.com,Marketing,69000.00,2020-04-18,true
7,Robert,Anderson,r.anderson@company.com,Engineering,95000.00,2018-06-30,true
8,Karen,Thomas,karen.t@company.com,HR,63000.00,2023-02-14,true
9,Charles,Jackson,c.jackson@company.com,Finance,82000.00,2019-08-21,true
10,Barbara,White,b.white@company.com,Engineering,88000.00,2020-12-03,true
11,Tom,Harris,tom.h@company.com,Engineering,76000.00,2024-01-01,true
12,Anna,Clark,anna.c@company.com,HR,67000.00,2024-03-15,true
13,David,Lewis,david.l@company.com,Finance,80000.00,2024-06-20,true
EOF'
Verify the file:
sudo -u postgres ls -la /tmp/employees.csv
sudo -u postgres cat /tmp/employees.csv
Purpose
- COPY ... FROM runs on the database server, so pgAgent executes it server-side (unlike psql's client-side \copy).
- The CSV must therefore be readable by the PostgreSQL system user. That is why the file was created under /tmp as the postgres user.
Step 4: Creating the pgAgent Job
The job itself is just a row in pgagent.pga_job. Here jobjclid = 1 references one of the default job classes that the extension creates in pgagent.pga_jobclass.
INSERT INTO pgagent.pga_job (jobjclid, jobname, jobdesc, jobenabled)
VALUES (1, 'Import Employees CSV Every 2 Minutes', 'Loads employees.csv into public.employees', true);
What Happens Internally
- A new job entry is created.
- jobnextrun remains NULL because no schedule exists yet.
- A trigger inside pgAgent prepares metadata automatically.
select * from pgagent.pga_job;
Result :
-[ RECORD 1 ]+------------------------------------------
jobid | 1
jobjclid | 1
jobname | Import Employees CSV Every 2 Minutes
jobdesc | Loads employees.csv into public.employees
jobhostagent |
jobenabled | t
jobcreated | 2026-02-23 22:09:41.950536+05:30
jobchanged | 2026-02-23 22:09:41.950536+05:30
jobagentid |
jobnextrun |
joblastrun |
Step 5: Creating the Job Step
This is the core logic executed by the scheduler. In the insert below, jstkind = 's' marks the step as SQL (as opposed to 'b' for a batch/shell script), and jstonerror = 'f' tells pgAgent to mark the job failed if this step errors.
INSERT INTO pgagent.pga_jobstep (jstjobid, jstname, jstenabled, jstkind, jstdbname, jstonerror, jstcode)
VALUES (
1,
'COPY employees from CSV',
true,
's',
'postgres',
'f',
'
CREATE TEMP TABLE IF NOT EXISTS emp_staging (
employee_id INTEGER,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100),
department VARCHAR(50),
salary NUMERIC(10,2),
hire_date DATE,
is_active BOOLEAN
);
COPY emp_staging (employee_id, first_name, last_name, email, department, salary, hire_date, is_active)
FROM ''/tmp/employees.csv''
WITH (FORMAT csv, HEADER true);
INSERT INTO public.employees (employee_id, first_name, last_name, email, department, salary, hire_date, is_active)
SELECT employee_id, first_name, last_name, email, department, salary, hire_date, is_active
FROM emp_staging
ON CONFLICT (employee_id) DO NOTHING;
UPDATE public.employees SET loaded_at = now();
'
);
Purpose
The step performs multiple operations:
- Creates a temporary staging table to receive the raw CSV data.
- Loads CSV rows into staging using COPY.
- Inserts new rows into the main table.
- Updates timestamps.
Using a staging table improves safety and allows transformations before inserting.
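The staging-then-upsert pattern used above is not specific to pgAgent. As a rough, self-contained sketch (using Python's stdlib sqlite3 in place of PostgreSQL, and INSERT OR IGNORE as sqlite's analogue of ON CONFLICT ... DO NOTHING), the idea looks like this; the CSV payload and table names here are illustrative stand-ins:

```python
import csv
import io
import sqlite3

# Hypothetical CSV payload standing in for /tmp/employees.csv;
# note the duplicated employee_id 1.
CSV_DATA = """employee_id,first_name,last_name,email
1,John,Smith,john.smith@company.com
2,Sarah,Johnson,sarah.j@company.com
1,John,Smith,john.smith@company.com
"""

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    first_name TEXT, last_name TEXT, email TEXT)""")

# Staging table has no constraints, so the raw load never fails.
conn.execute("""CREATE TEMP TABLE emp_staging (
    employee_id INTEGER, first_name TEXT, last_name TEXT, email TEXT)""")

rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
conn.executemany(
    "INSERT INTO emp_staging VALUES "
    "(:employee_id, :first_name, :last_name, :email)",
    rows)

# sqlite analogue of INSERT ... ON CONFLICT (employee_id) DO NOTHING:
conn.execute("INSERT OR IGNORE INTO employees SELECT * FROM emp_staging")

count = conn.execute("SELECT count(*) FROM employees").fetchone()[0]
print(count)  # duplicate employee_id is skipped -> 2
```

Because the primary-key conflict is resolved at insert time, re-running the load with an unchanged CSV leaves the target table untouched, which is what makes the scheduled job idempotent.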
Step 6: Creating the Schedule
INSERT INTO pgagent.pga_schedule (
jscjobid, jscname, jscenabled, jscstart,
jscminutes, jschours, jscweekdays, jscmonthdays, jscmonths
)
VALUES (
1,
'Every 2 Minutes',
true,
now(),
ARRAY(SELECT (i % 2 = 0) FROM generate_series(0,59) i),
ARRAY(SELECT true FROM generate_series(0,23)),
ARRAY[true,true,true,true,true,true,true],
ARRAY(SELECT true FROM generate_series(0,31)),
ARRAY(SELECT true FROM generate_series(0,11))
);
When creating a schedule in pgAgent, the most important part is defining when a job should run. Instead of using cron-style text, pgAgent uses Boolean arrays to describe allowed time values. Each position inside an array represents a unit of time, and the value true or false determines whether execution is allowed at that position.
Each column corresponds to a different time component. pgAgent checks the current timestamp against these arrays to decide if the job should execute.
1. jscminutes
This array controls which minutes within an hour the job is allowed to run.
It always contains 60 Boolean values, representing minutes 0 through 59.
2. jschours
Defines which hours of the day the job can run.
It contains 24 Boolean values representing hours 0 through 23.
3. jscweekdays
Controls which days of the week are valid for execution.
This array contains 7 Boolean values representing:
Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday
4. jscmonthdays
Specifies allowed days within a month.
It contains 32 Boolean values: one for each day of the month (1 through 31) plus a final entry representing the last day of the month.
5. jscmonths
Defines allowed months of the year.
It contains 12 Boolean values representing January through December.
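To make the array semantics concrete, here is a rough Python analogue (not pgAgent's actual implementation; the index conventions, especially for month days, are simplified). It builds the same even-minute array as the ARRAY(SELECT (i % 2 = 0) ...) expression and checks a timestamp against all five arrays:

```python
from datetime import datetime

# Boolean arrays mirroring the pga_schedule columns (illustrative only).
jscminutes   = [i % 2 == 0 for i in range(60)]  # every even minute
jschours     = [True] * 24
jscweekdays  = [True] * 7                       # Sunday .. Saturday
jscmonthdays = [True] * 32                      # days 1-31 plus "last day"
jscmonths    = [True] * 12                      # January .. December

def schedule_matches(ts: datetime) -> bool:
    """True if ts falls on an allowed minute/hour/weekday/monthday/month."""
    # Python's weekday() is Monday=0..Sunday=6; shift so Sunday is index 0.
    weekday = (ts.weekday() + 1) % 7
    return (jscminutes[ts.minute]
            and jschours[ts.hour]
            and jscweekdays[weekday]
            and jscmonthdays[ts.day]      # day-of-month lookup, simplified
            and jscmonths[ts.month - 1])

even = schedule_matches(datetime(2026, 2, 23, 21, 12))  # minute 12 -> even
odd = schedule_matches(datetime(2026, 2, 23, 21, 13))   # minute 13 -> odd
print(even, odd)
```

With every other array all-true, only the minute array constrains execution, which is exactly how the "every 2 minutes" schedule works.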
Purpose
- Defines when the job executes.
- The minute array marks every even minute as TRUE, meaning the job runs every two minutes.
- After inserting the schedule, pgAgent automatically calculates jobnextrun.
select jobnextrun from pgagent.pga_job;
Example result:
jobnextrun | 2026-02-23 21:12:00+05:30
Step 7: Optional Exceptions
INSERT INTO pgagent.pga_exception (jexscid, jexdate, jextime)
VALUES (1, current_date + 1, '00:00:00');
Purpose
Allows blackout periods where the job will not execute.
Step 8: Starting the pgAgent Daemon
pgAgent is not a PostgreSQL background worker. It must run separately.
pgagent -f -l 2 "host=localhost port=5432 dbname=postgres user=postgres"
Here, -f keeps the daemon in the foreground and -l 2 raises the log level to DEBUG so each scheduler decision is printed.
What the Daemon Does
- Connects to PostgreSQL.
- Periodically checks for due jobs.
- Executes SQL steps.
- Writes logs into pgAgent tables.
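The daemon's behavior can be pictured as a simple poll loop. The sketch below is a toy, in-memory model of that loop (the dict keys echo pga_job/pga_joblog column names, but this is illustrative only, not pgAgent's source):

```python
import datetime as dt

# Toy stand-ins for pgagent.pga_job and pgagent.pga_joblog.
jobs = [{"jobid": 1, "jobenabled": True,
         "jobnextrun": dt.datetime(2026, 2, 23, 21, 12)}]
joblog = []

def run_step(job):
    # Real pgAgent executes the step's SQL over a pooled connection;
    # here we only record a success entry.
    joblog.append({"jlgjobid": job["jobid"], "jlgstatus": "s",
                   "jlgstart": job["jobnextrun"]})

def poll(now):
    """One scheduler pass: run every enabled job whose next run is due."""
    for job in jobs:
        if job["jobenabled"] and job["jobnextrun"] <= now:
            run_step(job)
            job["joblastrun"] = now
            # Real pgAgent recomputes this from the schedule arrays;
            # a fixed 2-minute bump approximates our schedule.
            job["jobnextrun"] = now + dt.timedelta(minutes=2)

poll(dt.datetime(2026, 2, 23, 21, 12, 2))
print(len(joblog))  # the due job ran once
```

Each real poll does the same three things: find due jobs, execute their steps, then update joblastrun/jobnextrun and append log rows.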
A successful run produces daemon output like:
Mon Feb 23 21:12:02 2026 DEBUG: Allocating new connection for the database with connection string: user=postgres dbname=postgres host=localhost port=5432...
Mon Feb 23 21:12:02 2026 DEBUG: Executing SQL step 1(part of job 1)
Mon Feb 23 21:12:02 2026 DEBUG: Parsing connection information...
Mon Feb 23 21:12:02 2026 DEBUG: user: postgres
Mon Feb 23 21:12:02 2026 DEBUG: dbname: postgres
Mon Feb 23 21:12:02 2026 DEBUG: host: localhost
Mon Feb 23 21:12:02 2026 DEBUG: port: 5432
Mon Feb 23 21:12:02 2026 DEBUG: Returning the connection to the connection pool: 'user=postgres dbname=postgres host=localhost port=5432'...
Mon Feb 23 21:12:02 2026 DEBUG: Parsing connection information...
Mon Feb 23 21:12:02 2026 DEBUG: user: postgres
Mon Feb 23 21:12:02 2026 DEBUG: dbname: postgres
Mon Feb 23 21:12:02 2026 DEBUG: host: localhost
Mon Feb 23 21:12:02 2026 DEBUG: port: 5432
Mon Feb 23 21:12:02 2026 DEBUG: Returning the connection to the connection pool: 'user=postgres dbname=postgres host=localhost port=5432'...
Mon Feb 23 21:12:02 2026 DEBUG: Completed job: 1
Mon Feb 23 21:12:02 2026 DEBUG: Destroying job thread for job 1
This confirms successful execution.
Step 9: Monitoring Execution
Check job history:
SELECT jlgid, jlgstatus, jlgstart
FROM pgagent.pga_joblog;
Result :
jlgid | jlgstatus | jlgstart
-------+-----------+----------------------------------
1 | s | 2026-02-23 22:22:03.822471+05:30
(1 row)
Check step results:
SELECT jslstatus, jslresult
FROM pgagent.pga_jobsteplog;
Example output:
jslstatus = s
jslresult = 13
Meaning:
- Status s indicates success.
- 13 rows were processed from the CSV.
Step 10: Verifying Data Growth
Count the rows before and after the scheduled run:
SELECT count(*) FROM public.employees;
Before the first scheduled run, the table holds only the single row inserted manually in Step 2 (count = 1). After the run:
count = 13
This confirms that pgAgent executed the COPY step automatically at the scheduled time.
You can append rows to the CSV (again as the postgres user, since the file is owned by it):
sudo -u postgres bash -c 'echo "14,Nina,Scott,nina.s@company.com,Marketing,71000.00,2025-01-10,true" >> /tmp/employees.csv'
On the next run, the count increases.
How pgAgent Internally Managed the Job
From the execution logs:
- A job agent registered in pga_jobagent.
- A job log entry recorded execution start time.
- A step log captured result count.
- joblastrun and jobnextrun updated automatically.
This shows that pgAgent is fully metadata-driven. Once configured, the scheduler works without manual triggers.
Key Advantages of Using pgAgent
- Fully SQL-based job creation.
- Flexible scheduling arrays.
- Detailed execution logs.
- Supports complex multi-step workflows.
- Works well for automated data imports.
In this guide, we created a complete pgAgent workflow that imports employee records from a CSV file into PostgreSQL every two minutes. The process involved creating a job, adding a SQL step, defining a schedule, and running the pgAgent daemon.
The working results demonstrated:
- Automatic scheduling.
- Successful step execution.
- Real-time data insertion.
- Log-based monitoring.
pgAgent becomes especially powerful when used for recurring ETL tasks, maintenance jobs, or periodic reporting processes. Once configured properly, it provides a reliable automation layer for PostgreSQL databases.