As databases grow, frequent updates and deletions can lead to table and index bloat, consuming unnecessary disk space and slowing down queries. While PostgreSQL’s built-in VACUUM and AUTOVACUUM help maintain performance, they don’t always reclaim space efficiently, and VACUUM FULL requires an exclusive lock, causing downtime.
To solve this, PostgreSQL provides a powerful extension called pg_repack, which reorganizes tables and indexes without blocking read or write operations. It’s the ideal choice for live production systems that cannot afford downtime.
Installing pg_repack
On PostgreSQL 18 or later, you can install pg_repack easily using APT:
sudo apt install postgresql-18-repack
Alternatively, you can build it manually from the source:
git clone https://github.com/reorg/pg_repack.git
cd pg_repack
Make
sudo make install
Once installed, enable the extension in your database:
CREATE EXTENSION pg_repack;
Verify that it’s available:
SELECT * FROM pg_available_extensions WHERE name = 'pg_repack';
Output:
name | default_version | installed_version | comment
-----------+-----------------+-------------------+--------------------------------------------------------------
pg_repack | 1.5 | 1.5 | Reorganize tables without blocking writes
Schema and Functions Installed by pg_repack
When you create the extension, PostgreSQL adds a new schema named repack containing internal functions, views, and types. You can explore them with:
\dx+ pg_repack
Here’s an excerpt showing some of the key functions installed:
function repack.create_log_table(oid)
function repack.create_table(oid,name)
function repack.disable_autovacuum(regclass)
function repack.get_create_trigger(oid,oid)
function repack.repack_swap(oid)
These internal functions handle operations like:
- Creating temporary log tables
- Copying live data
- Applying triggers to sync changes
- Disabling autovacuum during rebuilds
- Swapping the new and old tables seamlessly
Running pg_repack in Action
Once installed, you can run pg_repack directly from your terminal to rebuild tables or indexes.
Example 1 – Repacking a Specific Table
Here, the employees table is repacked in the postgres database running on port 5433:
postgres@cybrosys:/home/cybrosys/pg_repack$ pg_repack -d postgres -p 5433
INFO: repacking table "public.employees"
This command performs a full online rebuild of the employees table. During this process, PostgreSQL remains accessible — reads and writes continue as usual.
Example 2 – Repacking a Table in Another Database
Suppose your application database is averigo_18 and you want to repack the res_partner table:
postgres@cybrosys:/home/cybrosys/pg_repack$ pg_repack -d postgres -t res_partner -p 5433
INFO: repacking table "public.res_partner"
This command only reorganizes the res_partner table, reclaiming space and optimizing performance while keeping it online.
Exploring pg_repack Command Options
To understand all available commands, run:
postgres@cybrosys:/home/cybrosys/pg_repack$ pg_repack --help
It displays a detailed list of options:
pg_repack re-organizes a PostgreSQL database.
Usage:
pg_repack [OPTION]... [DBNAME]
Options:
-a, --all repack all databases
-t, --table=TABLE repack specific table only
-I, --parent-table=TABLE repack parent table and inheritors
-c, --schema=SCHEMA repack tables in specific schema
-s, --tablespace=TBLSPC move repacked tables to a new tablespace
-S, --moveidx move repacked indexes too
-o, --order-by=COLUMNS order by custom columns
-n, --no-order perform vacuum full instead of cluster
-N, --dry-run print what would be repacked
-j, --jobs=NUM use multiple parallel jobs
-i, --index=INDEX move a specific index
-x, --only-indexes repack only indexes
-T, --wait-timeout=SECS set timeout for conflicts
-Z, --no-analyze skip analyze at end
--apply-count batch tuples applied per transaction
--switch-threshold threshold before swapping tables
This shows the flexibility of pg_repack, you can repack all databases, specific tables, schemas, or even individual indexes.
Example of Internal Function Outputs
Let’s explore some of the internal functions provided by the pg_repack extension and observe their real outputs from the database `averigo_18`.
SELECT repack.version();
Output:
version
-----------------
pg_repack 1.5.2
(1 row)
SELECT repack.version_sql();
Output:
version_sql
-----------------
pg_repack 1.5.2
(1 row)
SELECT repack.get_table_and_inheritors('public.res_partner');Output:
get_table_and_inheritors
--------------------------
{3556394}
(1 row)
SELECT repack.get_index_columns('public.res_partner_pkey'::regclass);Output:
get_index_columns
-------------------
'id'
(1 row)
SELECT repack.get_storage_param('public.res_partner'::regclass);Output:
get_storage_param
-------------------
oids = false
(1 row)
These examples illustrate how the `pg_repack` internal functions expose valuable insights about the database’s structure—including version details, table identifiers, primary key columns, and storage parameters.
Such visibility helps database administrators understand how the extension interacts with internal table metadata during the repacking process.
When to Use pg_repack
pg_repack should be used when:
- Tables or indexes have grown excessively due to frequent updates/deletes.
- You need to reclaim storage without service downtime.
- VACUUM FULL is too disruptive for production.
- Query performance degrades due to table fragmentation.
For instance, after months of operation in odoo, models like mail_message in an Odoo database might grow large due to transactional updates. Running pg_repack reclaims that unused space efficiently.
Performance Tips
- Ensure there’s enough disk space: pg_repack temporarily duplicates the target table during rebuild.
- The table must have a primary key or unique index.
- Run during low-traffic periods to minimize replication lag or I/O overhead.
- Regularly schedule repacking for high-traffic transactional tables.
Conclusion
pg_repack is one of the most valuable PostgreSQL extensions for live systems that handle continuous writes. It efficiently reorganizes tables and indexes, eliminates bloat, and enhances performance—all without interrupting database activity.
Whether you’re managing an Odoo production database, a high-traffic application, or any PostgreSQL-driven service, integrating pg_repack into your maintenance cycle ensures long-term stability and performance optimization.