How to Use PostgreSQL HypoPG Extension to Test Indexes Without Creating Them

PostgreSQL is a powerful and flexible relational database system widely used for both small-scale and enterprise applications. One of the most effective ways to improve query performance in PostgreSQL is by using indexes. Indexes help the database engine quickly locate rows that match certain conditions, which can drastically reduce query execution time.

However, creating indexes, especially on large tables, comes with its own challenges:

  • Time-consuming: Building an index on a big table can take several minutes or even hours.
  • Resource-heavy: Index creation uses CPU, memory, and disk I/O, which can impact performance.
  • Potentially unnecessary: You might create an index only to find that it doesn’t actually improve performance, or worse, it slows things down due to added maintenance overhead.

So, how do you test the performance impact of an index without the risk of actually creating it?

That’s where hypopg comes in.

hypopg is a PostgreSQL extension that lets you create hypothetical (virtual) indexes. These are indexes that exist only in the PostgreSQL query planner’s imagination. You can simulate an index, see how it affects the query plan, and then drop it instantly, without modifying your actual database or consuming disk space.

This is incredibly useful for:

  • Query optimization testing in staging or development environments.
  • Database tuning without downtime.
  • Experimenting with different indexing strategies.

Avoiding unnecessary index creation in production.

Step 1: Set Up the Environment

Before using hypopg, ensure it’s installed. Check if the extension is already installed in your database

 Select * from pg_available_extensions

You can typically install it using:

sudo apt install postgresql-17-hypopg

Then enable it in your PostgreSQL database:

CREATE EXTENSION hypopg;

Confirm it’s enabled:

\dx

Step 2: Create a Sample Table and Insert Bulk Rows

Let’s create a demo table sales_data and insert 100,000 rows for testing:

CREATE TABLE sales_data (
    id SERIAL PRIMARY KEY,
    customer_id INT,
    product_id INT,
    sale_date DATE,
    total_amount NUMERIC
);
-- Insert bulk data
INSERT INTO sales_data (customer_id, product_id, sale_date, total_amount)
SELECT
    (random() * 1000)::INT,
    (random() * 200)::INT,
    CURRENT_DATE - (random() * 365)::INT,
    (random() * 1000)::NUMERIC
FROM generate_series(1, 100000);

Step 3: Start Using hypopg

Now, let’s begin working with the actual functions in the hypopg extension.

hypopg_create_index(sql_order TEXT)

This function simulates an index creation. It does not create the index physically, but makes PostgreSQL believe it exists during query planning.

Before creating an index, just analyse the query execution from this table

-- Before index
EXPLAIN SELECT * FROM sales_data WHERE customer_id = 100;

You get results like this.

hypopg_test_database=# EXPLAIN ANALYZE SELECT * FROM sales_data WHERE customer_id = 100;
                                                QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Seq Scan on sales_data  (cost=0.00..1986.00 rows=98 width=27) (actual time=1.126..9.268 rows=102 loops=1)
   Filter: (customer_id = 100)
   Rows Removed by Filter: 99898
 Planning Time: 0.044 ms
 Execution Time: 9.413 ms
(5 rows)

Create the virtual index by using the function named hypopg_create_index()

-- Simulate index on customer_id
SELECT * FROM hypopg_create_index('CREATE INDEX ON sales_data(customer_id)');

This returns a result like this.

indexrelid |       indexname        
------------+-------------------------
     123456 | <hypopg>btree_sales_data_customer_id

Compare Query Plans

Run EXPLAIN before and after creating the hypothetical index:

-- Before index
EXPLAIN SELECT * FROM sales_data WHERE customer_id = 100;
hypopg_test_database=# EXPLAIN SELECT * FROM sales_data WHERE customer_id = 100;
                          QUERY PLAN                           
---------------------------------------------------------------
 Seq Scan on sales_data  (cost=0.00..1986.00 rows=98 width=27)
   Filter: (customer_id = 100)
-- After index
SELECT * FROM hypopg_create_index('CREATE INDEX ON sales_data(customer_id)');
EXPLAIN SELECT * FROM sales_data WHERE customer_id = 100;

After creating the index, you geta result like this.

hypopg_test_database=# EXPLAIN SELECT * FROM sales_data WHERE customer_id = 100;
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on sales_data  (cost=4.80..276.45 rows=98 width=27)
   Recheck Cond: (customer_id = 100)
   ->  Bitmap Index Scan on "<13581>btree_sales_data_customer_id"  (cost=0.00..4.78 rows=98 width=0)
         Index Cond: (customer_id = 100)
(4 rows)

You should now see a plan using an Index Scan instead of a Seq Scan.

hypopg() — List All Hypothetical Indexes

This function returns a list of all indexes currently being simulated:

SELECT * FROM hypopg();

You get results like this.

hypopg_test_database=# select hypopg();
                                 hypopg                                 
------------------------------------------------------------------------
 (<13581>btree_sales_data_customer_id,13581,671663,1,f,2,0,1978,,,,403)
(1 row)

This is a composite record with fields defined in the hypopg() function's return type:

  • indexname: The name of the hypothetical index, such as btree_sales_data_customer_id. This helps you identify the simulated index.
  • indexrelid: A fake OID (Object Identifier) assigned to the hypothetical index. You can use this to reference or drop the index using functions like hypopg_drop_index().
  • indrelid: The OID of the base table on which the index is defined. This links the hypothetical index to its corresponding real table.
  • indnatts: The number of columns included in the index. For example, if this value is 1, it means the index covers a single column like customer_id.
  • indisunique: A boolean flag (t for true, f for false) indicating whether the index is defined as UNIQUE.
  • indkey: Shows the internal PostgreSQL column number(s) used in the index. These correspond to the columns in the table, referenced by position.
  • indnkeyatts: The number of key columns used in the index. This is especially relevant for expression indexes.
  • indclass: The OID of the operator class used for indexing, such as int4_ops for a standard B-tree index on an integer column.
  • indcollation: Displays the OID of the collation used for the index, if applicable. If no collation is specified, it may be null.
  • indoption: Represents index-specific options, such as sort direction (ascending/descending). It's often null for basic indexes.
  • indexprs: If the index is based on an expression rather than direct column references, this field contains the internal representation of that expression.
  • indpred: Holds the predicate clause if the index is a partial index, i.e., it applies only to rows that satisfy a specific condition.

Use this function to remove a single hypothetical index using its indexrelid:

-- Get the indexrelid
SELECT indexrelid FROM hypopg();
-- Drop the index
SELECT hypopg_drop_index(123456); -- replace with actual ID

Then you get a result like this

hypopg_test_database=# SELECT hypopg_drop_index(13581); -- replace with actual ID
-[ RECORD 1 ]-----+--
hypopg_drop_index | t

hypopg_reset() — Drop All Hypothetical Indexes

Remove all hypothetical indexes in one command:

SELECT hypopg_reset();

Here's the summary of hypopg functions in unordered list format:

  • hypopg_create_index() – Simulate a new index
  • hypopg() – List all hypothetical indexes
  • hypopg_get_indexdef() – View index definition
  • hypopg_drop_index() – Drop one hypothetical index
  • hypopg_reset() / hypopg_reset_index() – Drop all hypothetical indexes
  • hypopg_hide_index() / hypopg_unhide_index() – Hide/unhide real indexes
  • hypopg_hidden_indexes() – View hidden indexes
  • hypopg_relation_size() – Estimate size of simulated index

You can explore each function by using this short command in the psql terminal.

\df+

The hypopg extension is a powerful diagnostic tool for PostgreSQL users who want to make informed indexing decisions without the overhead of physically creating indexes. Its ability to simulate indexes in-memory opens the door to advanced query tuning techniques that are safe, fast, and cost-effective.

Real-World Use Cases Where hypopg Shines:

  • Query Optimization: Quickly test whether an index would improve the performance of complex joins or WHERE clause filters before actually creating it.
  • Index Strategy Planning: Experiment with various single-column and composite indexes to identify the most efficient combinations.
  • Pre-Deployment Testing: Simulate indexing scenarios in staging environments to fine-tune performance before pushing to production.
  • Avoiding Index Bloat: Prevent unnecessary or unused indexes from cluttering your database by validating their impact first.
  • Index Rebuilding Decisions: Check if rebuilding or modifying an existing index (e.g., changing column order) would lead to better plans.
  • Performance Auditing: Work alongside tools like pg_stat_statements to identify slow queries and validate if a hypothetical index could help

Unlike traditional trial-and-error indexing, hypopg helps you plan with precision, ensuring your indexes are actually worth the cost in storage and maintenance

whatsapp_icon
location

Calicut

Cybrosys Technologies Pvt. Ltd.
Neospace, Kinfra Techno Park
Kakkancherry, Calicut
Kerala, India - 673635

location

Kochi

Cybrosys Technologies Pvt. Ltd.
1st Floor, Thapasya Building,
Infopark, Kakkanad,
Kochi, India - 682030.

location

Bangalore

Cybrosys Techno Solutions
The Estate, 8th Floor,
Dickenson Road,
Bangalore, India - 560042

Send Us A Message