How Page Numbers, Tuples, and Query Limits Work in PostgreSQL OFFSET

PostgreSQL is a powerful and sophisticated relational database, offering robust features for both developers and DBAs. But once you dive beneath the surface, into internals like storage pages and tuple pointers, you’ll start encountering terms like offset, page, and ctid.

While “offset” might seem like a simple term, it has different meanings depending on context in PostgreSQL. In this post, we’ll clarify what “offset” refers to across different layers of the system, storage, tuple addressing, and query result pagination.

1. Offset as Page Number (Block-Level Offset)

PostgreSQL stores all data in fixed-size blocks (pages), typically 8KB in size. These are often referred to as heap pages for tables, or index pages for B-tree/GiST/GIN indexes.

Each block has a block number (starting from 0), and this is sometimes referred to as an offset in storage-level access.

Example:

Using the pageinspect extension, you can access specific pages (blocks) like so:

-- Load the extension
CREATE EXTENSION IF NOT EXISTS pageinspect;
-- Fetch page 100 from 'my_table'
SELECT * FROM heap_page_items(get_raw_page('my_table', 100));

Here, 100 is the page offset, i.e., the 100th 8KB block in the relation file.

Why it matters:

  • Helps in low-level debugging
  • Useful in forensics, e.g., data recovery or page corruption analysis
  • Key in performance optimizations, like understanding bloat

2. Offset in Tuple Identifiers (ctid)

Each row (tuple) in PostgreSQL is stored in a page and can be uniquely identified by a TID (tuple identifier), which consists of:

(block_number, offset_number)

This is exposed via the system column ctid.

Example:

SELECT ctid, * FROM my_table LIMIT 5;

Output:

 ctid   | id | name
--------+----+------
 (42,3) | 1  | Alice
 (42,4) | 2  | Bob

42: Block number (page offset)

3: Tuple offset within the page

Pages maintain a line pointer array, and the offset tells you which tuple slot to look at.

Use Cases:

Precise row identification (e.g., for row-level locking)

Low-level tools (e.g., logical decoding, page inspection)

Understanding tuple versioning (MVCC)

3. Offset in SQL Queries (OFFSET N)

At a SQL layer, OFFSET is a clause used to skip rows in a result set, usually for pagination.

SELECT * FROM my_table ORDER BY id OFFSET 100 LIMIT 10;

This skips the first 100 rows and returns the next 10.

Do you get it?

This has nothing to do with page numbers or tuple positions in storage. It's purely a logical row offset in the result stream.

Performance Tip:

  • Using OFFSET with high values can be inefficient, as PostgreSQL still has to scan and discard the skipped rows.
  • Prefer using keyset pagination or indexed values (WHERE id > ?) when possible.

How PostgreSQL Internally Navigates Offsets

Internally, PostgreSQL uses ItemPointerData structures to keep track of each tuple's exact location. This structure combines the block number and line pointer index into a single identifier, which is then used by various subsystems such as MVCC (Multi-Version Concurrency Control), locking, and vacuuming.

Here’s a simplified look at how that structure is defined in C source code:

typedef struct ItemPointerData
{
    BlockIdData ip_blkid;       // Block (page) number
    OffsetNumber ip_posid;      // Line pointer index (tuple offset)
} ItemPointerData;

This means that every time PostgreSQL wants to retrieve or modify a specific tuple, it first uses the block number to load the correct 8KB page from disk or memory, then uses the line pointer (offset number) to find the tuple inside that page.

How It’s Used in Practice

Index lookups (e.g., using a B-tree) eventually resolve to a TID (ctid) that tells PostgreSQL exactly which tuple to fetch.

Vacuum processes use these offsets to identify and clean up dead tuples.

Updates and Deletes reference ctid to operate on the correct physical row, since PostgreSQL does not overwrite tuples in place.

Real-World Implications

Because PostgreSQL depends so heavily on accurate offset tracking:

  • Bloat can occur if old tuples (with old offsets) are not vacuumed.
  • Page-level locking is fine-tuned to avoid conflicts when multiple operations target different offsets in the same page.
  • Logical replication tracks changes at the TID level, enabling precise change streams.

Summary: Same Word, Different Meaning

ContextMeaning of "Offset"Example
Storage-level page accessBlock number (page #)get_raw_page('table', 100)
Tuple pointer (ctid)Tuple offset in page
ctid = (42,3)
SQL query (OFFSET)Row offset in result setOFFSET 100 LIMIT 10

Understanding how PostgreSQL uses the word “offset” at different layers, from low-level pages to high-level queries, helps you become a more effective developer or DBA. Whether you're writing fast SQL queries or digging into page internals, knowing where and how data is stored and accessed gives you the edge.

whatsapp_icon
location

Calicut

Cybrosys Technologies Pvt. Ltd.
Neospace, Kinfra Techno Park
Kakkancherry, Calicut
Kerala, India - 673635

location

Kochi

Cybrosys Technologies Pvt. Ltd.
1st Floor, Thapasya Building,
Infopark, Kakkanad,
Kochi, India - 682030.

location

Bangalore

Cybrosys Techno Solutions
The Estate, 8th Floor,
Dickenson Road,
Bangalore, India - 560042

Send Us A Message