Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Mail Systems
Eclipse Documentation

How To Guides
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Problem Solutions




Chapter 50. Database Physical Storage

Table of Contents
50.1. Database File Layout
50.2. TOAST
50.3. Database Page Layout

This chapter provides an overview of the physical storage format used by PostgreSQL databases.

50.1. Database File Layout

This section describes the storage format at the level of files and directories.

All the data needed for a database cluster is stored within the cluster's data directory, commonly referred to as PGDATA (after the name of the environment variable that can be used to define it). A common location for PGDATA is /var/lib/pgsql/data. Multiple clusters, managed by different postmasters, can exist on the same machine.

The PGDATA directory contains several subdirectories and control files, as shown in Table 50-1. In addition to these required items, the cluster configuration files postgresql.conf, pg_hba.conf, and pg_ident.conf are traditionally stored in PGDATA (although in PostgreSQL 8.0 and later, it is possible to keep them elsewhere).

Table 50-1. Contents of PGDATA

Item Description
PG_VERSION A file containing the major version number of PostgreSQL
base Subdirectory containing per-database subdirectories
global Subdirectory containing cluster-wide tables, such as pg_database
pg_clog Subdirectory containing transaction commit status data
pg_multixact Subdirectory containing multitransaction status data (used for shared row locks)
pg_subtrans Subdirectory containing subtransaction status data
pg_tblspc Subdirectory containing symbolic links to tablespaces
pg_twophase Subdirectory containing state files for prepared transactions
pg_xlog Subdirectory containing WAL (Write Ahead Log) files
postmaster.opts A file recording the command-line options the postmaster was last started with A lock file recording the current postmaster PID and shared memory segment ID (not present after postmaster shutdown)

For each database in the cluster there is a subdirectory within PGDATA /base, named after the database's OID in pg_database. This subdirectory is the default location for the database's files; in particular, its system catalogs are stored there.

Each table and index is stored in a separate file, named after the table or index's filenode number, which can be found in pg_class.relfilenode.


Note that while a table's filenode often matches its OID, this is not necessarily the case; some operations, like TRUNCATE, REINDEX, CLUSTER and some forms of ALTER TABLE, can change the filenode while preserving the OID. Avoid assuming that filenode and table OID are the same.

When a table or index exceeds 1Gb, it is divided into gigabyte-sized segments. The first segment's file name is the same as the filenode; subsequent segments are named filenode.1, filenode.2, etc. This arrangement avoids problems on platforms that have file size limitations. The contents of tables and indexes are discussed further in Section 50.3.

A table that has columns with potentially large entries will have an associated TOAST table, which is used for out-of-line storage of field values that are too large to keep in the table rows proper. pg_class.reltoastrelid links from a table to its TOAST table, if any. See Section 50.2 for more information.

Tablespaces make the scenario more complicated. Each user-defined tablespace has a symbolic link inside the PGDATA /pg_tblspc directory, which points to the physical tablespace directory (as specified in its CREATE TABLESPACE command). The symbolic link is named after the tablespace's OID. Inside the physical tablespace directory there is a subdirectory for each database that has elements in the tablespace, named after the database's OID. Tables within that directory follow the filenode naming scheme. The pg_default tablespace is not accessed through pg_tblspc, but corresponds to PGDATA /base. Similarly, the pg_global tablespace is not accessed through pg_tblspc, but corresponds to PGDATA /global.

  Published courtesy of The PostgreSQL Global Development Group Design by Interspire