PostgreSQL

Production architecture, high availability, performance tuning & operational best practices

01

Overview

PostgreSQL is an advanced, open-source object-relational database management system with nearly 40 years of active development (origins dating to 1986 at UC Berkeley). It has earned a reputation for reliability, data integrity, and correctness, making it the default choice for most greenfield projects that need a relational database. As of early 2026, PostgreSQL 18 is the current major release (September 2025), with PostgreSQL 17 widely deployed in production.

What sets PostgreSQL apart from other relational databases is its extensibility. You can define your own data types, build custom functions in multiple languages (PL/pgSQL, PL/Python, PL/Perl, C), create custom index types, and even write extensions that fundamentally change how the database behaves. This extensibility is why PostgreSQL has become the foundation for specialized databases like TimescaleDB (time-series), Citus (distributed), and pgvector (AI/ML embeddings).

Core Strengths

MVCC Concurrency

With Multi-Version Concurrency Control, readers never block writers and writers never block readers. Each transaction sees a consistent snapshot of the database. No read locks, no write starvation — this is fundamental to PostgreSQL's performance under concurrent workloads.
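A minimal two-session sketch of snapshot isolation in practice (the accounts table and values are illustrative):

```sql
-- Session A
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;   -- returns 100

-- Session B, concurrently: the writer is not blocked by A's read
UPDATE accounts SET balance = 200 WHERE id = 1;
COMMIT;

-- Session A: still sees the snapshot taken at its first statement
SELECT balance FROM accounts WHERE id = 1;   -- still returns 100
COMMIT;
```

After Session A commits, a fresh transaction sees the updated value.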

SQL Standards Compliance

PostgreSQL implements a large portion of the SQL:2023 standard. CTEs, window functions, lateral joins, MERGE, JSON path queries, generated columns — features that other databases either lack or implement partially.
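For instance, a CTE combined with a window function (against a hypothetical orders table) reads naturally:

```sql
-- Rank each customer's orders from the last 30 days by total
WITH recent AS (
    SELECT customer_id, id, total
    FROM orders
    WHERE created_at > now() - interval '30 days'
)
SELECT customer_id, id, total,
       rank() OVER (PARTITION BY customer_id ORDER BY total DESC) AS rnk
FROM recent;
```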

ACID Data Integrity

Full ACID compliance with Write-Ahead Logging (WAL). Serializable isolation level available. Foreign keys, check constraints, exclusion constraints, and triggers enforce data integrity at the database level.

EXTENSIBLE Ecosystem

Over 1,000 extensions available. PostGIS (geospatial), pgvector (embeddings), TimescaleDB (time-series), pg_cron (scheduling), pgAudit (audit logging). Extensions can add data types, operators, index methods, and even custom storage engines.

When to Choose PostgreSQL

PostgreSQL is the right choice for most workloads where you need a relational database. It excels at OLTP, handles analytical queries well (especially with partitioning and parallel query), supports JSON/JSONB for semi-structured data, and has native full-text search. The only cases where you might look elsewhere: extremely high write throughput with relaxed consistency (consider Cassandra/ScyllaDB), pure key-value at massive scale (consider Redis/DynamoDB), or when your team is deeply invested in another ecosystem (SQL Server/.NET, Oracle/Java EE).

Recommendation

If you're starting a new project and need a relational database, default to PostgreSQL unless you have a specific, documented reason not to. The ecosystem, community, and operational tooling are unmatched in the open-source world.

02

Architecture

Understanding PostgreSQL's process model and memory architecture is essential for proper tuning and troubleshooting. Unlike MySQL's thread-per-connection model, PostgreSQL uses a process-per-connection model — each client connection gets its own operating system process. (Note: as of PG 15, the stats collector process was removed and statistics are now stored in shared memory.)

Process Model

+-------------------------------------------------------------+
|                         Postmaster                          |
|                    (PID 1 / main process)                   |
|        Listens on port 5432, forks backend processes        |
+------------+------------------------------------------------+
             |  fork()
   +---------+------------+---------------------+
   v                      v                     v
+------------------+ +------------------+ +------------------+
| Backend Process  | | Backend Process  | | Backend Process  |
| (client conn 1)  | | (client conn 2)  | | (client conn N)  |
+------------------+ +------------------+ +------------------+

Background Workers (always running):
+--------------+ +--------------+ +--------------+ +--------------+
|  WAL Writer  | | Checkpointer | |  Autovacuum  | |  BG Writer   |
|              | |              | |   Launcher   | |              |
+--------------+ +--------------+ +--------------+ +--------------+
+--------------+ +--------------+
| Logical Rep  | |  WAL Sender  |
|    Worker    | | (per replica)|
+--------------+ +--------------+

The postmaster is the parent process. It listens for incoming connections and forks a new backend process for each one. This model is simple and robust — a crash in one backend doesn't bring down others — but it means each connection consumes real OS memory (typically 5-10 MB per idle connection, more under load). This is why connection pooling is critical in production.
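You can see the one-process-per-connection mapping directly: every row in pg_stat_activity carries the OS process ID of its backend.

```sql
-- Each row is a real OS process; pid matches what ps/top shows
SELECT pid, usename, state, backend_type
FROM pg_stat_activity
ORDER BY backend_type;
```

Background workers (checkpointer, walwriter, autovacuum launcher) show up here too, under their own backend_type values.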

Shared Memory

All backend processes share a region of memory managed by PostgreSQL:

  • Shared Buffers — The main data cache. Pages read from disk are cached here. Default is 128 MB, but production systems typically set this to 25% of total RAM (up to about 8-16 GB, beyond which returns diminish).
  • WAL Buffers — Buffer for Write-Ahead Log entries before they're flushed to disk. Default auto-sized to 1/32 of shared_buffers, usually 16 MB is sufficient.
  • Lock Tables — Tracking all active locks across transactions.
  • Proc Array — Information about all active backends, used for MVCC visibility checks.

Write-Ahead Logging (WAL)

Every change to data is first written to the WAL before being applied to the actual data files. This ensures durability — if the server crashes, PostgreSQL replays the WAL to recover to a consistent state. WAL is also the foundation for replication (streaming replication sends WAL records to replicas) and Point-in-Time Recovery (PITR).

# WAL files live in pg_wal/ (formerly pg_xlog/)
# Each segment is 16 MB by default
ls -la /var/lib/postgresql/18/main/pg_wal/
# Watch WAL generation rate (pg_stat_wal view, PostgreSQL 14+)
sudo -u postgres psql -c "SELECT * FROM pg_stat_wal;"

MVCC Internals

PostgreSQL's MVCC implementation stores old row versions directly in the table (heap). When a row is updated, the old version is marked with the transaction ID that invalidated it, and a new version is inserted. This means:

  • Updates are essentially delete + insert (table bloat is a real concern)
  • VACUUM is required to reclaim space from dead tuples
  • Long-running transactions prevent VACUUM from cleaning up, leading to table bloat
  • The visibility map tracks which pages contain only all-visible tuples (enables index-only scans)
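The hidden system columns xmin and xmax expose this versioning (example against a hypothetical orders table):

```sql
-- xmin = transaction that created this row version, xmax = transaction
-- that invalidated it (0 while live); ctid = physical location in the heap
SELECT ctid, xmin, xmax, id FROM orders WHERE id = 1;

UPDATE orders SET status = 'shipped' WHERE id = 1;

-- The row now has a new ctid and xmin; the old version stays in the
-- heap as a dead tuple until VACUUM reclaims it
SELECT ctid, xmin, xmax, id FROM orders WHERE id = 1;
```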

Warning

Long-running transactions are one of the most common causes of production incidents in PostgreSQL. A single forgotten BEGIN without a COMMIT can prevent autovacuum from cleaning up dead tuples across the entire database, eventually leading to table bloat, degraded performance, and even transaction ID wraparound.
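A query along these lines (the 10-minute threshold is arbitrary) surfaces offending transactions:

```sql
-- Transactions open longer than 10 minutes
SELECT pid, usename, state, xact_start,
       now() - xact_start AS xact_age,
       left(query, 60) AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '10 minutes'
ORDER BY xact_start;

-- Terminate an offender if necessary:
-- SELECT pg_terminate_backend(<pid>);
```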

Tablespaces

Tablespaces allow you to place specific tables or indexes on different physical storage. Useful for putting hot indexes on fast NVMe while keeping large, infrequently accessed tables on cheaper storage. In practice, most cloud deployments use a single tablespace since the storage layer handles tiering.

-- Create a tablespace on fast storage
CREATE TABLESPACE fast_ssd LOCATION '/mnt/nvme/pg_data';

-- Move a table to the fast tablespace
ALTER TABLE orders SET TABLESPACE fast_ssd;

-- Move an index
ALTER INDEX orders_created_idx SET TABLESPACE fast_ssd;

03

Deployment Models

How you deploy PostgreSQL has a massive impact on operational complexity, performance, and cost. There's no single right answer — it depends on your team's capabilities, compliance requirements, and budget.

Deployment Comparison

  • Bare Metal
      Pros: maximum performance, full control, no hypervisor overhead
      Cons: hardware procurement, no live migration, manual capacity planning
      Best for: high-performance OLTP, financial systems, latency-critical workloads
  • VMs
      Pros: familiar, snapshotting, live migration, good isolation
      Cons: some overhead, storage I/O can be a bottleneck, right-sizing is tricky
      Best for: most production workloads, traditional enterprise environments
  • Containers (K8s)
      Pros: declarative, GitOps-friendly, easy dev/staging parity
      Cons: stateful workloads on K8s are hard, storage performance, networking complexity
      Best for: dev/staging, teams already deep in K8s, with operators like CloudNativePG
  • Managed (RDS, CloudSQL)
      Pros: automated backups, patching, HA, monitoring out of the box
      Cons: higher cost, limited extension support, less tuning control, vendor lock-in
      Best for: small teams, startups, when you don't have a dedicated DBA

Self-Hosted vs Managed

The decision to self-host vs use a managed service comes down to team expertise and control requirements. If your team has strong PostgreSQL operational knowledge and you need full control over extensions, configuration, and upgrade timing, self-host. If you're a small team without a DBA, managed services like AWS RDS, Google CloudSQL, or Azure Database for PostgreSQL handle the undifferentiated heavy lifting.

Info

Managed services often lag behind upstream PostgreSQL by several months. As of early 2026, RDS supports PostgreSQL through version 18.3, but new major versions typically take a few months to become available after upstream release. If you need day-one access to the latest features or specific extensions not yet supported by your provider, self-hosting is the only option.

Kubernetes Considerations

Running PostgreSQL on Kubernetes is increasingly viable but requires careful planning. The key challenges are storage performance, data durability, and operational tooling. Use a purpose-built operator:

Recommended CloudNativePG

CNCF Sandbox project. Declarative PostgreSQL clusters, automated failover, backup to S3/GCS, rolling updates. The most mature K8s-native operator. Doesn't rely on Patroni; leader election is built on the Kubernetes API (backed by etcd) rather than an external DCS.

Alternative Crunchy PGO

By Crunchy Data (v5). Uses Patroni for HA. Supports pgBackRest for backups, pgBouncer sidecar for connection pooling. Well-documented, production-tested. Open source (Apache 2.0).

Warning

Never run production PostgreSQL on Kubernetes with emptyDir or hostPath volumes. Always use persistent volumes backed by a proper storage class (e.g., gp3 on AWS, pd-ssd on GCP). Ensure your storage class has allowVolumeExpansion: true — you will need to grow volumes.

Installation on Ubuntu/Debian

Always use the official PGDG (PostgreSQL Global Development Group) repository for the latest stable packages:

# Add the official PostgreSQL repository
sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh

# Install PostgreSQL 18
sudo apt install -y postgresql-18

# Verify
pg_lsclusters
sudo -u postgres psql -c "SELECT version();"

04

High Availability

PostgreSQL does not have built-in automatic failover. It provides the replication primitives — streaming replication and logical replication — but you need external tooling to detect failures and promote a standby. This is by design: the database doesn't try to be a distributed consensus system.

Streaming Replication

Physical (streaming) replication ships WAL records from the primary to one or more standby servers. The standby continuously applies these records, maintaining an exact byte-for-byte copy of the primary. Standby servers can serve read-only queries (hot standby).

# postgresql.conf on primary
wal_level = replica                # minimal, replica, or logical
max_wal_senders = 10               # max number of standby connections
wal_keep_size = 1GB                # retain WAL for slow standbys (PG 13+)
synchronous_standby_names = ''     # empty = async, set for sync replication

# pg_hba.conf on primary
host replication replicator 10.0.0.0/24 scram-sha-256

# On the standby server
sudo -u postgres pg_basebackup \
  -h primary.example.com \
  -D /var/lib/postgresql/18/main \
  -U replicator -Fp -Xs -P -R

# -R creates standby.signal and sets primary_conninfo in postgresql.auto.conf
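Once the standby is up, replication health can be checked from both sides:

```sql
-- On the primary: one row per connected standby
SELECT client_addr, state, sync_state, replay_lag
FROM pg_stat_replication;

-- On the standby: confirm WAL is being received and applied
SELECT status, flushed_lsn, last_msg_receipt_time
FROM pg_stat_wal_receiver;
SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();
```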

Synchronous vs Asynchronous Replication

  • Data loss risk
      Asynchronous: possible (transactions committed on primary but not yet shipped)
      Synchronous: zero data loss (RPO = 0)
  • Commit latency
      Asynchronous: local disk only
      Synchronous: must wait for standby acknowledgment (network RTT added)
  • Availability impact
      Asynchronous: none (primary doesn't depend on standby)
      Synchronous: if all sync standbys are down, writes block on primary
  • Use case
      Asynchronous: most production deployments, read replicas
      Synchronous: financial systems, regulatory requirements for zero data loss

Critical

With synchronous replication, if all synchronous standbys go offline, the primary will stop accepting writes entirely. Always configure at least two synchronous standbys or use synchronous_standby_names = 'ANY 1 (standby1, standby2, standby3)' to require only one of multiple standbys to acknowledge.

HA Tooling Comparison

Recommended Patroni

The industry standard for PostgreSQL HA. Uses a distributed consensus store (etcd, ZooKeeper, or Consul) for leader election. Handles automatic failover, planned switchover, and configuration management. Created by Zalando, battle-tested at massive scale.

  • REST API for management & monitoring
  • Supports synchronous replication management
  • Configurable failover conditions
  • Integrates with HAProxy or PgBouncer for connection routing
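Day-to-day operations go through patronictl; the config path and cluster name below are illustrative:

```shell
# Show cluster members, their roles, state, and replication lag
patronictl -c /etc/patroni/patroni.yml list

# Planned switchover (prompts for the candidate and confirmation)
patronictl -c /etc/patroni/patroni.yml switchover

# Apply configuration changes across the cluster
patronictl -c /etc/patroni/patroni.yml reload mycluster
```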

Alternative pg_auto_failover

By Citus/Microsoft. Simpler than Patroni — uses a "monitor" node instead of an external DCS. Good for smaller deployments where running etcd is overkill. Supports synchronous replication and read replicas. Limited to primary-standby topology.

Legacy repmgr

Mature tool, widely deployed. Handles replication setup, monitoring, and manual/automatic failover. Uses repmgrd daemon for automatic failover. Less opinionated than Patroni — requires more manual configuration. Community support has declined as Patroni gained dominance.

Cloud Managed HA

RDS Multi-AZ, CloudSQL HA, Azure HA: all handle failover automatically using storage-level replication. Failover takes 30-120 seconds. DNS-based endpoint switching. No topology control — you get what the provider gives you.

Patroni Architecture

+--------------+    +--------------+    +--------------+
|    etcd 1    |    |    etcd 2    |    |    etcd 3    |
|   (node 1)   |    |   (node 2)   |    |   (node 3)   |
+------+-------+    +------+-------+    +------+-------+
       |                   |                   |
       +---------+---------+---------+---------+
                 |    DCS Cluster    |
                 +---------+---------+
                           |
            +--------------+--------------+
            v                             v
+--------------+                  +--------------+
|   Patroni +  |  streaming rep   |   Patroni +  |
|  PostgreSQL  | ---------------> |  PostgreSQL  |
|   (primary)  |                  |   (standby)  |
+------+-------+                  +------+-------+
       |                                 |
       +---------------+-----------------+
                       v
               +--------------+
               |   HAProxy    |
               |  :5000 (rw)  |
               |  :5001 (ro)  |
               +--------------+

Split-Brain Prevention

Split-brain occurs when two nodes both believe they are the primary. Patroni prevents this through its DCS (distributed consensus store) — the leader key has a TTL, and the primary must continuously renew its lease. If it can't reach the DCS, it demotes itself. Additionally, configure watchdog support in Patroni to ensure the old primary's PostgreSQL is actually stopped before the new primary starts accepting writes.

Recommendation

Always run an odd number of DCS nodes (3 or 5) to maintain quorum. Place them in different failure domains (different racks, different AZs). The DCS cluster is the foundation of your HA — if it goes down, Patroni can't perform failover.

05

Connection Pooling

PostgreSQL's process-per-connection model means each connection consumes real memory and OS resources. A typical backend process uses 5-10 MB at idle and can grow to hundreds of MB under load. Without connection pooling, applications that open hundreds or thousands of connections will overwhelm the database — not through query load, but through sheer resource consumption.

Why You Need a Pooler

Most applications don't need to hold a dedicated database connection for every request. A web application serving 1,000 concurrent requests might only need 50-100 actual database connections if queries are fast. A connection pooler multiplexes application connections onto a smaller pool of real PostgreSQL connections.

# Check current connection count
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"

# See connection limits
sudo -u postgres psql -c "SHOW max_connections;"
# Default: 100. Many production systems set 200-500.
# Beyond that, you almost certainly need a pooler.

Pooler Comparison

  • PgBouncer
      Architecture: lightweight, single-process, event-driven
      Memory usage: ~2 KB per connection
      Connection modes: transaction, session, statement
      Load balancing: no (pair with HAProxy)
      Prepared statements: protocol-level only (PgBouncer 1.21+)
      Best for: most production deployments
  • Pgpool-II
      Architecture: multi-process, feature-rich middleware
      Memory usage: ~100 KB per connection
      Connection modes: session only
      Load balancing: yes (read query routing)
      Prepared statements: full support
      Best for: read query load balancing
  • Odyssey
      Architecture: multi-threaded, advanced routing
      Memory usage: moderate (multi-threaded)
      Connection modes: transaction, session
      Load balancing: no
      Prepared statements: full support
      Best for: high-concurrency, multi-threaded workloads

PgBouncer Configuration

# /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

# Pool mode: transaction is recommended for most apps
pool_mode = transaction

# Pool sizing
default_pool_size = 25        # connections per user/database pair
min_pool_size = 5             # pre-warm this many connections
reserve_pool_size = 5         # extra connections for burst traffic
reserve_pool_timeout = 3      # seconds before using reserve pool

max_client_conn = 1000        # total client connections PgBouncer accepts
max_db_connections = 100      # max real connections to PostgreSQL

# Timeouts
server_idle_timeout = 600     # close idle server connections after 10 min
client_idle_timeout = 0       # 0 = no timeout for idle clients
query_timeout = 0             # 0 = no query timeout (handle in app)

# Logging
log_connections = 0
log_disconnections = 0
stats_period = 60

Pooling Modes Explained

Recommended Transaction Pooling

Connection is assigned to a client for the duration of a transaction, then returned to the pool. Most efficient mode. However, session-level features like SET, LISTEN/NOTIFY, prepared statements (without protocol-level support), and advisory locks won't work as expected.

Caution Session Pooling

Connection is held for the entire client session. No multiplexing benefit — you still need one real connection per client. Only useful for limiting total connections and providing a stable endpoint.

Warning

In transaction pooling mode, SET search_path, SET statement_timeout, and other session-level settings will leak between clients. Use SET LOCAL (which resets at transaction end) or pass settings per-query. This is the number one source of bugs when adopting PgBouncer.
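A sketch of the safe pattern; heavy_report() stands in for any application query:

```sql
BEGIN;
-- SET LOCAL reverts automatically at COMMIT/ROLLBACK, so nothing
-- leaks to the next client that gets this pooled connection
SET LOCAL statement_timeout = '5s';
SET LOCAL search_path = app_schema, public;
SELECT heavy_report();   -- hypothetical application query
COMMIT;
```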

06

Backup & Recovery

Backups are your last line of defense. A PostgreSQL deployment without tested, automated backups is a disaster waiting to happen. The strategy you choose depends on your RPO (Recovery Point Objective), RTO (Recovery Time Objective), and database size.

Backup Methods Comparison

  • pg_dump
      Type: logical (SQL or custom format)
      RPO: time of last dump
      RTO: hours (proportional to DB size)
      Database size: < 100 GB
  • pg_basebackup
      Type: physical (full cluster copy)
      RPO: time of last backup
      RTO: minutes to hours
      Database size: any
  • PITR (WAL archiving)
      Type: physical + continuous WAL
      RPO: near-zero (last WAL segment)
      RTO: minutes to hours
      Database size: any
  • pgBackRest
      Type: physical + incremental + PITR
      RPO: near-zero
      RTO: minutes
      Database size: any (optimized for large DBs)

pgBackRest — The Production Standard

pgBackRest is the go-to backup tool for production PostgreSQL. It supports full, differential, and incremental backups, parallel backup and restore, backup verification, and S3/GCS/Azure storage. If you're running PostgreSQL in production and not using pgBackRest, you should switch.

# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-type=s3
repo1-s3-bucket=mycompany-pg-backups
repo1-s3-region=us-east-1
repo1-s3-endpoint=s3.amazonaws.com
repo1-path=/pgbackrest
repo1-retention-full=4
repo1-retention-diff=14
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=your-encryption-passphrase

process-max=4
compress-type=zst
compress-level=3

[mydb]
pg1-path=/var/lib/postgresql/18/main
pg1-port=5432

# Create stanza (one-time setup)
sudo -u postgres pgbackrest --stanza=mydb stanza-create

# Full backup
sudo -u postgres pgbackrest --stanza=mydb --type=full backup

# Incremental backup (daily)
sudo -u postgres pgbackrest --stanza=mydb --type=incr backup

# Verify backups (checks backup integrity without full restore)
sudo -u postgres pgbackrest --stanza=mydb verify

# Restore to a point in time
sudo -u postgres pgbackrest --stanza=mydb \
  --type=time "--target=2026-03-19 14:30:00" \
  --target-action=promote restore

WAL Archiving for PITR

# postgresql.conf
archive_mode = on
archive_command = 'pgbackrest --stanza=mydb archive-push %p'
# Or without pgbackrest:
# archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'

Critical

A backup you haven't tested restoring is not a backup — it's a hope. Schedule regular restore tests. At minimum, do a monthly automated restore to a staging environment and verify data integrity. pgBackRest's verify command checks backup consistency but does not validate that the restored database is functionally correct.

Backup Automation

# Example cron schedule
# Full backup every Sunday at 2 AM
0 2 * * 0 postgres pgbackrest --stanza=mydb --type=full backup
# Incremental backup Monday through Saturday at 2 AM
0 2 * * 1-6 postgres pgbackrest --stanza=mydb --type=incr backup
# Verify backup integrity daily at 6 AM
0 6 * * * postgres pgbackrest --stanza=mydb verify

Recommendation

Always encrypt backups at rest, especially when storing them in object storage. pgBackRest supports AES-256-CBC encryption natively. Store the encryption passphrase in a secrets manager (Vault, AWS Secrets Manager), not in the pgbackrest.conf file.

07

Performance Tuning

PostgreSQL's default configuration is intentionally conservative — it's designed to run on a system with 512 MB of RAM. For production, you need to tune the key memory parameters, understand query execution plans, choose the right indexes, and keep autovacuum healthy.

Essential Memory Parameters

  • shared_buffers
      Default: 128 MB
      Production guideline: 25% of RAM (max ~16 GB)
      Notes: main data cache; beyond ~16 GB the OS page cache is usually more efficient
  • effective_cache_size
      Default: 4 GB
      Production guideline: 50-75% of RAM
      Notes: not allocated; tells the planner how much cache is likely available (shared_buffers + OS cache)
  • work_mem
      Default: 4 MB
      Production guideline: 32-256 MB (careful!)
      Notes: per sort/hash operation, per query; a complex query can use multiple work_mem allocations, so set it too high and you risk OOM
  • maintenance_work_mem
      Default: 64 MB
      Production guideline: 512 MB - 2 GB
      Notes: for VACUUM, CREATE INDEX, ALTER TABLE; higher means faster maintenance operations
  • wal_buffers
      Default: -1 (auto-sized to 1/32 of shared_buffers)
      Production guideline: 64 MB
      Notes: 64 MB is safe for write-heavy workloads
  • max_connections
      Default: 100
      Production guideline: 200-500 (with pooler)
      Notes: each connection uses ~5-10 MB; use PgBouncer and keep this modest

# postgresql.conf - Example for a 64 GB RAM server
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 128MB
maintenance_work_mem = 2GB
wal_buffers = 64MB
max_connections = 300
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 16

EXPLAIN ANALYZE

The single most important skill for PostgreSQL performance troubleshooting. EXPLAIN ANALYZE executes the query and shows the actual execution plan with real timing and row counts.

-- Always use ANALYZE to see actual vs estimated rows
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > '2026-01-01'
  AND o.status = 'completed'
ORDER BY o.total DESC
LIMIT 50;

-- Key things to look for:
-- 1. Seq Scan on large tables (missing index?)
-- 2. actual rows vs estimated rows mismatch (stale statistics?)
-- 3. Sort with external merge (work_mem too low?)
-- 4. Nested Loop with high loop count (missing index on join key?)

Index Strategies

Default B-tree

The default index type. Supports equality and range queries. Covers 90% of use cases. Supports multicolumn indexes (put high-selectivity columns first). Use INCLUDE to create covering indexes for index-only scans.

Specialized GIN

Generalized Inverted Index. For full-text search, JSONB operators (@>, ?), arrays, and trigram similarity. Slower to update but very fast for lookups on multi-valued columns.

Specialized GiST

Generalized Search Tree. For geometric data, range types, full-text search (tsvector), and PostGIS geospatial queries. Supports nearest-neighbor searches (ORDER BY <->).

Large Tables BRIN

Block Range Index. Tiny index size for naturally ordered data (timestamps, sequential IDs). A BRIN index on a 100 GB table might be only a few MB. Perfect for time-series append-only tables.

-- Covering index for index-only scans
CREATE INDEX idx_orders_status_created
ON orders (status, created_at DESC)
INCLUDE (total, customer_id);

-- Partial index (only index what you query)
CREATE INDEX idx_orders_active
ON orders (created_at DESC)
WHERE status = 'active';

-- BRIN for time-series data
CREATE INDEX idx_events_ts_brin
ON events USING brin (created_at)
WITH (pages_per_range = 32);

-- GIN for JSONB queries
CREATE INDEX idx_metadata_gin
ON documents USING gin (metadata jsonb_path_ops);

Vacuum & Autovacuum Tuning

VACUUM is not optional — it's a critical maintenance operation that reclaims space from dead tuples and updates visibility maps. Autovacuum runs this automatically, but the defaults are often too conservative for busy tables.

# postgresql.conf - Aggressive autovacuum for busy systems
autovacuum_max_workers = 6                  # default 3
autovacuum_naptime = 15s                    # default 1min
autovacuum_vacuum_threshold = 25            # default 50
autovacuum_vacuum_scale_factor = 0.01       # default 0.2 (triggers at 1% dead tuples)
autovacuum_analyze_threshold = 25
autovacuum_analyze_scale_factor = 0.005
autovacuum_vacuum_cost_delay = 2ms          # default 2ms (PG 12+)
autovacuum_vacuum_cost_limit = 1000         # default -1 (uses vacuum_cost_limit = 200)
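Global settings apply cluster-wide; for a handful of very hot tables it is common to override thresholds per table instead (values here are illustrative):

```sql
-- Vacuum the orders table after ~0.5% dead tuples plus a fixed floor,
-- regardless of the global scale factor
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.005,
    autovacuum_vacuum_threshold = 1000
);
```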

Warning

If autovacuum falls behind on a high-write table, dead tuples accumulate, table bloat grows, and eventually you risk transaction ID wraparound — which forces PostgreSQL into single-user mode for emergency VACUUM. Monitor pg_stat_user_tables.n_dead_tup and pg_stat_user_tables.last_autovacuum for every table.
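A monitoring query along the lines the warning suggests:

```sql
-- Dead tuples and last autovacuum time per table
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Distance to transaction ID wraparound per database
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```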

Table Partitioning

Declarative partitioning (PostgreSQL 10+) splits a large table into smaller physical tables based on a partition key. This improves query performance through partition pruning, makes maintenance operations faster (VACUUM on a smaller table), and enables efficient data lifecycle management (drop old partitions instead of deleting rows).

-- Range partitioning by date
CREATE TABLE events (
    id bigint GENERATED ALWAYS AS IDENTITY,
    event_type text NOT NULL,
    payload jsonb,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Automate with pg_partman
CREATE EXTENSION pg_partman;
SELECT partman.create_parent(
    p_parent_table := 'public.events',
    p_control := 'created_at',
    p_interval := '1 month',
    p_premake := 3
);
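To confirm pruning is working, EXPLAIN with constant predicates should show only the matching partition being scanned:

```sql
EXPLAIN (COSTS OFF)
SELECT count(*) FROM events
WHERE created_at >= '2026-03-01' AND created_at < '2026-03-10';
-- Expect a scan on events_2026_03 only, not the other partitions
```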

08

Security

PostgreSQL security operates at multiple layers: network access control (pg_hba.conf), authentication, encryption in transit (SSL/TLS), authorization (roles and privileges), row-level security, and audit logging. A production deployment should address all of these.

Authentication — pg_hba.conf

The pg_hba.conf file controls who can connect, from where, and how they authenticate. Rules are evaluated top-to-bottom; the first match wins.

# pg_hba.conf - Production example
# TYPE  DATABASE  USER       ADDRESS          METHOD

# Local socket connections
local   all       postgres                    peer
local   all       all                         scram-sha-256

# SSL-only remote connections
hostssl myapp     app_user   10.0.0.0/24      scram-sha-256
hostssl all       admin      10.0.1.0/24      scram-sha-256

# Replication (SSL required)
hostssl replication replicator 10.0.0.0/24    scram-sha-256

# Reject everything else
host    all       all        0.0.0.0/0        reject

Critical

Never use trust or md5 authentication in production. trust allows connections without a password. md5 is vulnerable to pass-the-hash attacks. Use scram-sha-256 (default for password_encryption since PostgreSQL 14). Set password_encryption = scram-sha-256 in postgresql.conf and update pg_hba.conf entries from md5 to scram-sha-256.

SSL/TLS Configuration

# postgresql.conf
ssl = on
ssl_cert_file = '/etc/postgresql/server.crt'
ssl_key_file = '/etc/postgresql/server.key'
ssl_ca_file = '/etc/postgresql/ca.crt'       # for client certificate verification
ssl_min_protocol_version = 'TLSv1.3'         # PG 12+
ssl_ciphers = 'HIGH:!aNULL:!MD5'

Roles & Privileges

Follow the principle of least privilege. Create separate roles for applications, read-only users, and administrators. Never let applications connect as the postgres superuser.

-- Create application role with minimal privileges
CREATE ROLE app_user WITH LOGIN PASSWORD 'strong-password-here';
GRANT CONNECT ON DATABASE myapp TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO app_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_user;

-- Read-only role for reporting
CREATE ROLE readonly WITH LOGIN PASSWORD 'another-strong-password';
GRANT CONNECT ON DATABASE myapp TO readonly;
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO readonly;

Row-Level Security (RLS)

RLS allows you to restrict which rows a user can see or modify based on their role or session properties. Essential for multi-tenant applications.

-- Enable RLS on a multi-tenant table
ALTER TABLE customer_data ENABLE ROW LEVEL SECURITY;

-- Policy: users can only see their own tenant's data
CREATE POLICY tenant_isolation ON customer_data
    USING (tenant_id = current_setting('app.tenant_id')::int);

-- Force RLS even for table owners
ALTER TABLE customer_data FORCE ROW LEVEL SECURITY;

-- Set tenant context in application
SET app.tenant_id = '42';

Audit Logging with pgAudit

# postgresql.conf
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'ddl, role, write'       # log DDL, role changes, and write operations
pgaudit.log_catalog = off               # don't log system catalog queries
pgaudit.log_relation = on               # log table names, not just commands
pgaudit.log_parameter = on              # log query parameters
Recommendation

For compliance (SOC 2, HIPAA, PCI-DSS), pgAudit is a must-have. Ship audit logs to a centralized logging system (ELK, Loki, Splunk) where they can't be tampered with by database administrators. Include pgaudit.log_parameter = on carefully — it logs query parameters which may contain PII.

09

Upgrades

PostgreSQL releases a new major version annually (typically September) and minor versions quarterly (February, May, August, November). Major upgrades (e.g., 17 to 18) require a migration — the on-disk data format changes. Minor upgrades (e.g., 18.1 to 18.3) are always in-place and binary-compatible.

Upgrade Methods

Recommended pg_upgrade (In-Place)

The standard upgrade method. Copies or hard-links data files and rewrites system catalogs. Downtime is typically minutes to hours depending on database size (with --link mode, it's minutes even for terabyte databases). Run pg_upgrade --check first to validate compatibility.

  • Fast with --link mode (hard links instead of copying)
  • Requires both old and new PostgreSQL binaries installed
  • All extensions must be compatible with the new version
  • Run ANALYZE on all databases after upgrade to rebuild statistics

Zero-Downtime Logical Replication

Set up logical replication from the old version to a new-version instance. Once it has caught up, switch application connections over. Near-zero downtime, but more setup — and not everything replicates: sequences, large objects, and DDL changes must be handled manually, so test thoroughly.

  • Supports cross-version replication (PG 14+ to 18)
  • Application cutover in seconds
  • More complex to set up and validate
  • Need to handle sequences and schema changes manually
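
The cutover above can be sketched in SQL. This assumes a database named myapp, a replication user named replicator, and that the schema has already been restored on the new instance (e.g. via pg_dump --schema-only) — all illustrative names:

```sql
-- On the old (publisher) server:
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new (subscriber) server: initial copy runs automatically,
-- then changes stream continuously.
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-server dbname=myapp user=replicator'
    PUBLICATION upgrade_pub;

-- Before cutover: sequences do not replicate, so advance each one
-- manually on the new server, e.g.:
SELECT setval('orders_id_seq', (SELECT max(id) FROM orders));
```

After verifying row counts and sequence values, point the application at the new server and drop the subscription.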

pg_upgrade Workflow

# 1. Install new PostgreSQL version alongside old
sudo apt install postgresql-18

# 2. Stop both clusters
sudo systemctl stop postgresql@17-main
sudo systemctl stop postgresql@18-main

# 3. Run upgrade check (dry run)
sudo -u postgres /usr/lib/postgresql/18/bin/pg_upgrade \
  --old-datadir=/var/lib/postgresql/17/main \
  --new-datadir=/var/lib/postgresql/18/main \
  --old-bindir=/usr/lib/postgresql/17/bin \
  --new-bindir=/usr/lib/postgresql/18/bin \
  --check

# 4. Run actual upgrade (--link for speed)
sudo -u postgres /usr/lib/postgresql/18/bin/pg_upgrade \
  --old-datadir=/var/lib/postgresql/17/main \
  --new-datadir=/var/lib/postgresql/18/main \
  --old-bindir=/usr/lib/postgresql/17/bin \
  --new-bindir=/usr/lib/postgresql/18/bin \
  --link

# 5. Start new cluster
sudo systemctl start postgresql@18-main

# 6. Rebuild optimizer statistics (critical!)
sudo -u postgres /usr/lib/postgresql/18/bin/vacuumdb \
  --all --analyze-in-stages

# 7. After verifying, remove old cluster
sudo pg_dropcluster 17 main

Warning

With --link mode, the old cluster becomes unusable after upgrade — the data files are shared. If the upgrade fails or you need to rollback, you must restore from backup. Always take a full backup before running pg_upgrade.

Minor Version Upgrades

Minor upgrades are straightforward and should be applied promptly — they contain security fixes and bug fixes. No data migration is needed.

# Minor upgrade (e.g., 18.1 → 18.3)
sudo apt update
sudo apt upgrade postgresql-18
sudo systemctl restart postgresql@18-main

# Verify
sudo -u postgres psql -c "SELECT version();"

Critical

Never skip minor version upgrades for extended periods. PostgreSQL minor releases fix bugs that can cause data corruption, security vulnerabilities, and crashes. Subscribe to the PostgreSQL security mailing list and apply patches within your maintenance window.

10

Monitoring

PostgreSQL exposes a wealth of internal statistics through its catalog views. The key is knowing which metrics matter, setting up proper alerting, and correlating database metrics with application-level observability.

Essential Statistics Views

Queries pg_stat_statements

The most important extension for performance analysis. Tracks execution statistics for every distinct query: call count, total/mean/min/max time, rows returned, block hits/reads. Requires shared_preload_libraries = 'pg_stat_statements'.

Sessions pg_stat_activity

Shows currently running queries, idle connections, waiting processes, and lock info. Essential for debugging slow queries and connection issues. Check state, wait_event_type, query_start.

Tables pg_stat_user_tables

Per-table stats: sequential scans, index scans, rows inserted/updated/deleted, dead tuples, last vacuum/analyze time. Use this to find tables missing indexes (high seq_scan count) or needing vacuum.

Replication pg_stat_replication

Shows connected standbys, their WAL positions, and replication lag. The replay_lag column (PG 10+) gives human-readable lag. Alert on any lag exceeding your RPO.

Key Metrics to Monitor

-- Cache hit ratio (should be > 99%)
SELECT
  sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS ratio
FROM pg_statio_user_tables;

-- Active connections by state
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;

-- Replication lag (on primary)
SELECT client_addr, state, replay_lag
FROM pg_stat_replication;

-- Top 10 slowest queries (requires pg_stat_statements)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Tables with most dead tuples (vacuum candidates)
SELECT schemaname, relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Transaction ID age (wraparound risk)
SELECT datname, age(datfrozenxid) AS xid_age,
  current_setting('autovacuum_freeze_max_age') AS freeze_max
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

Prometheus + Grafana Stack

The postgres_exporter by Prometheus Community is the standard way to expose PostgreSQL metrics to Prometheus. Combine with Grafana dashboards for visualization and alerting.

# docker-compose.yml snippet for postgres_exporter
postgres-exporter:
  image: prometheuscommunity/postgres-exporter:latest
  environment:
    DATA_SOURCE_NAME: "postgresql://monitor:password@postgres:5432/myapp?sslmode=require"
  ports:
    - "9187:9187"

-- Create a monitoring role with minimal privileges
CREATE ROLE monitor WITH LOGIN PASSWORD 'monitor-password';
GRANT pg_monitor TO monitor;  -- PG 10+ built-in monitoring role
GRANT CONNECT ON DATABASE myapp TO monitor;

Critical Alerts

| Metric                 | Warning threshold        | Critical threshold             |
|------------------------|--------------------------|--------------------------------|
| Replication lag        | > 30 seconds             | > 5 minutes                    |
| Connection utilization | > 70% of max_connections | > 90% of max_connections       |
| Cache hit ratio        | < 99%                    | < 95%                          |
| Transaction ID age     | > 500M                   | > 1B (wraparound imminent)     |
| Dead tuples per table  | > 10M                    | > 100M (autovacuum is stuck)   |
| Disk space             | < 20% free               | < 10% free                     |
| Long-running queries   | > 5 minutes              | > 30 minutes                   |

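
Two of these thresholds expressed as Prometheus alerting rules, as a sketch. Metric names vary by postgres_exporter version — verify against your exporter's /metrics output before deploying:

```yaml
# prometheus alerting rules (sketch; metric names assumed from postgres_exporter)
groups:
  - name: postgresql
    rules:
      - alert: PgReplicationLagCritical
        expr: pg_replication_lag > 300          # seconds; critical threshold
        for: 2m
        labels:
          severity: critical
      - alert: PgConnectionUtilizationCritical
        expr: sum(pg_stat_activity_count) / pg_settings_max_connections > 0.9
        for: 5m
        labels:
          severity: critical
```
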
Recommendation

Use the community Grafana dashboard ID 9628 (PostgreSQL Database) as a starting point. It covers connection stats, transaction rates, tuple operations, cache ratios, and replication lag out of the box. Customize alerting rules based on your SLOs.

11

Extensions & Ecosystem

PostgreSQL's extension system is one of its greatest strengths. Extensions can add new data types, operators, index access methods, functions, and even background workers — all without forking the core project. Some extensions effectively turn PostgreSQL into a specialized database.

Essential Extensions

Geospatial PostGIS

The gold standard for geospatial data in any database. Adds geometry/geography types, spatial indexes (GiST/SP-GiST), and hundreds of spatial functions. If you need geospatial queries, PostGIS is why you choose PostgreSQL over every other database.
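
A minimal PostGIS sketch — a stores table (hypothetical) with a spatial index and a radius search:

```sql
CREATE EXTENSION IF NOT EXISTS postgis;

CREATE TABLE stores (
    id       bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name     text NOT NULL,
    location geography(Point, 4326)
);
CREATE INDEX stores_location_idx ON stores USING gist (location);

-- All stores within 5 km of a point (note: longitude, latitude order).
SELECT name
FROM stores
WHERE ST_DWithin(location, ST_MakePoint(-122.42, 37.77)::geography, 5000);
```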

AI/ML pgvector

Vector similarity search for AI/ML embeddings. Supports exact and approximate nearest neighbor search (IVFFlat, HNSW indexes). Store embeddings alongside relational data — no need for a separate vector database. Critical for RAG applications.
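
A minimal pgvector sketch, using 3-dimensional embeddings for readability (real models produce hundreds to thousands of dimensions):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    content   text,
    embedding vector(3)
);

-- HNSW index for approximate nearest-neighbor search (cosine distance).
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Five most similar documents to a query embedding.
SELECT content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
```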

Time-Series TimescaleDB

Turns PostgreSQL into a time-series database. Automatic partitioning (hypertables), compression, continuous aggregates, and data retention policies. Core is Apache 2.0; community features under Timescale License (TSL, free for non-DBaaS use). Company rebranded to TigerData in 2025.
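
A hypertable sketch, assuming a hypothetical metrics table:

```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;

CREATE TABLE metrics (
    time   timestamptz NOT NULL,
    device text        NOT NULL,
    value  double precision
);

-- Convert to a hypertable, automatically partitioned by time.
SELECT create_hypertable('metrics', 'time');

-- Retention policy: drop chunks older than 90 days.
SELECT add_retention_policy('metrics', INTERVAL '90 days');
```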

Distributed Citus

Distributed PostgreSQL by Microsoft. Shards tables across multiple nodes for horizontal scaling. Supports distributed queries, reference tables, and columnar storage. Open source (AGPLv3). Now available as an extension, not just a fork.
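
A Citus sketch: shard a multi-tenant table by tenant_id and replicate a small lookup table to every node (table names are illustrative):

```sql
CREATE EXTENSION IF NOT EXISTS citus;

CREATE TABLE events (
    tenant_id  int NOT NULL,
    event_type text,
    created_at timestamptz DEFAULT now()
);
CREATE TABLE plans (
    plan_id int PRIMARY KEY,
    name    text
);

-- Shard events across workers; co-locate rows by tenant.
SELECT create_distributed_table('events', 'tenant_id');
-- Replicate the small lookup table to all nodes for local joins.
SELECT create_reference_table('plans');
```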

Operations pg_cron

Cron-based job scheduler running inside PostgreSQL. Schedule SQL commands, VACUUM, ANALYZE, partition maintenance — without external cron jobs. Simple, reliable, and runs as a background worker.
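
A pg_cron sketch (requires shared_preload_libraries = 'pg_cron'; the job name is illustrative):

```sql
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Run ANALYZE nightly at 03:00.
SELECT cron.schedule('nightly-analyze', '0 3 * * *', 'ANALYZE');

-- Inspect and remove jobs.
SELECT jobid, jobname, schedule FROM cron.job;
SELECT cron.unschedule('nightly-analyze');
```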

Audit pgAudit

Provides detailed session and object audit logging. Required for compliance frameworks (SOC 2, HIPAA, PCI-DSS). Logs to the standard PostgreSQL log, so integrate with your existing log pipeline.

Partitioning pg_partman

Automates partition creation and retention for time-based and ID-based partitioning. Pre-creates future partitions and optionally drops or detaches old ones. Works with native declarative partitioning.
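
A pg_partman sketch against a natively range-partitioned parent, assuming the pg_partman 5.x API (parameter names may differ in older versions):

```sql
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;

CREATE TABLE public.events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);

-- Daily partitions, pre-created ahead of time; retention is configured
-- afterwards in partman.part_config.
SELECT partman.create_parent(
    p_parent_table => 'public.events',
    p_control      => 'created_at',
    p_interval     => '1 day'
);
```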

CDC Debezium

Change Data Capture using PostgreSQL's logical decoding. Streams row-level changes to Kafka, Pulsar, or other sinks. Requires wal_level = logical and the pgoutput or wal2json plugin. Essential for event-driven architectures.

Foreign Data Wrappers (FDW)

FDWs let you query external data sources as if they were PostgreSQL tables. The postgres_fdw extension (included in core) enables cross-server queries. Third-party FDWs exist for MySQL, MongoDB, Redis, S3/Parquet files, Elasticsearch, and more.

-- Query a remote PostgreSQL server
CREATE EXTENSION postgres_fdw;

CREATE SERVER remote_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', port '5432', dbname 'analytics');

CREATE USER MAPPING FOR app_user
    SERVER remote_server
    OPTIONS (user 'remote_reader', password 'secret');

IMPORT FOREIGN SCHEMA public
    FROM SERVER remote_server
    INTO remote_schema;

-- Now query remote tables as if they were local
SELECT * FROM remote_schema.sales WHERE region = 'EMEA';

Logical Decoding & CDC

# postgresql.conf for logical decoding
wal_level = logical                    # required (not just 'replica')
max_replication_slots = 10             # one per consumer
max_wal_senders = 10

-- Create a logical replication slot for Debezium
SELECT pg_create_logical_replication_slot('debezium_slot', 'pgoutput');

-- Create a publication for CDC
CREATE PUBLICATION my_publication FOR TABLE orders, customers, products;
Warning

Logical replication slots that are not consumed will cause WAL to accumulate indefinitely, eventually filling your disk. Always monitor pg_replication_slots for inactive slots and drop any that are no longer in use: SELECT pg_drop_replication_slot('unused_slot');
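
A monitoring query for slot WAL retention — alert when the retained WAL for any slot grows large. On PG 13+, max_wal_keep settings aside, consider setting max_slot_wal_keep_size as a hard safety valve:

```sql
-- How much WAL each slot is pinning on disk.
SELECT slot_name,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
```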

12

Licensing

PostgreSQL is released under the PostgreSQL License, a liberal open-source license similar to the BSD and MIT licenses. It permits use, modification, and distribution for any purpose — commercial or otherwise — with no copyleft requirements. There is no "enterprise edition," no dual licensing, and no feature gating. Every feature is available to everyone.

License Comparison

| Database   | License                                      | Enterprise edition?                          | Key restrictions                                                                            |
|------------|----------------------------------------------|----------------------------------------------|---------------------------------------------------------------------------------------------|
| PostgreSQL | PostgreSQL License (BSD-like)                | No — all features included                   | None. Fully permissive.                                                                     |
| MySQL      | GPLv2 (Community) / Commercial (Enterprise)  | Yes — Thread Pool, Audit, Encryption, Backup | GPL requires source sharing for derivative works, or a commercial license from Oracle.      |
| MariaDB    | GPLv2 (Server) / BSL (MaxScale, some tools)  | Sort of — MaxScale and some features under BSL | BSL converts to GPL after a set period but restricts production use initially.            |
| MongoDB    | SSPL                                         | Yes (Atlas features)                         | SSPL requires open-sourcing the entire service stack if offered as a service.               |

Commercial Support Options

While PostgreSQL itself is free, commercial support is available from several companies. This is relevant for enterprises that need SLA-backed support, consulting, or managed services.

Enterprise EDB (EnterpriseDB)

The largest PostgreSQL company. Offers EDB Postgres Advanced Server (Oracle compatibility layer), 24/7 support, managed services, and migration tooling. Acquired 2ndQuadrant in 2020. Major contributor to PostgreSQL core.

Enterprise Crunchy Data

Kubernetes-native PostgreSQL platform. Crunchy Certified PostgreSQL, PGO operator, pgBackRest, pgBouncer, pgMonitor. Strong in US government and regulated industries. Open-source-first approach.

Support Percona

Database-agnostic support company. Offers Percona Distribution for PostgreSQL (packaged with extensions and tools), monitoring (PMM), backup, and 24/7 support. Good choice if you also run MySQL/MongoDB.

Managed Cloud Providers

AWS RDS/Aurora, Google CloudSQL/AlloyDB, Azure Database for PostgreSQL. Fully managed with automated HA, backups, and patching. Higher cost, less control, but zero operational burden for the database layer.

Recommendation

The PostgreSQL License means you will never face a licensing surprise, an audit, or a forced migration due to license changes. This is a strategic advantage over MySQL (Oracle), MongoDB (SSPL), and Redis (relicensed to RSALv2/SSPLv1 at 7.4, then AGPLv3 added in 8.0 — two license changes in as many years). When presenting database options to customers, lead with this.

13

Consultant's Checklist

Before deploying PostgreSQL for a customer, work through these questions. They'll surface requirements that drive architecture decisions and prevent post-deployment surprises.

  1. Data volume & growth rate? — Current database size, expected growth per month/year. This drives storage, partitioning, and backup strategy decisions. Terabyte databases need different tooling than 50 GB ones.
  2. HA requirements? — What's the acceptable downtime? RPO (data loss tolerance) and RTO (recovery time). Zero RPO requires synchronous replication. Sub-minute RTO requires Patroni with automated failover.
  3. Connection patterns? — Peak concurrent connections, connection duration (short-lived HTTP handlers vs long-lived batch jobs). This determines whether you need PgBouncer and how to size pools.
  4. Read/write ratio? — Read-heavy workloads benefit from read replicas. Write-heavy workloads need careful WAL tuning, connection pooling, and potentially Citus for horizontal scaling.
  5. Compliance requirements? — SOC 2, HIPAA, PCI-DSS, GDPR. These drive encryption at rest, SSL enforcement, pgAudit, row-level security, and data residency decisions.
  6. Backup & recovery targets? — RPO/RTO numbers, retention requirements, offsite storage mandates. pgBackRest to S3 with PITR is the standard, but some customers need backup to specific providers.
  7. Deployment platform? — Bare metal, VMs, Kubernetes, or managed service? Team expertise matters more than technical merits. If they don't have K8s skills, don't deploy on K8s.
  8. Upgrade strategy? — How will you handle major version upgrades? pg_upgrade with downtime window, or logical replication for near-zero downtime? How often will you apply minor patches?
  9. Extensions needed? — PostGIS, pgvector, TimescaleDB, pg_cron? This affects deployment model — managed services may not support all extensions. Verify compatibility before committing.
  10. Monitoring stack? — What's the existing observability platform? Prometheus/Grafana, Datadog, New Relic? Ensure postgres_exporter or the provider's integration is in place from day one.
  11. Team expertise? — Does the team have PostgreSQL operational experience? If not, budget for training, runbooks, and potentially commercial support (EDB, Crunchy Data, Percona).
  12. Migration source? — Coming from MySQL, Oracle, SQL Server? Plan for schema differences, stored procedure rewrites, and data type mapping. EDB has Oracle compatibility tools. pgLoader handles MySQL migrations.

Info

The most common mistake in PostgreSQL deployments is underestimating operational complexity. PostgreSQL is not "install and forget." Autovacuum, bloat management, connection pooling, backup testing, and upgrade planning all require ongoing attention. Budget for operations, not just initial deployment.