PostgreSQL
Production architecture, high availability, performance tuning & operational best practices
Overview
PostgreSQL is an advanced, open-source object-relational database management system with nearly 40 years of active development (origins dating to 1986 at UC Berkeley). It has earned a reputation for reliability, data integrity, and correctness, making it the default choice for most greenfield projects that need a relational database. As of early 2026, PostgreSQL 18 is the current major release (September 2025), with PostgreSQL 17 widely deployed in production.
What sets PostgreSQL apart from other relational databases is its extensibility. You can define your own data types, build custom functions in multiple languages (PL/pgSQL, PL/Python, PL/Perl, C), create custom index types, and even write extensions that fundamentally change how the database behaves. This extensibility is why PostgreSQL has become the foundation for specialized databases like TimescaleDB (time-series), Citus (distributed), and pgvector (AI/ML embeddings).
Core Strengths
MVCC Concurrency
Multi-Version Concurrency Control means readers never block writers and writers never block readers. Each transaction sees a consistent snapshot of the database. No read locks, no write starvation — this is fundamental to PostgreSQL's performance under concurrent workloads.
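As an illustration, two concurrent sessions against a hypothetical accounts table (the table and values here are assumptions for this sketch, not part of any schema above):

```sql
-- Session A: take a repeatable-read snapshot
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;   -- suppose this returns 500

-- Session B, concurrently, commits an update without waiting on A:
--   UPDATE accounts SET balance = balance - 100 WHERE id = 1;
--   COMMIT;

-- Session A still sees its snapshot; the reader was never blocked
SELECT balance FROM accounts WHERE id = 1;   -- still 500 inside this snapshot
COMMIT;
```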
SQL Standards Compliance
PostgreSQL implements a large portion of the SQL:2023 standard. CTEs, window functions, lateral joins, MERGE, JSON path queries, generated columns — features that other databases either lack or implement partially.
ACID Data Integrity
Full ACID compliance with Write-Ahead Logging (WAL). Serializable isolation level available. Foreign keys, check constraints, exclusion constraints, and triggers enforce data integrity at the database level.
EXTENSIBLE Ecosystem
Over 1,000 extensions available. PostGIS (geospatial), pgvector (embeddings), TimescaleDB (time-series), pg_cron (scheduling), pgAudit (audit logging). Extensions can add data types, operators, index methods, and even custom storage engines.
When to Choose PostgreSQL
PostgreSQL is the right choice for most workloads where you need a relational database. It excels at OLTP, handles analytical queries well (especially with partitioning and parallel query), supports JSON/JSONB for semi-structured data, and has native full-text search. The only cases where you might look elsewhere: extremely high write throughput with relaxed consistency (consider Cassandra/ScyllaDB), pure key-value at massive scale (consider Redis/DynamoDB), or when your team is deeply invested in another ecosystem (SQL Server/.NET, Oracle/Java EE).
If you're starting a new project and need a relational database, default to PostgreSQL unless you have a specific, documented reason not to. The ecosystem, community, and operational tooling are unmatched in the open-source world.
Architecture
Understanding PostgreSQL's process model and memory architecture is essential for proper tuning and troubleshooting. Unlike MySQL's thread-per-connection model, PostgreSQL uses a process-per-connection model — each client connection gets its own operating system process. (Note: as of PG 15, the stats collector process was removed and statistics are now stored in shared memory.)
Process Model
The postmaster is the parent process. It listens for incoming connections and forks a new backend process for each one. This model is simple and robust — a crash in one backend doesn't bring down others — but it means each connection consumes real OS memory (typically 5-10 MB per idle connection, more under load). This is why connection pooling is critical in production.
Shared Memory
All backend processes share a region of memory managed by PostgreSQL:
- Shared Buffers — The main data cache. Pages read from disk are cached here. Default is 128 MB, but production systems typically set this to 25% of total RAM (up to about 8-16 GB, beyond which returns diminish).
- WAL Buffers — Buffer for Write-Ahead Log entries before they're flushed to disk. Default auto-sized to 1/32 of shared_buffers, usually 16 MB is sufficient.
- Lock Tables — Tracking all active locks across transactions.
- Proc Array — Information about all active backends, used for MVCC visibility checks.
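To see what is actually occupying shared buffers, the pg_buffercache extension (shipped in contrib) can be queried; a sketch, assuming the default 8 kB page size:

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Which relations occupy the most shared buffers?
SELECT c.relname,
       count(*) AS buffers,
       pg_size_pretty(count(*) * 8192) AS buffered
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```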
Write-Ahead Logging (WAL)
Every change to data is first written to the WAL before being applied to the actual data files. This ensures durability — if the server crashes, PostgreSQL replays the WAL to recover to a consistent state. WAL is also the foundation for replication (streaming replication sends WAL records to replicas) and Point-in-Time Recovery (PITR).
# WAL files live in pg_wal/ (formerly pg_xlog/)
# Each segment is 16 MB by default
ls -la /var/lib/postgresql/18/main/pg_wal/
# Watch WAL generation rate via the pg_stat_wal view (PostgreSQL 14+)
sudo -u postgres psql -c "SELECT * FROM pg_stat_wal;"
MVCC Internals
PostgreSQL's MVCC implementation stores old row versions directly in the table (heap). When a row is updated, the old version is marked with the transaction ID that invalidated it, and a new version is inserted. This means:
- Updates are essentially delete + insert (table bloat is a real concern)
- VACUUM is required to reclaim space from dead tuples
- Long-running transactions prevent VACUUM from cleaning up, leading to table bloat
- The visibility map tracks which pages contain only all-visible tuples (enables index-only scans)
Long-running transactions are one of the most common causes of production incidents in PostgreSQL. A single forgotten BEGIN without a COMMIT can prevent autovacuum from cleaning up dead tuples across the entire database, eventually leading to table bloat, degraded performance, and even transaction ID wraparound.
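A query along these lines surfaces the oldest open transactions so they can be investigated, or terminated if they are blocking vacuum:

```sql
-- Find the longest-open transactions
SELECT pid, usename, state,
       now() - xact_start AS xact_age,
       left(query, 60) AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;

-- Terminate a runaway backend if necessary:
-- SELECT pg_terminate_backend(<pid>);
```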
Tablespaces
Tablespaces allow you to place specific tables or indexes on different physical storage. Useful for putting hot indexes on fast NVMe while keeping large, infrequently accessed tables on cheaper storage. In practice, most cloud deployments use a single tablespace since the storage layer handles tiering.
-- Create a tablespace on fast storage
CREATE TABLESPACE fast_ssd LOCATION '/mnt/nvme/pg_data';
-- Move a table to the fast tablespace
ALTER TABLE orders SET TABLESPACE fast_ssd;
-- Move an index
ALTER INDEX orders_created_idx SET TABLESPACE fast_ssd;
Deployment Models
How you deploy PostgreSQL has a massive impact on operational complexity, performance, and cost. There's no single right answer — it depends on your team's capabilities, compliance requirements, and budget.
Deployment Comparison
| Model | Pros | Cons | Best For |
|---|---|---|---|
| Bare Metal | Maximum performance, full control, no hypervisor overhead | Hardware procurement, no live migration, capacity planning is manual | High-performance OLTP, financial systems, latency-critical workloads |
| VMs | Familiar, snapshotting, live migration, good isolation | Some overhead, storage I/O can be a bottleneck, right-sizing is tricky | Most production workloads, traditional enterprise environments |
| Containers (K8s) | Declarative, GitOps-friendly, easy dev/staging parity | Stateful workloads on K8s are hard, storage performance, networking complexity | Dev/staging, teams already deep in K8s, with operators like CloudNativePG |
| Managed (RDS, CloudSQL) | Automated backups, patching, HA, monitoring out of the box | Higher cost, limited extension support, less tuning control, vendor lock-in | Small teams, startups, when you don't have a dedicated DBA |
Self-Hosted vs Managed
The decision to self-host vs use a managed service comes down to team expertise and control requirements. If your team has strong PostgreSQL operational knowledge and you need full control over extensions, configuration, and upgrade timing, self-host. If you're a small team without a DBA, managed services like AWS RDS, Google CloudSQL, or Azure Database for PostgreSQL handle the undifferentiated heavy lifting.
Managed services often lag behind upstream PostgreSQL by several months. As of early 2026, RDS supports PostgreSQL through version 18.3, but new major versions typically take a few months to become available after upstream release. If you need day-one access to the latest features or specific extensions not yet supported by your provider, self-hosting is the only option.
Kubernetes Considerations
Running PostgreSQL on Kubernetes is increasingly viable but requires careful planning. The key challenges are storage performance, data durability, and operational tooling. Use a purpose-built operator:
Recommended CloudNativePG
CNCF Sandbox project. Declarative PostgreSQL clusters, automated failover, backup to S3/GCS, rolling updates. The most mature K8s-native operator. Doesn't rely on Patroni; leader election and failover are coordinated through the Kubernetes API itself.
Alternative Crunchy PGO
By Crunchy Data (v5). Uses Patroni for HA. Supports pgBackRest for backups, pgBouncer sidecar for connection pooling. Well-documented, production-tested. Open source (Apache 2.0).
Never run production PostgreSQL on Kubernetes with emptyDir or hostPath volumes. Always use persistent volumes backed by a proper storage class (e.g., gp3 on AWS, pd-ssd on GCP). Ensure your storage class has allowVolumeExpansion: true — you will need to grow volumes.
Installation on Ubuntu/Debian
Always use the official PGDG (PostgreSQL Global Development Group) repository for the latest stable packages:
# Add the official PostgreSQL repository
sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
# Install PostgreSQL 18
sudo apt install -y postgresql-18
# Verify
pg_lsclusters
sudo -u postgres psql -c "SELECT version();"
High Availability
PostgreSQL does not have built-in automatic failover. It provides the replication primitives — streaming replication and logical replication — but you need external tooling to detect failures and promote a standby. This is by design: the database doesn't try to be a distributed consensus system.
Streaming Replication
Physical (streaming) replication ships WAL records from the primary to one or more standby servers. The standby continuously applies these records, maintaining an exact byte-for-byte copy of the primary. Standby servers can serve read-only queries (hot standby).
# postgresql.conf on primary
wal_level = replica # minimal, replica, or logical
max_wal_senders = 10 # max number of standby connections
wal_keep_size = 1GB # retain WAL for slow standbys (PG 13+)
synchronous_standby_names = '' # empty = async, set for sync replication
# pg_hba.conf on primary
host replication replicator 10.0.0.0/24 scram-sha-256
# On the standby server
sudo -u postgres pg_basebackup \
-h primary.example.com \
-D /var/lib/postgresql/18/main \
-U replicator -Fp -Xs -P -R
# -R creates standby.signal and sets primary_conninfo in postgresql.auto.conf
Synchronous vs Asynchronous Replication
| Aspect | Asynchronous | Synchronous |
|---|---|---|
| Data Loss Risk | Possible (transactions committed on primary but not yet shipped) | Zero data loss (RPO = 0) |
| Commit Latency | Local disk only | Must wait for standby acknowledgment (network RTT added) |
| Availability Impact | None — primary doesn't depend on standby | If all sync standbys are down, writes block on primary |
| Use Case | Most production deployments, read replicas | Financial systems, regulatory requirements for zero data loss |
With synchronous replication, if all synchronous standbys go offline, the primary will stop accepting writes entirely. Always configure at least two synchronous standbys or use synchronous_standby_names = 'ANY 1 (standby1, standby2, standby3)' to require only one of multiple standbys to acknowledge.
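A minimal postgresql.conf fragment for quorum-based synchronous commit (the standby names are placeholders for your own application_name values):

```ini
# Any one of the three named standbys must acknowledge before commit returns
synchronous_standby_names = 'ANY 1 (standby1, standby2, standby3)'
synchronous_commit = on    # on = wait for WAL flush on the chosen standby
```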
HA Tooling Comparison
Recommended Patroni
The industry standard for PostgreSQL HA. Uses a distributed consensus store (etcd, ZooKeeper, or Consul) for leader election. Handles automatic failover, planned switchover, and configuration management. Created by Zalando, battle-tested at massive scale.
- REST API for management & monitoring
- Supports synchronous replication management
- Configurable failover conditions
- Integrates with HAProxy or PgBouncer for connection routing
Alternative pg_auto_failover
By Citus/Microsoft. Simpler than Patroni — uses a "monitor" node instead of an external DCS. Good for smaller deployments where running etcd is overkill. Supports synchronous replication and read replicas. Limited to primary-standby topology.
Legacy repmgr
Mature tool, widely deployed. Handles replication setup, monitoring, and manual/automatic failover. Uses repmgrd daemon for automatic failover. Less opinionated than Patroni — requires more manual configuration. Community support has declined as Patroni gained dominance.
Cloud Managed HA
RDS Multi-AZ, CloudSQL HA, Azure HA: all handle failover automatically using storage-level replication. Failover takes 30-120 seconds. DNS-based endpoint switching. No topology control — you get what the provider gives you.
Patroni Architecture
Split-Brain Prevention
Split-brain occurs when two nodes both believe they are the primary. Patroni prevents this through its DCS (distributed consensus store) — the leader key has a TTL, and the primary must continuously renew its lease. If it can't reach the DCS, it demotes itself. Additionally, configure watchdog support in Patroni to ensure the old primary's PostgreSQL is actually stopped before the new primary starts accepting writes.
Always run an odd number of DCS nodes (3 or 5) to maintain quorum. Place them in different failure domains (different racks, different AZs). The DCS cluster is the foundation of your HA — if it goes down, Patroni can't perform failover.
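A minimal patroni.yml sketch tying the pieces above together (cluster name, addresses, and etcd endpoints are placeholders; consult the Patroni documentation for a complete configuration):

```yaml
scope: pg-cluster            # cluster name, shared by all members
name: node1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.0.11:8008

etcd3:
  hosts: 10.0.1.1:2379,10.0.1.2:2379,10.0.1.3:2379

bootstrap:
  dcs:
    ttl: 30                  # leader lease TTL; primary demotes if it can't renew
    loop_wait: 10
    retry_timeout: 10
    postgresql:
      use_pg_rewind: true

postgresql:
  data_dir: /var/lib/postgresql/18/main
  listen: 0.0.0.0:5432
  connect_address: 10.0.0.11:5432

watchdog:
  mode: automatic            # fence the old primary via /dev/watchdog
  device: /dev/watchdog
```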
Connection Pooling
PostgreSQL's process-per-connection model means each connection consumes real memory and OS resources. A typical backend process uses 5-10 MB at idle and can grow to hundreds of MB under load. Without connection pooling, applications that open hundreds or thousands of connections will overwhelm the database — not through query load, but through sheer resource consumption.
Why You Need a Pooler
Most applications don't need to hold a dedicated database connection for every request. A web application serving 1,000 concurrent requests might only need 50-100 actual database connections if queries are fast. A connection pooler multiplexes application connections onto a smaller pool of real PostgreSQL connections.
# Check current connection count
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"
# See connection limits
sudo -u postgres psql -c "SHOW max_connections;"
# Default: 100. Many production systems set 200-500.
# Beyond that, you almost certainly need a pooler.
Pooler Comparison
| Feature | PgBouncer | Pgpool-II | Odyssey |
|---|---|---|---|
| Architecture | Lightweight, single-process, event-driven | Multi-process, feature-rich middleware | Multi-threaded, advanced routing |
| Memory Usage | ~2 KB per connection | ~100 KB per connection | Moderate (multi-threaded) |
| Connection Modes | Transaction, session, statement | Session only | Transaction, session |
| Load Balancing | No (pair with HAProxy) | Yes (read query routing) | No |
| Prepared Statements | Protocol-level only (PgBouncer 1.21+) | Full support | Full support |
| Best For | Most production deployments | Read query load balancing | High-concurrency, multi-threaded workloads |
PgBouncer Configuration
# /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
# Pool mode: transaction is recommended for most apps
pool_mode = transaction
# Pool sizing
default_pool_size = 25 # connections per user/database pair
min_pool_size = 5 # pre-warm this many connections
reserve_pool_size = 5 # extra connections for burst traffic
reserve_pool_timeout = 3 # seconds before using reserve pool
max_client_conn = 1000 # total client connections PgBouncer accepts
max_db_connections = 100 # max real connections to PostgreSQL
# Timeouts
server_idle_timeout = 600 # close idle server connections after 10 min
client_idle_timeout = 0 # 0 = no timeout for idle clients
query_timeout = 0 # 0 = no query timeout (handle in app)
# Logging
log_connections = 0
log_disconnections = 0
stats_period = 60
Pooling Modes Explained
Recommended Transaction Pooling
Connection is assigned to a client for the duration of a transaction, then returned to the pool. Most efficient mode. However, session-level features like SET, LISTEN/NOTIFY, prepared statements (without protocol-level support), and advisory locks won't work as expected.
Caution Session Pooling
Connection is held for the entire client session. No multiplexing benefit — you still need one real connection per client. Only useful for limiting total connections and providing a stable endpoint.
In transaction pooling mode, SET search_path, SET statement_timeout, and other session-level settings will leak between clients. Use SET LOCAL (which resets at transaction end) or pass settings per-query. This is the number one source of bugs when adopting PgBouncer.
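For example, using SET LOCAL so a per-request setting cannot outlive the transaction and leak to another pooled client (the orders table is from the examples above):

```sql
BEGIN;
SET LOCAL statement_timeout = '5s';   -- visible only inside this transaction
SELECT count(*) FROM orders;          -- runs under the 5s timeout
COMMIT;                               -- the setting reverts here automatically
```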
Backup & Recovery
Backups are your last line of defense. A PostgreSQL deployment without tested, automated backups is a disaster waiting to happen. The strategy you choose depends on your RPO (Recovery Point Objective), RTO (Recovery Time Objective), and database size.
Backup Methods Comparison
| Method | Type | RPO | RTO | Database Size |
|---|---|---|---|---|
| pg_dump | Logical (SQL or custom format) | Time of last dump | Hours (proportional to DB size) | < 100 GB |
| pg_basebackup | Physical (full cluster copy) | Time of last backup | Minutes to hours | Any |
| PITR (WAL archiving) | Physical + continuous WAL | Near-zero (last WAL segment) | Minutes to hours | Any |
| pgBackRest | Physical + incremental + PITR | Near-zero | Minutes | Any (optimized for large DBs) |
pgBackRest — The Production Standard
pgBackRest is the go-to backup tool for production PostgreSQL. It supports full, differential, and incremental backups, parallel backup and restore, backup verification, and S3/GCS/Azure storage. If you're running PostgreSQL in production and not using pgBackRest, you should switch.
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-type=s3
repo1-s3-bucket=mycompany-pg-backups
repo1-s3-region=us-east-1
repo1-s3-endpoint=s3.amazonaws.com
repo1-path=/pgbackrest
repo1-retention-full=4
repo1-retention-diff=14
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=your-encryption-passphrase
process-max=4
compress-type=zst
compress-level=3
[mydb]
pg1-path=/var/lib/postgresql/18/main
pg1-port=5432
# Create stanza (one-time setup)
sudo -u postgres pgbackrest --stanza=mydb stanza-create
# Full backup
sudo -u postgres pgbackrest --stanza=mydb --type=full backup
# Incremental backup (daily)
sudo -u postgres pgbackrest --stanza=mydb --type=incr backup
# Verify backups (checks backup integrity without full restore)
sudo -u postgres pgbackrest --stanza=mydb verify
# Restore to a point in time
sudo -u postgres pgbackrest --stanza=mydb \
--type=time "--target=2026-03-19 14:30:00" \
--target-action=promote restore
WAL Archiving for PITR
# postgresql.conf
archive_mode = on
archive_command = 'pgbackrest --stanza=mydb archive-push %p'
# Or without pgbackrest:
# archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
A backup you haven't tested restoring is not a backup — it's a hope. Schedule regular restore tests. At minimum, do a monthly automated restore to a staging environment and verify data integrity. pgBackRest's verify command checks backup consistency but does not validate that the restored database is functionally correct.
Backup Automation
# Example cron schedule
# Full backup every Sunday at 2 AM
0 2 * * 0 postgres pgbackrest --stanza=mydb --type=full backup
# Incremental backup Monday through Saturday at 2 AM
0 2 * * 1-6 postgres pgbackrest --stanza=mydb --type=incr backup
# Verify backup integrity daily at 6 AM
0 6 * * * postgres pgbackrest --stanza=mydb verify
Always encrypt backups at rest, especially when storing them in object storage. pgBackRest supports AES-256-CBC encryption natively. Store the encryption passphrase in a secrets manager (Vault, AWS Secrets Manager), not in the pgbackrest.conf file.
Performance Tuning
PostgreSQL's default configuration is intentionally conservative — it's designed to run on a system with 512 MB of RAM. For production, you need to tune the key memory parameters, understand query execution plans, choose the right indexes, and keep autovacuum healthy.
Essential Memory Parameters
| Parameter | Default | Production Guideline | Notes |
|---|---|---|---|
| shared_buffers | 128 MB | 25% of RAM (max ~16 GB) | Main data cache. Beyond 16 GB, OS page cache is more efficient. |
| effective_cache_size | 4 GB | 50-75% of RAM | Not allocated — tells the planner how much cache is likely available (shared_buffers + OS cache). |
| work_mem | 4 MB | 32-256 MB (careful!) | Per-sort/hash operation, per query. A complex query can use multiple work_mem allocations. Set too high and you OOM. |
| maintenance_work_mem | 64 MB | 512 MB - 2 GB | For VACUUM, CREATE INDEX, ALTER TABLE. Higher = faster maintenance operations. |
| wal_buffers | -1 (auto) | 64 MB | Auto-sized to 1/32 of shared_buffers. 64 MB is safe for write-heavy workloads. |
| max_connections | 100 | 200-500 (with pooler) | Each connection = ~5-10 MB. Use PgBouncer and keep this modest. |
# postgresql.conf - Example for a 64 GB RAM server
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 128MB
maintenance_work_mem = 2GB
wal_buffers = 64MB
max_connections = 300
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 16
EXPLAIN ANALYZE
The single most important skill for PostgreSQL performance troubleshooting. EXPLAIN ANALYZE executes the query and shows the actual execution plan with real timing and row counts.
-- Always use ANALYZE to see actual vs estimated rows
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > '2026-01-01'
AND o.status = 'completed'
ORDER BY o.total DESC
LIMIT 50;
-- Key things to look for:
-- 1. Seq Scan on large tables (missing index?)
-- 2. actual rows vs estimated rows mismatch (stale statistics?)
-- 3. Sort with external merge (work_mem too low?)
-- 4. Nested Loop with high loop count (missing index on join key?)
Index Strategies
Default B-tree
The default index type. Supports equality and range queries. Covers 90% of use cases. Supports multicolumn indexes (put high-selectivity columns first). Use INCLUDE to create covering indexes for index-only scans.
Specialized GIN
Generalized Inverted Index. For full-text search, JSONB operators (@>, ?), arrays, and trigram similarity. Slower to update but very fast for lookups on multi-valued columns.
Specialized GiST
Generalized Search Tree. For geometric data, range types, full-text search (ts_vector), and PostGIS geospatial queries. Supports nearest-neighbor searches (ORDER BY <->).
Large Tables BRIN
Block Range Index. Tiny index size for naturally ordered data (timestamps, sequential IDs). A BRIN index on a 100 GB table might be only a few MB. Perfect for time-series append-only tables.
-- Covering index for index-only scans
CREATE INDEX idx_orders_status_created
ON orders (status, created_at DESC)
INCLUDE (total, customer_id);
-- Partial index (only index what you query)
CREATE INDEX idx_orders_active
ON orders (created_at DESC)
WHERE status = 'active';
-- BRIN for time-series data
CREATE INDEX idx_events_ts_brin
ON events USING brin (created_at)
WITH (pages_per_range = 32);
-- GIN for JSONB queries
CREATE INDEX idx_metadata_gin
ON documents USING gin (metadata jsonb_path_ops);
Vacuum & Autovacuum Tuning
VACUUM is not optional — it's a critical maintenance operation that reclaims space from dead tuples and updates visibility maps. Autovacuum runs this automatically, but the defaults are often too conservative for busy tables.
# postgresql.conf - Aggressive autovacuum for busy systems
autovacuum_max_workers = 6 # default 3
autovacuum_naptime = 15s # default 1min
autovacuum_vacuum_threshold = 25 # default 50
autovacuum_vacuum_scale_factor = 0.01 # default 0.2 (triggers at 1% dead tuples)
autovacuum_analyze_threshold = 25
autovacuum_analyze_scale_factor = 0.005
autovacuum_vacuum_cost_delay = 2ms # default 2ms (PG 12+)
autovacuum_vacuum_cost_limit = 1000 # default -1 (uses vacuum_cost_limit = 200)
If autovacuum falls behind on a high-write table, dead tuples accumulate, table bloat grows, and eventually you risk transaction ID wraparound, at which point PostgreSQL stops accepting new commands until an emergency VACUUM completes. Monitor pg_stat_user_tables.n_dead_tup and pg_stat_user_tables.last_autovacuum for every table.
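To spot tables where autovacuum is falling behind, a query over pg_stat_user_tables such as this (the alert threshold on dead_pct is up to you):

```sql
SELECT relname,
       n_dead_tup,
       n_live_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
       last_autovacuum,
       last_vacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```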
Table Partitioning
Declarative partitioning (PostgreSQL 10+) splits a large table into smaller physical tables based on a partition key. This improves query performance through partition pruning, makes maintenance operations faster (VACUUM on a smaller table), and enables efficient data lifecycle management (drop old partitions instead of deleting rows).
-- Range partitioning by date
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY,
event_type text NOT NULL,
payload jsonb,
created_at timestamptz NOT NULL
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
-- Automate with pg_partman
CREATE EXTENSION pg_partman;
SELECT partman.create_parent(
p_parent_table := 'public.events',
p_control := 'created_at',
p_interval := '1 month',
p_premake := 3
);
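Partition pruning can be verified with EXPLAIN against the events table defined above; only the matching partition should appear in the plan:

```sql
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM events
WHERE created_at >= '2026-02-01'
  AND created_at <  '2026-02-10';
-- The plan should scan events_2026_02 only
```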
Security
PostgreSQL security operates at multiple layers: network access control (pg_hba.conf), authentication, encryption in transit (SSL/TLS), authorization (roles and privileges), row-level security, and audit logging. A production deployment should address all of these.
Authentication — pg_hba.conf
The pg_hba.conf file controls who can connect, from where, and how they authenticate. Rules are evaluated top-to-bottom; the first match wins.
# pg_hba.conf - Production example
# TYPE DATABASE USER ADDRESS METHOD
# Local socket connections
local all postgres peer
local all all scram-sha-256
# SSL-only remote connections
hostssl myapp app_user 10.0.0.0/24 scram-sha-256
hostssl all admin 10.0.1.0/24 scram-sha-256
# Replication (SSL required)
hostssl replication replicator 10.0.0.0/24 scram-sha-256
# Reject everything else
host all all 0.0.0.0/0 reject
Never use trust or md5 authentication in production. trust allows connections without a password. md5 is vulnerable to pass-the-hash attacks. Use scram-sha-256 (default for password_encryption since PostgreSQL 14). Set password_encryption = scram-sha-256 in postgresql.conf and update pg_hba.conf entries from md5 to scram-sha-256.
SSL/TLS Configuration
# postgresql.conf
ssl = on
ssl_cert_file = '/etc/postgresql/server.crt'
ssl_key_file = '/etc/postgresql/server.key'
ssl_ca_file = '/etc/postgresql/ca.crt' # for client certificate verification
ssl_min_protocol_version = 'TLSv1.3' # PG 12+
ssl_ciphers = 'HIGH:!aNULL:!MD5'
Roles & Privileges
Follow the principle of least privilege. Create separate roles for applications, read-only users, and administrators. Never let applications connect as the postgres superuser.
-- Create application role with minimal privileges
CREATE ROLE app_user WITH LOGIN PASSWORD 'strong-password-here';
GRANT CONNECT ON DATABASE myapp TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO app_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_user;
-- Read-only role for reporting
CREATE ROLE readonly WITH LOGIN PASSWORD 'another-strong-password';
GRANT CONNECT ON DATABASE myapp TO readonly;
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO readonly;
Row-Level Security (RLS)
RLS allows you to restrict which rows a user can see or modify based on their role or session properties. Essential for multi-tenant applications.
-- Enable RLS on a multi-tenant table
ALTER TABLE customer_data ENABLE ROW LEVEL SECURITY;
-- Policy: users can only see their own tenant's data
CREATE POLICY tenant_isolation ON customer_data
USING (tenant_id = current_setting('app.tenant_id')::int);
-- Force RLS even for table owners
ALTER TABLE customer_data FORCE ROW LEVEL SECURITY;
-- Set tenant context in application
SET app.tenant_id = '42';
Audit Logging with pgAudit
# postgresql.conf
shared_preload_libraries = 'pgaudit'
pgaudit.log = 'ddl, role, write' # log DDL, role changes, and write operations
pgaudit.log_catalog = off # don't log system catalog queries
pgaudit.log_relation = on # log table names, not just commands
pgaudit.log_parameter = on # log query parameters
For compliance (SOC 2, HIPAA, PCI-DSS), pgAudit is a must-have. Ship audit logs to a centralized logging system (ELK, Loki, Splunk) where they can't be tampered with by database administrators. Include pgaudit.log_parameter = on carefully — it logs query parameters which may contain PII.
Upgrades
PostgreSQL releases a new major version annually (typically September) and minor versions quarterly (February, May, August, November). Major upgrades (e.g., 17 to 18) require a migration — the on-disk data format changes. Minor upgrades (e.g., 18.1 to 18.3) are always in-place and binary-compatible.
Upgrade Methods
Recommended pg_upgrade (In-Place)
The standard upgrade method. Copies or hard-links data files and rewrites system catalogs. Downtime is typically minutes to hours depending on database size (with --link mode, it's minutes even for terabyte databases). Run pg_upgrade --check first to validate compatibility.
- Fast with --link mode (hard links instead of copying)
- Requires both old and new PostgreSQL binaries installed
- All extensions must be compatible with the new version
- Run ANALYZE on all databases after upgrade to rebuild statistics
Zero-Downtime Logical Replication
Set up logical replication from the old version to a new-version instance. Once caught up, switch application connections. Near-zero downtime, but more setup and validation: sequences, large objects, and DDL don't replicate, so plan for them and test thoroughly.
- Supports cross-version replication (PG 14+ to 18)
- Application cutover in seconds
- More complex to set up and validate
- Need to handle sequences and schema changes manually
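A sketch of the cutover flow (hostnames, publication and subscription names are placeholders; the old cluster must run with wal_level = logical, and the schema is loaded on the new cluster first via pg_dump --schema-only):

```sql
-- On the old (publisher) cluster
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new cluster, after loading the schema
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-primary.example.com dbname=myapp user=replicator'
    PUBLICATION upgrade_pub;

-- Watch initial copy and streaming progress
SELECT subname, received_lsn, latest_end_lsn FROM pg_stat_subscription;

-- Before cutover: sequences don't replicate, so advance them manually, e.g.
SELECT setval('orders_id_seq', (SELECT max(id) FROM orders));
```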
pg_upgrade Workflow
# 1. Install new PostgreSQL version alongside old
sudo apt install postgresql-18
# 2. Stop both clusters
sudo systemctl stop postgresql@17-main
sudo systemctl stop postgresql@18-main
# 3. Run upgrade check (dry run)
sudo -u postgres /usr/lib/postgresql/18/bin/pg_upgrade \
--old-datadir=/var/lib/postgresql/17/main \
--new-datadir=/var/lib/postgresql/18/main \
--old-bindir=/usr/lib/postgresql/17/bin \
--new-bindir=/usr/lib/postgresql/18/bin \
--check
# 4. Run actual upgrade (--link for speed)
sudo -u postgres /usr/lib/postgresql/18/bin/pg_upgrade \
--old-datadir=/var/lib/postgresql/17/main \
--new-datadir=/var/lib/postgresql/18/main \
--old-bindir=/usr/lib/postgresql/17/bin \
--new-bindir=/usr/lib/postgresql/18/bin \
--link
# 5. Start new cluster
sudo systemctl start postgresql@18-main
# 6. Rebuild optimizer statistics (critical!)
sudo -u postgres /usr/lib/postgresql/18/bin/vacuumdb \
--all --analyze-in-stages
# 7. After verifying, remove old cluster
sudo pg_dropcluster 17 main
With --link mode, the old cluster becomes unusable after upgrade — the data files are shared. If the upgrade fails or you need to rollback, you must restore from backup. Always take a full backup before running pg_upgrade.
Minor Version Upgrades
Minor upgrades are straightforward and should be applied promptly — they contain security fixes and bug fixes. No data migration is needed.
# Minor upgrade (e.g., 18.1 → 18.3)
sudo apt update
sudo apt upgrade postgresql-18
sudo systemctl restart postgresql@18-main
# Verify
sudo -u postgres psql -c "SELECT version();"
Never skip minor version upgrades for extended periods. PostgreSQL minor releases fix bugs that can cause data corruption, security vulnerabilities, and crashes. Subscribe to the PostgreSQL security mailing list and apply patches within your maintenance window.
Monitoring
PostgreSQL exposes a wealth of internal statistics through its catalog views. The key is knowing which metrics matter, setting up proper alerting, and correlating database metrics with application-level observability.
Essential Statistics Views
Queries pg_stat_statements
The most important extension for performance analysis. Tracks execution statistics for every distinct query: call count, total/mean/min/max time, rows returned, block hits/reads. Requires shared_preload_libraries = 'pg_stat_statements'.
Sessions pg_stat_activity
Shows currently running queries, idle connections, waiting processes, and lock info. Essential for debugging slow queries and connection issues. Check state, wait_event_type, query_start.
Tables pg_stat_user_tables
Per-table stats: sequential scans, index scans, rows inserted/updated/deleted, dead tuples, last vacuum/analyze time. Use this to find tables missing indexes (high seq_scan count) or needing vacuum.
Replication pg_stat_replication
Shows connected standbys, their WAL positions, and replication lag. The replay_lag column (PG 10+) gives human-readable lag. Alert on any lag exceeding your RPO.
Key Metrics to Monitor
-- Cache hit ratio (should be > 99%)
SELECT
sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS ratio
FROM pg_statio_user_tables;
-- Active connections by state
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;
-- Replication lag (on primary)
SELECT client_addr, state, replay_lag
FROM pg_stat_replication;
-- Top 10 slowest queries (requires pg_stat_statements)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Tables with most dead tuples (vacuum candidates)
SELECT schemaname, relname, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
-- Transaction ID age (wraparound risk)
SELECT datname, age(datfrozenxid) AS xid_age,
current_setting('autovacuum_freeze_max_age') AS freeze_max
FROM pg_database
ORDER BY age(datfrozenxid) DESC;
Prometheus + Grafana Stack
The postgres_exporter by Prometheus Community is the standard way to expose PostgreSQL metrics to Prometheus. Combine with Grafana dashboards for visualization and alerting.
# docker-compose.yml snippet for postgres_exporter
postgres-exporter:
image: prometheuscommunity/postgres-exporter:latest
environment:
DATA_SOURCE_NAME: "postgresql://monitor:password@postgres:5432/myapp?sslmode=require"
ports:
- "9187:9187"
-- Create a monitoring role with minimal privileges
CREATE ROLE monitor WITH LOGIN PASSWORD 'monitor-password';
GRANT pg_monitor TO monitor; -- PG 10+ built-in monitoring role
GRANT CONNECT ON DATABASE myapp TO monitor;
Critical Alerts
| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| Replication lag | > 30 seconds | > 5 minutes |
| Connection utilization | > 70% of max_connections | > 90% of max_connections |
| Cache hit ratio | < 99% | < 95% |
| Transaction ID age | > 500M | > 1B (wraparound imminent) |
| Dead tuples per table | > 10M | > 100M (autovacuum is stuck) |
| Disk space | < 20% free | < 10% free |
| Long-running queries | > 5 minutes | > 30 minutes |
Use the community Grafana dashboard ID 9628 (PostgreSQL Database) as a starting point. It covers connection stats, transaction rates, tuple operations, cache ratios, and replication lag out of the box. Customize alerting rules based on your SLOs.
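As a sketch, the replication-lag row from the alert table above can be expressed as a Prometheus alerting rule against postgres_exporter metrics. The metric name (pg_replication_lag_seconds) and thresholds are assumptions to verify against your exporter version — older releases exposed pg_replication_lag instead:

```yaml
# prometheus-rules.yml (illustrative; confirm metric names for your
# postgres_exporter release before deploying)
groups:
  - name: postgresql
    rules:
      - alert: PostgresReplicationLagWarning
        expr: pg_replication_lag_seconds > 30
        for: 2m
        labels:
          severity: warning
      - alert: PostgresReplicationLagCritical
        expr: pg_replication_lag_seconds > 300
        for: 2m
        labels:
          severity: critical
```

The same pattern extends to the other rows: connection utilization from pg_stat_activity counts divided by max_connections, disk space from node_exporter, and so on.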
Extensions & Ecosystem
PostgreSQL's extension system is one of its greatest strengths. Extensions can add new data types, operators, index access methods, functions, and even background workers — all without forking the core project. Some extensions effectively turn PostgreSQL into a specialized database.
Essential Extensions
Geospatial PostGIS
The gold standard for geospatial data in any database. Adds geometry/geography types, spatial indexes (GiST/SP-GiST), and hundreds of spatial functions. If you need geospatial queries, PostGIS is why you choose PostgreSQL over every other database.
AI/ML pgvector
Vector similarity search for AI/ML embeddings. Supports exact and approximate nearest neighbor search (IVFFlat, HNSW indexes). Store embeddings alongside relational data — no need for a separate vector database. Critical for RAG applications.
Time-Series TimescaleDB
Turns PostgreSQL into a time-series database. Automatic partitioning (hypertables), compression, continuous aggregates, and data retention policies. Core is Apache 2.0; community features under Timescale License (TSL, free for non-DBaaS use). Company rebranded to TigerData in 2025.
Distributed Citus
Distributed PostgreSQL by Microsoft. Shards tables across multiple nodes for horizontal scaling. Supports distributed queries, reference tables, and columnar storage. Open source (AGPLv3). Now available as an extension, not just a fork.
Operations pg_cron
Cron-based job scheduler running inside PostgreSQL. Schedule SQL commands, VACUUM, ANALYZE, partition maintenance — without external cron jobs. Simple, reliable, and runs as a background worker.
Audit pgAudit
Provides detailed session and object audit logging. Required for compliance frameworks (SOC 2, HIPAA, PCI-DSS). Logs to the standard PostgreSQL log, so integrate with your existing log pipeline.
Partitioning pg_partman
Automates partition creation and retention for time-based and ID-based partitioning. Pre-creates future partitions and optionally drops or detaches old ones. Works with native declarative partitioning.
CDC Debezium
Change Data Capture using PostgreSQL's logical decoding. Streams row-level changes to Kafka, Pulsar, or other sinks. Requires wal_level = logical and the pgoutput or wal2json plugin. Essential for event-driven architectures.
Foreign Data Wrappers (FDW)
FDWs let you query external data sources as if they were PostgreSQL tables. The postgres_fdw extension (included in core) enables cross-server queries. Third-party FDWs exist for MySQL, MongoDB, Redis, S3/Parquet files, Elasticsearch, and more.
-- Query a remote PostgreSQL server
CREATE EXTENSION postgres_fdw;
CREATE SERVER remote_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'remote.example.com', port '5432', dbname 'analytics');
CREATE USER MAPPING FOR app_user
SERVER remote_server
OPTIONS (user 'remote_reader', password 'secret');
IMPORT FOREIGN SCHEMA public
FROM SERVER remote_server
INTO remote_schema;
-- Now query remote tables as if they were local
SELECT * FROM remote_schema.sales WHERE region = 'EMEA';
Logical Decoding & CDC
# postgresql.conf for logical decoding
wal_level = logical # required (not just 'replica')
max_replication_slots = 10 # one per consumer
max_wal_senders = 10
-- Create a logical replication slot for Debezium
SELECT pg_create_logical_replication_slot('debezium_slot', 'pgoutput');
-- Create a publication for CDC
CREATE PUBLICATION my_publication FOR TABLE orders, customers, products;
Logical replication slots that are not consumed will cause WAL to accumulate indefinitely, eventually filling your disk. Always monitor pg_replication_slots for inactive slots and drop any that are no longer in use: SELECT pg_drop_replication_slot('unused_slot');
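A query along these lines, using pg_wal_lsn_diff against each slot's restart_lsn, shows how much WAL every slot is pinning (run it on the primary):

```sql
-- WAL retained by each replication slot; inactive slots with large
-- retained_wal are the ones that will eventually fill the disk
SELECT slot_name,
       active,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
```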
Licensing
PostgreSQL is released under the PostgreSQL License, a liberal open-source license similar to the BSD and MIT licenses. It permits use, modification, and distribution for any purpose — commercial or otherwise — with no copyleft requirements. There is no "enterprise edition," no dual licensing, and no feature gating. Every feature is available to everyone.
License Comparison
| Database | License | Enterprise Edition? | Key Restrictions |
|---|---|---|---|
| PostgreSQL | PostgreSQL License (BSD-like) | No — all features included | None. Fully permissive. |
| MySQL | GPLv2 (Community) / Commercial (Enterprise) | Yes — Thread Pool, Audit, Encryption, Backup | GPL requires source sharing for derivative works, or buy a commercial license from Oracle. |
| MariaDB | GPLv2 (Server) / BSL (MaxScale, some tools) | Sort of — MaxScale and some features under BSL | BSL converts to GPL after a period, but restricts production use initially. |
| MongoDB | SSPL | Yes (Atlas features) | SSPL requires open-sourcing entire service stack if offering as a service. |
Commercial Support Options
While PostgreSQL itself is free, commercial support is available from several companies. This is relevant for enterprises that need SLA-backed support, consulting, or managed services.
Enterprise EDB (EnterpriseDB)
The largest PostgreSQL company. Offers EDB Postgres Advanced Server (Oracle compatibility layer), 24/7 support, managed services, and migration tooling. Acquired 2ndQuadrant in 2020. Major contributor to PostgreSQL core.
Enterprise Crunchy Data
Kubernetes-native PostgreSQL platform. Crunchy Certified PostgreSQL, PGO operator, pgBackRest, pgBouncer, pgMonitor. Strong in US government and regulated industries. Open-source-first approach.
Support Percona
Database-agnostic support company. Offers Percona Distribution for PostgreSQL (packaged with extensions and tools), monitoring (PMM), backup, and 24/7 support. Good choice if you also run MySQL/MongoDB.
Managed Cloud Providers
AWS RDS/Aurora, Google CloudSQL/AlloyDB, Azure Database for PostgreSQL. Fully managed with automated HA, backups, and patching. Higher cost, less control, but zero operational burden for the database layer.
The PostgreSQL License means you will never face a licensing surprise, an audit, or a forced migration due to license changes. This is a strategic advantage over MySQL (Oracle), MongoDB (SSPL), and Redis (RSALv2/SSPLv1 since 7.4, no longer open source). When presenting database options to customers, lead with this.
Consultant's Checklist
Before deploying PostgreSQL for a customer, work through these questions. They'll surface requirements that drive architecture decisions and prevent post-deployment surprises.
- Data volume & growth rate? — Current database size, expected growth per month/year. This drives storage, partitioning, and backup strategy decisions. Terabyte databases need different tooling than 50 GB ones.
- HA requirements? — What's the acceptable downtime? RPO (data loss tolerance) and RTO (recovery time). Zero RPO requires synchronous replication. Sub-minute RTO requires Patroni with automated failover.
- Connection patterns? — Peak concurrent connections, connection duration (short-lived HTTP handlers vs long-lived batch jobs). This determines whether you need PgBouncer and how to size pools.
- Read/write ratio? — Read-heavy workloads benefit from read replicas. Write-heavy workloads need careful WAL tuning, connection pooling, and potentially Citus for horizontal scaling.
- Compliance requirements? — SOC 2, HIPAA, PCI-DSS, GDPR. These drive encryption at rest, SSL enforcement, pgAudit, row-level security, and data residency decisions.
- Backup & recovery targets? — RPO/RTO numbers, retention requirements, offsite storage mandates. pgBackRest to S3 with PITR is the standard, but some customers need backup to specific providers.
- Deployment platform? — Bare metal, VMs, Kubernetes, or managed service? Team expertise matters more than technical merits. If they don't have K8s skills, don't deploy on K8s.
- Upgrade strategy? — How will you handle major version upgrades? pg_upgrade with downtime window, or logical replication for near-zero downtime? How often will you apply minor patches?
- Extensions needed? — PostGIS, pgvector, TimescaleDB, pg_cron? This affects deployment model — managed services may not support all extensions. Verify compatibility before committing.
- Monitoring stack? — What's the existing observability platform? Prometheus/Grafana, Datadog, New Relic? Ensure postgres_exporter or the provider's integration is in place from day one.
- Team expertise? — Does the team have PostgreSQL operational experience? If not, budget for training, runbooks, and potentially commercial support (EDB, Crunchy Data, Percona).
- Migration source? — Coming from MySQL, Oracle, SQL Server? Plan for schema differences, stored procedure rewrites, and data type mapping. EDB has Oracle compatibility tools. pgLoader handles MySQL migrations.
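The connection-pattern question above often comes down to pool sizing. A common starting heuristic (popularized by the PostgreSQL wiki's connection-count guidance) is roughly cores × 2 plus effective spindles; the helper below is a hypothetical illustration of that rule, not a substitute for load testing:

```python
def suggested_pool_size(core_count: int, spindle_count: int = 1) -> int:
    """Rough starting point for a PgBouncer/application pool size.

    Based on the (cores * 2) + effective_spindles rule of thumb;
    treat SSD/NVMe storage as roughly one effective spindle.
    Always validate with a load test -- the right number depends
    on the actual workload mix.
    """
    return core_count * 2 + spindle_count

# Example: an 8-core server on NVMe storage
print(suggested_pool_size(8))  # -> 17
```

Note that this sizes the pool of active server connections; application-side connection counts can be far higher when PgBouncer multiplexes them in transaction pooling mode.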
The most common mistake in PostgreSQL deployments is underestimating operational complexity. PostgreSQL is not "install and forget." Autovacuum, bloat management, connection pooling, backup testing, and upgrade planning all require ongoing attention. Budget for operations, not just initial deployment.