Best Practices

PostgreSQL Performance Tuning in 2026: Indexes, Query Optimization, and Scaling to Millions of Rows

Your database is probably the bottleneck. This guide covers PostgreSQL performance tuning from the ground up: understanding EXPLAIN ANALYZE, index strategies, query optimization, connection pooling, partitioning, and when to add read replicas.


Emily Watson

Technical Writer and Developer Advocate who simplifies complex technology for everyday readers.

February 7, 2026
25 min read

PostgreSQL is the most popular database for new projects in 2026 (Stack Overflow Developer Survey, DB-Engines ranking). It's excellent out of the box, but "out of the box" configuration handles maybe 100 concurrent connections and a few million rows before performance degrades. Most production PostgreSQL issues we see fall into three categories: missing indexes (80% of all performance problems), poor query design (15%), and inadequate connection management (5%). This guide addresses all three.

Understanding EXPLAIN ANALYZE: Your Most Important Tool

Before optimizing anything, you need to understand how PostgreSQL executes your queries. EXPLAIN ANALYZE shows the query plan (what the database does) and actual execution metrics (how long each step took).

-- Use EXPLAIN (ANALYZE, BUFFERS) when troubleshooting slow queries
EXPLAIN (ANALYZE, BUFFERS) SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending'
AND o.created_at > '2026-01-01'
ORDER BY o.created_at DESC
LIMIT 20;

-- Output example (BEFORE optimization):
-- Limit  (cost=45230.12..45230.17 rows=20) (actual time=1247.3..1247.4 rows=20)
--   ->  Sort  (cost=45230.12..45560.45 rows=132132)
--         Sort Key: o.created_at DESC
--         Sort Method: top-N heapsort  Memory: 28kB
--         ->  Hash Join  (cost=12.50..43891.02 rows=132132)
--               Hash Cond: (o.customer_id = c.id)
--               ->  Seq Scan on orders o  (cost=0.00..41234.00 rows=132132)
--                     Filter: (status = 'pending' AND created_at > '2026-01-01')
--                     Rows Removed by Filter: 3867868
--                     Buffers: shared hit=2134 read=28456
--               ->  Hash  (cost=10.00..10.00 rows=200)
--                     ->  Seq Scan on customers c
-- Planning Time: 0.45 ms
-- Execution Time: 1247.89 ms

-- Problems identified:
-- 1. Sequential Scan on orders (4 million rows scanned, 132K match)
-- 2. Buffers: shared read=28456 (28K pages read from disk, not cache)
-- 3. 1247ms execution time — unacceptable for a web request

Index Strategies: Solving 80% of Performance Problems

An index is a data structure that lets PostgreSQL find rows without scanning the entire table. The right index turns a 1-second query into a 1-millisecond query. The wrong index (or a missing index) is almost always the cause of slow queries.

-- Fix for the query above: composite index on the filtered and sorted columns
CREATE INDEX idx_orders_status_created ON orders (status, created_at DESC);

-- After adding the index:
-- Limit  (cost=0.56..44.23 rows=20) (actual time=0.12..0.18 rows=20)
--   ->  Nested Loop  (cost=0.56..28912.34 rows=132132)
--         ->  Index Scan using idx_orders_status_created on orders o
--               Index Cond: (status = 'pending' AND created_at > '2026-01-01')
--               Buffers: shared hit=4
--         ->  Index Scan using customers_pkey on customers c
--               Index Cond: (id = o.customer_id)
--               Buffers: shared hit=40
-- Execution Time: 0.23 ms   ← 5,400x faster!

-- Index types and when to use each:

-- B-tree (default): Equality and range queries
CREATE INDEX idx_email ON users (email);
CREATE INDEX idx_price_range ON products (price);

-- Partial index: Index only rows that match a condition
-- Saves space, faster to update, faster to scan
CREATE INDEX idx_active_orders ON orders (created_at)
  WHERE status IN ('pending', 'processing');

-- Covering index (INCLUDE): Include extra columns to enable index-only scans
CREATE INDEX idx_orders_covering ON orders (status, created_at DESC)
  INCLUDE (total, customer_id);

-- GIN index: Full-text search and JSONB containment
CREATE INDEX idx_products_search ON products
  USING GIN (to_tsvector('english', name || ' ' || description));
CREATE INDEX idx_metadata ON events USING GIN (metadata jsonb_path_ops);

-- BRIN index: Very large tables with naturally ordered data
-- Can be ~1000x smaller than an equivalent B-tree
CREATE INDEX idx_logs_created ON logs USING BRIN (created_at);

-- Expression index: Index computed values
CREATE INDEX idx_lower_email ON users (LOWER(email));
-- Now WHERE LOWER(email) = 'user@example.com' uses this index

Query Optimization Patterns

Avoid SELECT *. Only fetch the columns you need. This reduces I/O, network transfer, and memory usage. It also enables index-only scans when a covering index contains all needed columns.
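For example, with the idx_orders_covering index defined in the previous section, column selection alone decides whether an index-only scan is possible:

```sql
-- Forces a heap fetch for every matching row — no index-only scan
SELECT * FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC
LIMIT 20;

-- Every referenced column lives in idx_orders_covering, so PostgreSQL
-- can answer this with an Index Only Scan (verify with EXPLAIN)
SELECT customer_id, total, created_at
FROM orders
WHERE status = 'pending'
ORDER BY created_at DESC
LIMIT 20;
```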

Prefer EXISTS over IN for subqueries. The planner can often rewrite WHERE id IN (SELECT id FROM ...) as a semi-join, but when it materializes the entire subquery result first, performance suffers. WHERE EXISTS (SELECT 1 FROM ... WHERE ...) stops at the first matching row. For large subquery results, EXISTS can be orders of magnitude faster.

-- Slow: IN with large subquery
SELECT * FROM products
WHERE category_id IN (
  SELECT id FROM categories WHERE department = 'electronics'
);

-- Fast: EXISTS (stops at first match)
SELECT * FROM products p
WHERE EXISTS (
  SELECT 1 FROM categories c
  WHERE c.id = p.category_id AND c.department = 'electronics'
);

-- Slow: Counting all rows to check if any exist
SELECT COUNT(*) FROM orders WHERE customer_id = 123;
-- Then checking if count > 0 in application code

-- Fast: Use EXISTS directly
SELECT EXISTS(SELECT 1 FROM orders WHERE customer_id = 123);

-- Pagination: Don't use OFFSET for deep pages
-- Slow: OFFSET 100000 scans and discards 100K rows
SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 100000;

-- Fast: Keyset pagination (seek method)
SELECT * FROM products
WHERE id > 100000  -- Last ID from previous page
ORDER BY id
LIMIT 20;
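When the sort column is not unique (created_at, for instance), keyset pagination needs a tie-breaker. A row-value comparison keeps the query index-friendly; the timestamp and id below are hypothetical values carried over from the last row of the previous page:

```sql
-- Requires an index on (created_at DESC, id DESC) to stay fast
SELECT * FROM products
WHERE (created_at, id) < ('2026-01-15 10:00:00', 48213)
ORDER BY created_at DESC, id DESC
LIMIT 20;
```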

Connection Pooling: The Silent Killer

PostgreSQL creates a new process for every connection (fork model). Each connection consumes ~5-10MB of RAM. If your application opens 200 connections, that's 1-2GB of RAM just for connection overhead — before any queries run. At 500+ connections, PostgreSQL spends more time context-switching between processes than executing queries.

The solution: use a connection pooler such as PgBouncer or pgpool-II. PgBouncer sits between your application and PostgreSQL, maintaining a small pool of actual database connections and multiplexing application connections onto them.

# PgBouncer configuration (/etc/pgbouncer/pgbouncer.ini)
[databases]
myapp = host=localhost port=5432 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Pool mode:
# session    — Connection held for entire session (default, most compatible)
# transaction — Connection returned after each transaction (recommended)
# statement  — Connection returned after each statement (limited)
pool_mode = transaction

# Pool sizing
default_pool_size = 20        # Connections per database/user pair
max_client_conn = 1000        # Max application connections
min_pool_size = 5             # Keep at least 5 connections warm
reserve_pool_size = 5         # Extra connections for spikes
reserve_pool_timeout = 3      # Wait 3s before using reserve

# Timeouts
server_idle_timeout = 600     # Close idle server connections after 10min
client_idle_timeout = 0       # Never close idle client connections
query_timeout = 30            # Kill queries running longer than 30s
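Once PgBouncer is running, point the application's connection string at port 6432 instead of 5432. PgBouncer also exposes an admin console — connect to the virtual pgbouncer database (e.g. psql -p 6432 -U pgbouncer pgbouncer) and inspect pool health:

```sql
SHOW POOLS;    -- active/waiting clients and in-use server connections per pool
SHOW STATS;    -- queries, bytes transferred, and average query time per database
SHOW CLIENTS;  -- every client connection and the server connection it maps to
```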

Partitioning: When Tables Get Too Big

When a table exceeds 100 million rows, even well-indexed queries start to slow down because the index itself becomes large (multi-level B-tree traversal) and maintenance operations (VACUUM, REINDEX) take hours. Table partitioning splits a large table into smaller physical tables (partitions) while maintaining a single logical table for queries.

-- Range partitioning by date (most common pattern)
CREATE TABLE events (
  id bigserial,
  event_type text NOT NULL,
  payload jsonb,
  created_at timestamptz NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
  FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Create the index on the parent table; it cascades to every partition (PG 11+)
CREATE INDEX ON events (created_at, event_type);

-- Queries automatically use partition pruning:
-- Only scans the relevant partition(s)
SELECT * FROM events
WHERE created_at BETWEEN '2026-02-01' AND '2026-02-28'
AND event_type = 'page_view';
-- PostgreSQL only scans events_2026_02, ignoring all other partitions

-- Automate partition creation with pg_partman
-- (pg_partman 4.x signature shown; version 5 removed the p_type argument)
CREATE EXTENSION pg_partman;
SELECT partman.create_parent('public.events', 'created_at',
  'native', 'monthly', p_premake := 3);
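Retention is the other big win from partitioning: removing a month of data becomes a metadata operation instead of a multi-hour DELETE. A sketch, assuming the monthly partitions above:

```sql
-- Detach first if you need to archive the data (CONCURRENTLY requires PG 14+)
ALTER TABLE events DETACH PARTITION events_2026_01 CONCURRENTLY;

-- Dropping a whole partition is near-instant and leaves no bloat,
-- unlike DELETE + VACUUM on a monolithic table
DROP TABLE events_2026_01;
```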

Monitoring: Know Before Users Complain

Essential PostgreSQL metrics to monitor: active connections vs. max_connections (alert at 80%), query duration percentiles (p95, p99), dead tuples count (indicates need for VACUUM), cache hit ratio (should be >99% — if lower, you need more RAM), replication lag (for read replicas), and disk space usage.
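The first three of these can be pulled straight from the built-in statistics views; a starting point:

```sql
-- Current connections vs. the configured ceiling (alert at 80%)
SELECT count(*) AS connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;

-- Cache hit ratio across all databases (aim for > 99%)
SELECT round(sum(blks_hit) * 100.0 /
             nullif(sum(blks_hit) + sum(blks_read), 0), 2) AS cache_hit_pct
FROM pg_stat_database;

-- Tables with the most dead tuples (VACUUM candidates)
SELECT relname, n_dead_tup, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```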

The pg_stat_statements extension is indispensable — it tracks execution statistics for every unique query, showing total time, call count, rows returned, and I/O statistics. Enable it and review the top 10 slowest queries weekly.
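Note that pg_stat_statements must be listed in shared_preload_libraries (and the server restarted) before the extension can be created. The column names below are from PostgreSQL 13+ — total_exec_time was total_time in older releases:

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 queries by cumulative execution time
SELECT substring(query, 1, 60) AS query,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```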

ZeonEdge provides PostgreSQL performance audits and optimization services. We analyze your query patterns, indexing strategy, and configuration to deliver measurable performance improvements. Get a free database assessment.
