TestForge | Aidevops | 📊 Plogger ✍️ Blog 📚 Docs
plogger

AI DevOps Korea

Turn AI service development and operations into one improvement loop

Aidevops.kr covers LLMOps, RAG, agents, observability, evaluation, and cost-performance optimization for production AI services.

A Practical Guide to SQL Performance Optimization

· Updated Apr 21
A Practical Guide to SQL Performance Optimization
A Practical Guide to SQL Performance Optimization diagram
Visual guide to the key flow, architecture, and decision points covered in this post.
SQL performance problems are almost never just "a slow query." They are usually a mismatch between workload shape, access path, data volume, and index design. That is why tuning based only on query text tends to disappoint. The SQL string is only the surface. The execution plan is where the cost lives.

Good SQL tuning is therefore less about writing clever statements and more about reducing how much work the database has to do.

Start With Workload, Not Heroic Query Rewrites

One of the most common mistakes in tuning is focusing on a single query in isolation before understanding:

  • how often it runs
  • whether it is slow at p95 or only under specific bursts
  • whether the problem is CPU, I/O, lock contention, or network
  • what business path it affects

Without that context, teams often optimize the most emotionally visible query instead of the most expensive query.

The right opening question is not “How do we make this SQL prettier?” It is “What workload cost are we actually paying?”

EXPLAIN Is About the Plan, Not the Syntax

EXPLAIN and EXPLAIN ANALYZE matter because databases do not execute intent. They execute plans.

The main things to look for are:

  • which table is accessed first
  • whether the access path uses an index or a broad scan
  • whether estimated rows and actual rows diverge badly
  • whether sort, temporary structures, or hash phases dominate the cost
  • whether joins begin from selective conditions or broad ones

This is why tuning is usually a planner problem before it is a syntax problem.

A Query Can Be Correct and Still Operationally Wrong

Many slow queries are logically fine. Their cost is just badly aligned with the workload.

Typical examples include:

  • list endpoints that read too many columns
  • admin pages that demand exact counts too often
  • analytics-style joins pushed into request paths
  • offset pagination used deep into large datasets

The query text may look harmless while the workload cost is completely unsound.

Indexes Must Match Access Patterns, Not Developer Hopes

Indexes are often treated as a universal fix. In reality, they help only when they align with real access behavior.

Good index design considers:

  • WHERE selectivity
  • JOIN path support
  • ORDER BY and GROUP BY cost
  • write amplification and maintenance cost

That is why one of the worst habits in tuning is adding indexes by intuition before reading the plan.

Composite indexes are especially valuable when filtering and ordering happen together. But they should be justified by real dominant query patterns, not by a generic checklist.

Covering Indexes Are Powerful but Expensive

A covering index can avoid table lookups and significantly reduce latency on read-heavy endpoints. That can be excellent for:

  • list APIs
  • dashboard summaries
  • hot read paths with stable projection shapes

But trying to cover every important query causes:

  • larger indexes
  • slower writes
  • higher storage and maintenance cost

So covering indexes should be a selective optimization for important and repeatable access paths.

Offset Pagination Is Usually a Hidden Tax

SELECT *
FROM posts
WHERE id < :lastId
ORDER BY id DESC
LIMIT 20;

Offset-based pagination looks simple, but its cost grows with depth because the engine still reads and discards earlier rows.

Cursor-based strategies are usually better for:

  • infinite scroll
  • large admin tables
  • high-volume timelines
  • stable sorting under concurrent writes

Pagination is therefore not only a frontend concern. It is often one of the most important database performance decisions in an API.

Aggregation Cost Is Usually Read Cost Plus Sort Cost

GROUP BY, DISTINCT, and aggregate queries often become expensive because they:

  • read too much data
  • sort large sets
  • build temporary structures
  • compete with transactional traffic

This means the right fix is often not only “optimize the query.” It may be:

  • narrow the aggregation scope
  • precompute summaries
  • separate analytical workloads
  • move some queries off the request path

SQL tuning is as much about workload placement as about statement shape.

COUNT(*) Is Often a Product Question

Teams frequently optimize exact count queries without asking whether exactness is even required.

For many use cases, it is enough to know:

  • whether another page exists
  • whether a threshold was crossed
  • whether an approximate count is acceptable

The fastest count is often the one the product no longer insists on calculating precisely in real time.

ORMs Can Hide the Real Cost Structure

ORM-generated SQL is not bad by default, but it can hide:

  • repeated queries
  • broad entity loading
  • accidental N+1 access
  • unstable sort and filter patterns

That is why good SQL tuning often requires looking above the database layer:

  • API shape
  • screen behavior
  • repository design
  • serialization needs

A bad read path can generate perfectly valid SQL and still be architecturally wasteful.

A Good Tuning Loop

measure -> explain -> change one variable -> verify -> monitor regression

This loop matters because production tuning is iterative. Several variables change at once:

  • data distribution
  • concurrency
  • cache warmth
  • traffic shape

The safest improvements are focused and measurable.

Common Anti-Patterns

  • optimizing emotionally visible queries instead of high-total-cost queries
  • adding indexes before understanding the plan
  • tuning execution time while ignoring lock contention
  • rewriting SQL for elegance without reducing data access
  • keeping analytical queries on latency-sensitive request paths

These mistakes are common because query text is visible and workload behavior is less visible.

Review Checklist

  • Is the query expensive in the workload that actually matters?
  • Do estimated and actual rows tell the same story?
  • Is the index helping the real access path or just existing nearby?
  • Is pagination, counting, or aggregation creating hidden cost?
  • Could the workload be moved or reshaped instead of only tuning syntax?

Closing Judgment

The heart of SQL performance tuning is not syntax cleverness. It is disciplined reduction of data access, planner cost, and operational waste. The best tuning work usually makes the workload smaller, not the query more impressive.

Continue Reading

Related posts

Next Path

Keep exploring this topic as a system