A Practical Guide to Elasticsearch
Elasticsearch is often introduced as a “search engine,” but in real systems search itself matters less than the decisions around it: which documents you index, in what shape, with which analyzer, and what your query and aggregation patterns cost.
Structure at a Glance
[Application Data]
        |
        v
[Indexing Pipeline]
  mapping / analyzer / transform
        |
        v
[Elasticsearch Index]
        |
        +--> full-text query
        +--> filter query
        +--> aggregation
        |
        v
[Search Result / Ranking / Highlight]
The key point in this structure is that Elasticsearch is not a simple read store. It is a system that reshapes data into a search-oriented structure at index time.
Elasticsearch Is Not an RDB Replacement, but a Search-Specialized Engine
One common mistake when first adopting Elasticsearch is thinking, “If we query everything through ES instead of SQL, it will be faster.” But ES is designed for search, analytics, large-scale filtering, and full-text retrieval, not for strong transactional consistency.
| Elasticsearch | RDB |
|---|---|
| Index | Table |
| Document | Row |
| Mapping | Schema |
| Analyzer | None |
| Relevance score | Usually none |
In other words, Elasticsearch is usually best treated not as a full replacement for the source-of-truth datastore, but as a read-optimized layer dedicated to search.
Mapping Design Is Half of Search Quality
One of the most common regrets in production is thinking about mappings too late. Dynamic mapping feels convenient early on, but later it tends to create problems with sorting, aggregations, filtering, and Korean search quality.
PUT /posts
{
  "mappings": {
    "properties": {
      "title":     { "type": "text", "analyzer": "nori" },
      "title_kw":  { "type": "keyword" },
      "content":   { "type": "text", "analyzer": "nori" },
      "tags":      { "type": "keyword" },
      "viewCount": { "type": "integer" },
      "createdAt": { "type": "date" }
    }
  }
}
Even for the same string field, separating text and keyword by usage is important. text is for full-text search, while keyword is for exact match, filtering, aggregations, and sorting.
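As a minimal sketch of that split, the request bodies below are built as plain Python dicts against the posts mapping above. The helper names are hypothetical, and sending the bodies to a cluster is left out. Because title_kw is a separate field in that mapping, the sketch assumes the application copies the title into it at index time.

```python
import json


def full_text_body(user_input: str) -> dict:
    """Analyzed search: `title` is a text field, so the input is tokenized."""
    return {"query": {"match": {"title": user_input}}}


def exact_title_body(exact_title: str) -> dict:
    """Exact match and sorting: `title_kw` keeps the whole string as one term."""
    return {
        "query": {"term": {"title_kw": exact_title}},
        "sort": [{"title_kw": "asc"}],
    }


if __name__ == "__main__":
    print(json.dumps(full_text_body("elasticsearch guide"), indent=2))
    print(json.dumps(exact_title_body("A Practical Guide to Elasticsearch"), indent=2))
```

Either body could be sent to the posts index's _search endpoint; the point is only that the query type follows the field type.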
Analyzer Choice Defines the Search UX
Korean search quality depends on analyzer choice far more than English search does. When selecting an analyzer, you need to consider how aggressively to split compound nouns, whether you need typo correction or autocomplete, and whether search terms and indexed terms should go through the same analyzer.
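Two of those decisions map onto concrete settings. The sketch below assumes the analysis-nori plugin is installed and uses hypothetical names (ko_tokenizer, ko_text, posts_v2). It exposes the nori tokenizer's decompound_mode, which controls compound noun splitting, and the mapping's search_analyzer, which lets query text use a different analyzer from indexed text. The defaults here are illustrative, not recommendations.

```python
VALID_DECOMPOUND_MODES = {"none", "discard", "mixed"}


def korean_index_body(decompound_mode: str = "mixed",
                      search_analyzer: str = "ko_text") -> dict:
    """Build a create-index body with a custom nori-based analyzer."""
    if decompound_mode not in VALID_DECOMPOUND_MODES:
        raise ValueError(f"unknown decompound_mode: {decompound_mode}")
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    "ko_tokenizer": {
                        "type": "nori_tokenizer",
                        # none: keep compounds whole, discard: keep only the
                        # parts, mixed: keep both the compound and its parts
                        "decompound_mode": decompound_mode,
                    }
                },
                "analyzer": {
                    "ko_text": {
                        "type": "custom",
                        "tokenizer": "ko_tokenizer",
                        "filter": ["lowercase"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "ko_text",               # used at index time
                    "search_analyzer": search_analyzer,  # used at query time
                }
            }
        },
    }
```

A body like korean_index_body("discard") could be sent with PUT /posts_v2. Comparing search results across modes on real queries is a more reliable guide than picking a mode up front.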
A Single match Query Is Not Enough; You Need to Model Search Intent
Production search usually mixes full-text search, category and tag filters, recency or popularity sorting, field boosts, and highlighting. That means you need to separate must and filter, and design for both search quality and performance at the same time.
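One way to sketch that separation is a bool query where the scored full-text clause goes in must and the yes/no conditions go in filter, which does not affect scoring and can be cached. The function name, the boost value of 3, and the parameter names are assumptions for illustration. The field names come from the posts mapping above.

```python
def search_posts_body(text, tags=None, since=None, page_size=10):
    """Build a search body mixing full-text, filters, sorting, and highlighting."""
    filters = []
    if tags:
        # Exact matching on a keyword field; contributes nothing to the score.
        filters.append({"terms": {"tags": list(tags)}})
    if since:
        filters.append({"range": {"createdAt": {"gte": since}}})
    return {
        "size": page_size,
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            # Title matches weigh three times content matches.
                            "fields": ["title^3", "content"],
                        }
                    }
                ],
                "filter": filters,
            }
        },
        # Relevance first, recency as the tie-breaker.
        "sort": ["_score", {"createdAt": "desc"}],
        "highlight": {"fields": {"title": {}, "content": {}}},
    }
```

For example, search_posts_body("검색 엔진", tags=["elasticsearch"], since="now-30d") keeps the tag and date conditions out of relevance scoring entirely.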
Aggregations Mean It Is Also an Analytics Engine
One of Elasticsearch’s strengths is that it can naturally run aggregations alongside search results. But frequent large-range aggregations on high-cardinality fields can drive up memory and CPU cost, so you need to be selective about which fields are used this way and how often.
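A small sketch of a cost-conscious aggregation, assuming the posts mapping above: it counts the top tags on a keyword field, narrows the document set with a date filter first, caps the number of buckets, and sets size to 0 so no hits are returned. The function and aggregation names are hypothetical.

```python
def tag_counts_body(since: str, top_n: int = 20) -> dict:
    """Build an aggregation-only body: top tags among recent posts."""
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    return {
        "size": 0,  # skip fetching hits; only the aggregation is needed
        "query": {
            "bool": {"filter": [{"range": {"createdAt": {"gte": since}}}]}
        },
        "aggs": {
            "top_tags": {"terms": {"field": "tags", "size": top_n}}
        },
    }
```

Calling tag_counts_body("now-7d") limits the work to one week of documents, which is usually cheaper than aggregating over the whole index on every page load.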
Index Design Is a Balance Between Write Cost and Read Quality
Search systems are often thought of as read-only, but in practice the indexing pipeline frequently matters more. You need to decide how changes in the source DB reach ES, whether real-time indexing is truly required, whether some fields can tolerate async delay, and whether the document model should be denormalized.
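To make the denormalization and batching decisions concrete, here is a minimal sketch that flattens a source row plus its tag names into one search document and serializes a batch in the newline-delimited format the _bulk API expects. The source row shape (view_count, created_at) and both function names are assumptions. The target fields follow the posts mapping above.

```python
import json


def to_search_doc(post: dict, tag_names: list) -> dict:
    """Flatten a source row and its joined tag names into one document."""
    return {
        "title": post["title"],
        "title_kw": post["title"],  # same value, indexed as a keyword
        "content": post["content"],
        "tags": sorted(tag_names),
        "viewCount": post["view_count"],
        "createdAt": post["created_at"],
    }


def bulk_payload(index: str, docs: dict) -> str:
    """Serialize {doc_id: doc} as NDJSON: one action line, then one source line."""
    lines = []
    for doc_id, doc in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string could be sent to POST /_bulk with the application/x-ndjson content type. Whether batches are built from a queue, a change stream, or a periodic job depends on how much indexing delay the product can tolerate.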
Points Teams Often Miss in Operations
- Choosing too many shards and wasting memory
- Relying on dynamic mapping without separating text and keyword
- Using large aggregation and sorting queries indiscriminately
- Changing mappings without a reindex strategy
- Treating ES like a source-of-truth system and creating data consistency problems
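For the reindex point in particular, a common approach is to have applications query an alias, build a new index with the corrected mapping, copy documents across, and then move the alias in one atomic step. The sketch below builds the two request bodies involved. The index names posts_v1 and posts_v2 and the alias posts are hypothetical.

```python
def reindex_body(old_index: str, new_index: str) -> dict:
    """Body for POST /_reindex: copy documents from the old index to the new one."""
    return {"source": {"index": old_index}, "dest": {"index": new_index}}


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases: both actions are applied atomically."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Assumed order: create posts_v2 with the new mapping, reindex, then swap.
    print(reindex_body("posts_v1", "posts_v2"))
    print(alias_swap_body("posts", "posts_v1", "posts_v2"))
```

Writes that arrive during the copy still need a plan, such as replaying them from the source DB, which is one more reason not to treat ES as the source of truth.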
When Elasticsearch Is an Especially Good Fit
- When full-text quality matters, such as in blogs, documents, or commerce
- When you need an exploratory UI that combines filters, search, and aggregations
- When searches frequently combine tags, categories, dates, and sorting
- When RDB LIKE queries are not good enough in either quality or performance
Wrap-Up
The core of Elasticsearch is not just calling a search API. It is how you model searchable documents and design the search experience through analyzers and mappings. Search quality is decided by index design and field type selection well before query tricks come into play.
What Gets Hard in Production
- Elasticsearch is powerful for search and analytics, but schema, shard layout, and indexing policy decisions have long-lived operational consequences.
- Many failures come from treating it like a generic document database rather than a specialized search engine.
- Relevance tuning and cluster operations are as important as query syntax.
Architecture Decisions That Matter
- Model indices around search use cases and retention policy, not around arbitrary service boundaries.
- Choose analyzers, mappings, and shard counts deliberately because they are expensive to change later.
- Keep ingestion, refresh policy, and query latency expectations aligned with business needs.
Practical Example
Search quality usually depends on explicit mapping choices more than on query cleverness:
title -> text + keyword
category -> keyword
published_at -> date
content -> text with language analyzer
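The outline above could be expressed as an explicit mapping along these lines. This is a sketch, not a recommended schema: the function name, the raw sub-field name, and the default english analyzer are assumptions, and "dynamic": "strict" is included only to show one way of rejecting unmapped fields.

```python
def articles_mapping_body(language_analyzer: str = "english") -> dict:
    """Build a create-index body with every field mapped explicitly."""
    return {
        "mappings": {
            # Reject documents containing fields not declared below.
            "dynamic": "strict",
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": language_analyzer,
                    # Multi-field: title.raw holds the unanalyzed string
                    # for exact match, sorting, and aggregations.
                    "fields": {"raw": {"type": "keyword"}},
                },
                "category": {"type": "keyword"},
                "published_at": {"type": "date"},
                "content": {"type": "text", "analyzer": language_analyzer},
            },
        }
    }
```

With a multi-field the application sends title once and Elasticsearch indexes it both ways, so there is no second copy to keep in sync.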
Anti-Patterns to Avoid
- Using dynamic mapping everywhere and discovering field explosions later.
- Choosing shard counts based on folklore instead of workload.
- Treating Elasticsearch as the source of truth for transactional state.
Operational Checklist
- Review index size, shard balance, and refresh cost regularly.
- Measure relevance with representative queries and click or success signals.
- Plan reindexing strategy before schema evolution is needed.
- Monitor heap pressure, circuit breakers, and slow queries.
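For the relevance item, one simple starting metric is precision at k: of the first k results for a representative query, how many were judged relevant. The sketch below assumes you already have a ranked list of document IDs and a set of relevant IDs, for example from click or success signals. Both inputs and the function name are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Dividing by k penalizes queries that return fewer than k results.
    return hits / k


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]
    relevant = {"a", "c"}
    print(precision_at_k(ranked, relevant, 2))  # 0.5
```

Tracking a number like this across analyzer or mapping changes turns relevance tuning into a comparison rather than an impression.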
Final Judgment
Elasticsearch delivers outsized value when it is treated as a search system with operational discipline. Without that discipline, it turns simple indexing into a cluster management problem.