A Practical Guide to Elasticsearch

Updated Apr 17
[Figure: Elasticsearch architecture showing application data, the indexing pipeline, the Elasticsearch index, and the search query and aggregation paths]
Elasticsearch creates value by reshaping source data into a search-optimized index before full-text queries, filters, and aggregations ever run.

Elasticsearch is often introduced as a “search engine,” but in real systems the search call itself matters less than the decisions around it: which documents you index, in what shape, with which analyzer, and what your query and aggregation patterns cost.

Structure at a Glance

[Application Data]
      |
      v
[Indexing Pipeline]
  mapping / analyzer / transform
      |
      v
[Elasticsearch Index]
      |
      +--> full-text query
      +--> filter query
      +--> aggregation
      |
      v
[Search Result / Ranking / Highlight]

The key point in this structure is that Elasticsearch is not a simple read store. It is a system that reshapes data into a search-oriented structure at index time.

Elasticsearch Is Not an RDB Replacement, but a Search-Specialized Engine

One common mistake when first adopting Elasticsearch is thinking, “If we query everything through ES instead of SQL, it will be faster.” But ES is designed for search, analytics, large-scale filtering, and full-text retrieval, not for strong transactional consistency.

Elasticsearch      RDB
Index              Table
Document           Row
Mapping            Schema
Analyzer           None
Relevance score    Usually none

In other words, Elasticsearch is usually best treated not as a full replacement for the source-of-truth datastore, but as a read-optimized layer dedicated to search.

Mapping Design Is Half of Search Quality

One of the most common regrets in production is thinking about mappings too late. Dynamic mapping feels convenient early on, but later it tends to create problems with sorting, aggregations, filtering, and Korean search quality.

PUT /posts
{
  "mappings": {
    "properties": {
      "title":     { "type": "text", "analyzer": "nori" },
      "title_kw":  { "type": "keyword" },
      "content":   { "type": "text", "analyzer": "nori" },
      "tags":      { "type": "keyword" },
      "viewCount": { "type": "integer" },
      "createdAt": { "type": "date" }
    }
  }
}

Even when two fields hold the same string, it is important to map text and keyword separately by usage, as title and title_kw do above. A text field is analyzed for full-text search, while a keyword field is stored as-is for exact match, filtering, aggregations, and sorting.

Analyzer Choice Defines the Search UX

Compared with English search, Korean search quality depends far more heavily on analyzer choice. When selecting an analyzer, you need to consider how aggressively to split compound nouns, whether you need typo correction or autocomplete, and whether search terms and indexed terms should go through the same analyzer.
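To make these choices concrete, here is a minimal sketch that builds an index-creation body with a custom nori-based analyzer. It assumes the analysis-nori plugin is installed. The names `build_korean_settings`, `korean_tokenizer`, and `korean_text` are illustrative and do not come from the original article.

```python
def build_korean_settings(decompound_mode="mixed"):
    """Return an index-creation body with a custom nori-based analyzer.

    decompound_mode controls compound-noun splitting in nori_tokenizer:
      "none"    keeps the compound as a single token
      "discard" keeps only the decomposed parts
      "mixed"   keeps both the compound and its parts
    """
    if decompound_mode not in {"none", "discard", "mixed"}:
        raise ValueError(f"unsupported decompound_mode: {decompound_mode}")
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    # Hypothetical tokenizer name chosen for this example.
                    "korean_tokenizer": {
                        "type": "nori_tokenizer",
                        "decompound_mode": decompound_mode,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name chosen for this example.
                    "korean_text": {
                        "type": "custom",
                        "tokenizer": "korean_tokenizer",
                        "filter": ["lowercase"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    # Same analyzer at index time and at search time.
                    "analyzer": "korean_text",
                    "search_analyzer": "korean_text",
                }
            }
        },
    }
```

Setting `search_analyzer` explicitly makes the index-time versus search-time decision visible in the mapping instead of leaving it implicit.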

A Single match Query Is Not Enough; You Need to Model Search Intent

Production search usually mixes full-text search, category and tag filters, recency or popularity sorting, field boosts, and highlighting. That means you need to separate must clauses, which contribute to the relevance score, from filter clauses, which only include or exclude documents and can be cached, and design for both search quality and performance at the same time.
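A minimal sketch of such a combined request, assuming the posts mapping shown earlier (title, content, tags, viewCount, createdAt). The function name `build_search_body`, the boost value, and the sort options are illustrative assumptions, not recommendations from the original.

```python
def build_search_body(text, tags=None, created_after=None, sort_by="relevance", size=10):
    """Build a search request body that separates scoring from filtering."""
    # Scored part: the user's text, with the title weighted above the content.
    must = [{
        "multi_match": {
            "query": text,
            "fields": ["title^3", "content"],  # boost of 3 is an arbitrary example
        }
    }]

    # Unscored part: exact conditions on keyword and date fields.
    filters = []
    if tags:
        filters.append({"terms": {"tags": list(tags)}})
    if created_after:
        filters.append({"range": {"createdAt": {"gte": created_after}}})

    body = {
        "size": size,
        "query": {"bool": {"must": must, "filter": filters}},
        "highlight": {"fields": {"title": {}, "content": {}}},
    }

    # Omitting "sort" leaves the default ordering by relevance score.
    if sort_by == "recent":
        body["sort"] = [{"createdAt": "desc"}]
    elif sort_by == "popular":
        body["sort"] = [{"viewCount": "desc"}]
    elif sort_by != "relevance":
        raise ValueError(f"unknown sort_by: {sort_by}")
    return body
```

The resulting dict can be sent as the JSON body of a search request against the posts index.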

Aggregations Mean It Is Also an Analytics Engine

One of Elasticsearch’s strengths is that it can naturally run aggregations alongside search results. But frequent large-range aggregations on high-cardinality fields can drive up memory and CPU cost, so you need to be selective about which fields are used this way and how often.
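As a sketch of keeping aggregation cost bounded, the request below restricts the time range with a filter, caps the number of term buckets, and skips returning hits. It assumes the posts mapping shown earlier. The function name `build_tag_stats_body` and the default values are illustrative.

```python
def build_tag_stats_body(top_n=10, since="now-30d/d"):
    """Build an aggregation-only request over recent posts."""
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    return {
        # size 0: return aggregation results only, no search hits.
        "size": 0,
        # Narrow the document set first so the aggregations scan less data.
        "query": {"bool": {"filter": [{"range": {"createdAt": {"gte": since}}}]}},
        "aggs": {
            # tags is a keyword field, so a terms aggregation works on it directly.
            "top_tags": {"terms": {"field": "tags", "size": top_n}},
            "posts_per_week": {
                "date_histogram": {"field": "createdAt", "calendar_interval": "week"}
            },
        },
    }
```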

Index Design Is a Balance Between Write Cost and Read Quality

Search systems are often thought of as read-only, but in practice the indexing pipeline often matters more than the query side. You need to decide how changes in the source DB reach ES, whether real-time indexing is truly required, whether some fields can tolerate async delay, and whether the document model should be denormalized.
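One possible shape for the indexing side is sketched below: a source row is flattened into a denormalized search document, and a batch of documents is serialized into the newline-delimited body that the bulk API expects. The source column names (`view_count`, `created_at`) and the function names are assumptions made for this example.

```python
import json


def to_search_document(post_row, tag_names):
    """Flatten a source DB row plus its joined tags into one search document."""
    return {
        "title": post_row["title"],
        "title_kw": post_row["title"],   # same string, mapped as keyword
        "content": post_row["content"],
        "tags": sorted(tag_names),       # denormalized from a join table
        "viewCount": int(post_row["view_count"]),
        "createdAt": post_row["created_at"],
    }


def to_bulk_ndjson(index_name, docs_by_id):
    """Serialize documents as bulk API lines: an action line, then a source line."""
    lines = []
    for doc_id, doc in docs_by_id.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

Usage, with a made-up row:

```python
row = {"title": "Hello", "content": "First post", "view_count": "3", "created_at": "2024-04-17"}
payload = to_bulk_ndjson("posts", {1: to_search_document(row, {"intro", "es"})})
```

Using the source primary key as `_id` means replaying the same change overwrites the document rather than duplicating it.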

Points Teams Often Miss in Operations

  • Choosing too many shards and wasting memory
  • Relying on dynamic mapping without separating text and keyword
  • Using large aggregation and sorting queries indiscriminately
  • Changing mappings without a reindex strategy
  • Treating ES like a source-of-truth system and creating data consistency problems
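For the reindex point in the list above, one common approach is to have applications query an alias, build a new index with the changed mapping, copy the data over, and then switch the alias. The sketch below only assembles the three request bodies in order. The index and alias names are hypothetical, and a real migration would also need to handle writes that arrive while the copy runs.

```python
def build_reindex_plan(alias, old_index, new_index, new_mappings):
    """Return (method, path, body) steps for an alias-based mapping change."""
    if old_index == new_index:
        raise ValueError("new_index must differ from old_index")
    return [
        # 1. Create the new index with the changed mapping.
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # 2. Copy documents from the old index into the new one.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 3. Move the alias in a single request, so readers never see a gap.
        ("POST", "/_aliases", {
            "actions": [
                {"remove": {"index": old_index, "alias": alias}},
                {"add": {"index": new_index, "alias": alias}},
            ]
        }),
    ]
```

Usage, with hypothetical names:

```python
plan = build_reindex_plan(
    "posts", "posts_v1", "posts_v2",
    {"properties": {"title": {"type": "text", "analyzer": "nori"}}},
)
```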

When Elasticsearch Is an Especially Good Fit

  • When full-text quality matters, such as in blogs, documents, or commerce
  • When you need an exploratory UI that combines filters, search, and aggregations
  • When searches frequently combine tags, categories, dates, and sorting
  • When RDB LIKE queries are not good enough in either quality or performance

Wrap-Up

The core of Elasticsearch is not just calling a search API. It is how you model searchable documents and design the search experience through analyzers and mappings. Search quality is decided by index design and field type selection well before query tricks come into play.

What Gets Hard in Production

  • Elasticsearch is powerful for search and analytics, but schema, shard layout, and indexing policy decisions have long-lived operational consequences.
  • Many failures come from treating it like a generic document database rather than a specialized search engine.
  • Relevance tuning and cluster operations are as important as query syntax.

Architecture Decisions That Matter

  • Model indices around search use cases and retention policy, not around arbitrary service boundaries.
  • Choose analyzers, mappings, and shard counts deliberately because they are expensive to change later.
  • Keep ingestion, refresh policy, and query latency expectations aligned with business needs.

Practical Example

Search quality usually depends on explicit mapping choices more than on query cleverness:

title -> text + keyword
category -> keyword
published_at -> date
content -> text with language analyzer

Anti-Patterns to Avoid

  • Using dynamic mapping everywhere and discovering field explosions later.
  • Choosing shard counts based on folklore instead of workload.
  • Treating Elasticsearch as the source of truth for transactional state.
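One way to guard against the first anti-pattern is to set `dynamic` to `strict` in the mapping, so that documents with unmapped fields are rejected instead of silently growing the mapping. The sketch below builds such a body and adds a small client-side check. The helper names are illustrative, and the check only looks at top-level fields.

```python
def build_strict_mappings(properties):
    """Return an index body whose mapping rejects unmapped fields."""
    return {"mappings": {"dynamic": "strict", "properties": dict(properties)}}


def find_unmapped_fields(doc, index_body):
    """List top-level document fields that the mapping does not declare."""
    known = set(index_body["mappings"]["properties"])
    return sorted(set(doc) - known)
```

Usage, with fields taken from the example above:

```python
index_body = build_strict_mappings({
    "title": {"type": "text"},
    "category": {"type": "keyword"},
    "published_at": {"type": "date"},
})
extra = find_unmapped_fields({"title": "Hi", "debug_blob": "..."}, index_body)
```

Here `extra` contains the one undeclared field, which can be dropped or mapped deliberately before the document is sent.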

Operational Checklist

  • Review index size, shard balance, and refresh cost regularly.
  • Measure relevance with representative queries and click or success signals.
  • Plan reindexing strategy before schema evolution is needed.
  • Monitor heap pressure, circuit breakers, and slow queries.
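For the relevance item in the checklist, a small offline harness is often enough to start. The sketch below computes precision at k and reciprocal rank for one query, given the document ids the engine returned and a hand-labeled set of relevant ids. These two metrics are a choice made for this example; the original does not prescribe specific metrics.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / k


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none was returned."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean(values):
    """Average a list of per-query scores; 0.0 for an empty list."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0
```

Running the same labeled query set before and after a mapping or analyzer change gives a concrete number to compare, rather than relying on spot checks.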

Final Judgment

Elasticsearch delivers outsized value when it is treated as a search system with operational discipline. Without that discipline, it turns simple indexing into a cluster management problem.
