Java Memory Leak Hunting Playbook
Most Java memory incidents do not start as obvious crashes. They start as a service that restarts more often, pauses longer, or slowly loses headroom after every release.
Start with the shape of growth
Before blaming a leak, ask:
- does heap usage return to a stable baseline after GC, or does the post-GC floor keep rising
- does retained memory grow with each deployment or with a specific traffic pattern
- are off-heap buffers or thread stacks responsible instead of the heap
The first job is classification, not guesswork.
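The three questions above can be answered from inside the JVM with the standard `java.lang.management` beans. The following is a minimal sketch, not a monitoring agent: the class name `GrowthShapeProbe` and its method names are hypothetical, and it assumes the running JVM reports post-collection usage for its heap pools (pools that do not are skipped).

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reads the signals needed to classify memory growth.
final class GrowthShapeProbe {

    /** Heap bytes still in use right after the most recent GC of each heap pool. */
    static long heapUsedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            // getCollectionUsage() returns null when the pool does not support it.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    /** Estimated bytes held by direct and mapped NIO buffers (off-heap). */
    static long offHeapBufferBytes() {
        long total = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            long used = pool.getMemoryUsed(); // -1 when the value is unknown
            if (used > 0) {
                total += used;
            }
        }
        return total;
    }

    /** Live thread count; each thread also reserves stack memory outside the heap. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }
}
```

Sampling these values periodically and plotting them separately makes the shape visible: a rising post-GC floor points at heap retention, while flat heap with rising buffer bytes or thread count points elsewhere.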
Common leak sources
- unbounded in-memory caches
- listeners or callbacks that are never deregistered
- request context stored in static references
- large collections attached to long-lived singleton objects
These are usually lifecycle design bugs, not language flaws.
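As an illustration of treating a leak source as a lifecycle problem, the sketch below makes listener deregistration hard to forget: subscribing returns a handle, and closing the handle removes the listener. `EventBus` and `Subscription` are hypothetical names for this example, not an API from any particular framework.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical event bus whose subscriptions carry their own cleanup.
final class EventBus<E> {
    private final List<Consumer<E>> listeners = new CopyOnWriteArrayList<>();

    /** Handle returned by subscribe(); close() deregisters the listener. */
    interface Subscription extends AutoCloseable {
        @Override
        void close(); // narrowed: no checked exception
    }

    /** Registers a listener and returns the handle that removes it again. */
    Subscription subscribe(Consumer<E> listener) {
        listeners.add(listener);
        return () -> listeners.remove(listener);
    }

    /** Delivers the event to every currently registered listener. */
    void publish(E event) {
        for (Consumer<E> listener : listeners) {
            listener.accept(event);
        }
    }

    /** Exposed so tests and dashboards can watch for listener growth. */
    int listenerCount() {
        return listeners.size();
    }
}
```

Because the handle is `AutoCloseable`, short-lived subscribers can use try-with-resources, and longer-lived components can close it in their own shutdown hook. Without that step, the bus keeps every listener, and everything the listener references, strongly reachable.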
Use dumps to find retention owners
Heap dumps matter because they show which objects are still strongly reachable. Focus on:
- retained size in the dominator tree
- suspicious collections
- classloader retention after redeploy
- large strings, byte arrays, and serialization buffers
The question is not which object is large. It is why it still has an owner.
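To get a dump for that analysis, one option on HotSpot-based JVMs is to trigger it from code through `HotSpotDiagnosticMXBean`. This is a sketch under that assumption; the `HeapDumper` class and its method name are hypothetical, and other JVMs may not provide this bean.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

// Hypothetical helper: writes a heap dump for offline analysis (HotSpot only).
final class HeapDumper {

    /**
     * Dumps only live (reachable) objects to the given file.
     * The target file must not already exist; recent JDKs also
     * require the name to end in ".hprof".
     */
    static Path dumpLiveObjects(Path target) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // live = true restricts the dump to objects that are still reachable.
        diagnostics.dumpHeap(target.toAbsolutePath().toString(), true);
        return target;
    }
}
```

Restricting the dump to live objects keeps the file focused on what is retained, which is the part that answers the ownership question. Dumping pauses the application and can produce a file roughly the size of the used heap, so it is usually done on an instance that has been taken out of rotation.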
Pair runtime metrics with code boundaries
Memory debugging becomes faster when you line up memory graphs with release timestamps and traffic features. Watch:
- old gen occupancy
- allocation rate spikes
- full GC frequency
- endpoints or jobs correlated with growth
That creates a shortlist before you open the dump.
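One way to tie GC activity to a specific endpoint or job is to snapshot collector counters before and after it runs and compare the two. The sketch below uses the standard `GarbageCollectorMXBean`; the class `GcSnapshot` and its methods are hypothetical, and collector names vary by JVM and GC configuration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: captures per-collector GC counts for before/after comparison.
final class GcSnapshot {

    /** Current collection count per collector, keyed by collector name. */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() returns -1 when undefined; treat that as 0.
            counts.put(gc.getName(), Math.max(0, gc.getCollectionCount()));
        }
        return counts;
    }

    /** Collections that happened between two snapshots, per collector. */
    static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> diff = new LinkedHashMap<>();
        after.forEach((name, count) -> diff.put(name, count - before.getOrDefault(name, 0L)));
        return diff;
    }
}
```

A job whose delta repeatedly shows old-generation or full collections while similar jobs show none is a reasonable first entry on the shortlist.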
Prevention matters more than heroics
Strong teams add guardrails:
- bounded caches with eviction
- explicit lifecycle cleanup
- load tests that watch memory trend, not only latency
- dashboards that compare post-GC heap from one release to the next
The best leak investigation ends with a design rule that prevents the same class of failure from returning.
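As one example of such a design rule, a cache can enforce its own size limit so that it cannot grow without bound. The sketch below builds a small LRU cache on `LinkedHashMap`; the class name `BoundedLruCache` is hypothetical, and the class is not thread-safe, so concurrent callers would need external synchronization or a dedicated caching library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry past a fixed size.
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `a` was touched more recently. The important property is that the upper bound is part of the type, not a convention that each caller has to remember.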