
Java's Garbage Collection

Many developers are surprised when Java heap size doesn’t go down even after GC frees memory. This article explains why that happens by breaking down JVM heap structure, garbage collection basics, and VisualVM metrics in a beginner-friendly way.

A few days ago, a junior colleague asked me to explain some strange behavior she was seeing in VisualVM. For those who don't know it, VisualVM is a tool that attaches to a running Java process and shows many metrics, most of them related to the heap. In simple terms, the heap size didn't go down even though the used heap had gone down.

This is a classic example of how Java's garbage collection works, and it often surprises junior developers. If you would like to know more about it, keep reading!

But before going into the problem itself and why it occurred, I would like to take the time to explain the basics of Java's garbage collection and the related terminology.

Terminology

  1. JVM: Formally, the Java Virtual Machine (JVM) is a specification that provides a runtime environment in which Java bytecode can be executed. An analogy with ordinary virtual machines (VMs) helps here. If you have Windows installed on your laptop, VM software lets you install and run another operating system, e.g. Linux, which shares the resources of your host system, i.e. your laptop. Similarly, the JVM runs on top of your operating system and provides a layer of abstraction between the Java programs that you write and the OS and native machine code underneath. I will not go into the details of how a JVM actually works, as that would take us in a different direction; perhaps it is a topic for a future post.
  2. Heap: The heap is the area of memory used for dynamic allocation of objects. It is different from stack memory, and objects in the heap usually have a longer lifespan. Whenever you create an object using the new keyword in Java, that object is allocated on the heap. It is explained in more detail below.
  3. Garbage Collection: Garbage collection (GC) is a critical, automated part of memory management in Java. Unlike older languages where programmers are responsible for memory management, Java for the most part takes that burden off the programmer's shoulders. In short, garbage collection is responsible for finding unused objects, reclaiming their memory, and finally defragmenting the heap so that it can be used efficiently. There are two types of GC, minor GC and major GC, which we will look at in the following sections.

Basics of Java's Garbage Collection

If you have ever worked with a low-level programming language such as C, you know how critical memory management is. If it is not done right, you can end up with memory leaks, which can lead to out-of-memory (OOM) issues. The responsibility lies solely with the programmer, which is not a trivial task: programmers must be cautious at all times.

Fast forward to Java, which has come with garbage collection built in from the beginning. What does this mean? It means that in Java you don't have to free memory yourself; that work is mostly done by the garbage collector (GC) that comes with the Java Virtual Machine (JVM).

Responsibilities of GC

The GC in Java is mainly responsible for three things:

  1. Figuring out which objects in the heap are no longer in use, as these are the ones that can be removed
  2. Reclaiming the memory by removing the unused objects from the heap
  3. Defragmenting memory: reclaimed objects can leave holes in the heap that are individually too small to hold new objects, so the free space must be consolidated into a contiguous block

Explanation of Heap structure

By now, you should have a basic idea of what the heap is and what it is for. In this part we will discuss the heap structure in detail, going through its different parts and what they mean.

Here you can see how the Heap memory is divided into many different parts.

Heap Memory Structure

The Young Generation is where all newly created objects are stored at first. As the name suggests, it is for young objects. It has two different parts, which we look at next.

Eden Space is the place within the young generation where newly created objects are first placed. During their life cycle they might move to a survivor space. You might be wondering under which conditions that happens; don't worry, we will come back to it shortly.

Survivor Space: Before going into the details of the survivor space, let's understand what happens when Eden space is full. Eden is where your newly created objects are stored, so if you are creating a large number of new objects and Eden does not have enough room for them, a minor GC kicks in.

You might wonder what a minor GC is. It is the lighter version of GC: it also involves stop-the-world events, but they are usually very short, and in some garbage collectors much of the work happens concurrently. During a minor GC, the objects in Eden space that are still referenced are moved to a survivor space, more specifically S0. This frees up Eden space for new objects. This is what happens in the first minor GC, but it is not the last one.

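Minor GCs usually happen far more often than major GCs. From the second minor GC onward, the surviving objects from both Eden space and S0 are copied into S1, and the two survivor spaces swap roles on each subsequent collection. Every collection an object survives increases its age, and the older an object gets, the further to the right it moves in the diagram.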

Old Generation, as the name suggests, is the place for objects that are old. You might ask what the criterion for “old” is. There is a threshold: if an object in the young generation survives more minor GCs than the threshold, it becomes eligible for promotion to the old generation. For example, a HashMap-based in-memory cache would eventually end up in the old generation.
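To make the aging idea concrete, here is a small toy model of the young generation. It is only a sketch under my own assumptions and not how HotSpot is implemented: the class ToyYoungGen, its Obj record, the live flag and the threshold value are all hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of object aging in the young generation (illustrative only). */
final class ToyYoungGen {
    /** Hypothetical object: a name, an age, and whether anything still references it. */
    static final class Obj {
        final String name;
        int age = 0;
        boolean live = true;
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;             // assumed promotion threshold for this sketch
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();      // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();        // survivor space currently empty
    final List<Obj> old = new ArrayList<>();

    ToyYoungGen(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    /** New objects always start in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of Eden and the occupied survivor space, then swaps the survivor spaces. */
    void minorGc() {
        copyLive(eden);
        copyLive(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) continue;           // dead objects are simply not copied
            o.age++;
            if (o.age > tenuringThreshold) old.add(o);   // survived more GCs than the threshold
            else to.add(o);
        }
    }

    public static void main(String[] args) {
        ToyYoungGen gen = new ToyYoungGen(2);
        gen.allocate("cache");               // stays referenced for the whole run
        gen.allocate("temp").live = false;   // becomes garbage right away
        for (int i = 1; i <= 3; i++) {
            gen.minorGc();
            System.out.println("after minor GC " + i + ": survivors=" + gen.from.size()
                    + ", old=" + gen.old.size());
        }
    }
}
```

In this model the long-lived "cache" object bounces between the survivor spaces until its age exceeds the threshold and then lands in the old generation, while "temp" disappears at the first minor GC.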

As you might guess, when the old generation fills up, a major GC is triggered. A major GC is also accompanied by a stop-the-world event, i.e. the application threads are paused until the collection finishes. Some newer garbage collectors, e.g. Shenandoah and ZGC, have much less impact during a major GC because they do most of their work concurrently, which wasn't the case with older garbage collectors.
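If you want to see these spaces on your own machine, the standard java.lang.management API can list the heap memory pools and the collectors of the running JVM. The sketch below assumes nothing beyond the JDK; the class name HeapLayoutReport is my own. The pool and collector names depend on which garbage collector is active, so treat any names you see as specific to your setup (with G1, for example, the pools are typically called something like G1 Eden Space, G1 Survivor Space and G1 Old Gen).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

/** Prints the heap pools and garbage collectors the current JVM reports. */
final class HeapLayoutReport {
    /** Names of the heap memory pools; the exact names vary by collector. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools: " + heapPoolNames());
        // Each collector bean reports how often it ran and how long it took in total.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms total");
        }
    }
}
```

On a typical application you would expect the young-generation collector to report many more collections than the old-generation one, which matches the minor versus major GC behavior described above.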

How GC knows which objects are referenced or not

GC doesn’t look for “unreferenced objects” directly. Instead, it finds objects that are reachable from GC roots.
Anything not reachable is garbage.

Step 1: GC Roots (the starting points)

The JVM defines a small, well-known set of GC roots that are always considered alive:

Common GC roots:

  • Local variables in thread stacks
  • Method parameters
  • Active threads
  • Static fields
  • JNI references
  • Class metadata

If an object is reachable from any of these, it’s alive.

Step 2: Reachability graph

The heap is treated like a graph:

  • Objects = nodes
  • References = edges

GC performs a graph traversal (usually DFS or BFS):

  1. Start from all GC roots
  2. Follow every reference
  3. Mark every object reached as live

This is called tracing GC.

Anything not reached during this traversal is garbage.
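The traversal above can be sketched in a few lines of Java. This is a toy model and not what the JVM does internally: ToyMarker and Node are hypothetical names, and the "heap" here is just ordinary Java objects pointing at each other. It uses a breadth-first search starting from the roots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy mark phase: finds everything reachable from a set of roots. */
final class ToyMarker {
    /** Hypothetical heap object with outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns the set of nodes reachable from the roots (breadth-first). */
    static Set<Node> mark(List<Node> roots) {
        // Identity-based set: an object is "marked" by identity, not by equals().
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> queue = new ArrayDeque<>();
        for (Node root : roots) {
            if (marked.add(root)) queue.add(root);
        }
        while (!queue.isEmpty()) {
            Node current = queue.poll();
            for (Node next : current.refs) {
                // add() returns false for already-marked objects, so cycles terminate
                if (marked.add(next)) queue.add(next);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        b.refs.add(c);
        c.refs.add(a);   // a cycle among live objects is fine
        d.refs.add(a);   // d points at a live object, but nothing points at d
        for (Node n : mark(Arrays.asList(a))) {
            System.out.println("live: " + n.name);
        }
    }
}
```

Note that d is garbage even though it holds a reference to a live object. What matters is whether something reachable points to d, not what d points to.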

How GC Reclaims the memory

Now we know how the GC determines which objects are alive. It uses this information to move the live objects as explained above, where we learned how objects move from Eden space to a survivor space and ultimately to the old generation. Nothing needs to be done to the dead objects individually: since nothing references them any more, the memory they occupied is simply treated as free and can be reused for new objects.

How GC performs Defragmentation

Garbage collection defragmentation works by moving live objects together so that free memory becomes contiguous. After the GC identifies which objects are still reachable (the marking phase), it relocates those live objects toward one end of the heap or into new regions. Dead objects are simply left behind and discarded. This process is called compaction (in classic collectors) or evacuation (in region-based collectors). Young-generation GCs naturally avoid fragmentation because they copy only live objects into empty survivor spaces, instantly leaving a clean, contiguous area for new allocations.

When objects are moved, the JVM must update every reference that points to them. This includes references in thread stacks, static fields, and other heap objects. Traditional collectors do this during a stop-the-world pause. G1 also moves objects during pauses, although it does much of its marking concurrently, while ZGC and Shenandoah perform most of the relocation concurrently using forwarding pointers and read/write barriers. The end result is a heap with fewer or no holes, faster allocations, and reduced risk of allocation failures due to fragmentation.
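Here is a minimal sketch of sliding compaction with a forwarding table, to show why references have to be rewritten when objects move. Everything in it is an assumption made for illustration: the heap is modelled as a plain array of slots, ToyCompactor and Obj are hypothetical names, each object holds at most one reference stored as a slot index, and the live flags are assumed to come from a marking phase that already ran.

```java
import java.util.Arrays;

/** Toy sliding compaction over an array-based "heap" (illustrative only). */
final class ToyCompactor {
    /** Hypothetical object: holds the heap index of one other object, or -1 for none. */
    static final class Obj {
        final String name;
        int ref;
        Obj(String name, int ref) { this.name = name; this.ref = ref; }
    }

    /**
     * Slides live objects to the start of the heap and rewrites their references.
     * Returns the forwarding table: old index -> new index, or -1 for dead slots.
     */
    static int[] compact(Obj[] heap, boolean[] live) {
        int[] forward = new int[heap.length];
        Arrays.fill(forward, -1);
        int next = 0;
        for (int i = 0; i < heap.length; i++) {        // pass 1: assign new addresses
            if (heap[i] != null && live[i]) forward[i] = next++;
        }
        for (int i = 0; i < heap.length; i++) {        // pass 2: rewrite references
            if (forward[i] >= 0 && heap[i].ref >= 0) {
                heap[i].ref = forward[heap[i].ref];
            }
        }
        for (int i = 0; i < heap.length; i++) {        // pass 3: move the objects
            if (forward[i] >= 0) heap[forward[i]] = heap[i];   // new index never exceeds old index
        }
        Arrays.fill(heap, next, heap.length, null);    // everything from 'next' onward is free
        return forward;
    }

    public static void main(String[] args) {
        Obj[] heap = {
            new Obj("A", 2), new Obj("deadX", -1), new Obj("B", 4),
            new Obj("deadY", -1), new Obj("C", -1)
        };
        boolean[] live = {true, false, true, false, true};
        int[] forward = compact(heap, live);
        System.out.println("forwarding table: " + Arrays.toString(forward));
        for (int i = 0; i < heap.length; i++) {
            System.out.println(i + ": " + (heap[i] == null ? "free" : heap[i].name + " -> " + heap[i].ref));
        }
    }
}
```

References held outside the heap, such as the GC roots, would need the same treatment. In this sketch the caller would look them up in the returned forwarding table.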

The Issue My Junior Faced

Back to the issue I described at the start: the heap size didn't go down even though the used heap had dropped. The used heap dropped because garbage collection freed the unused objects.

However, the JVM does not immediately release heap memory back to the OS.
It keeps the committed heap for future allocations unless there is memory pressure or specific GC heuristics decide to shrink it.
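You can observe the difference between used and committed heap yourself with the standard MemoryMXBean. The sketch below is an experiment, not a guarantee of any particular numbers: the class name HeapSnapshot and the 64 MB allocation size are my own choices, System.gc() is only a request that the JVM may ignore, and whether the committed size shrinks afterwards depends on the collector and its settings. As far as I can tell, "Heap size" in VisualVM corresponds to the committed value and "Used heap" to the used value.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Prints used vs. committed heap before and after dropping a large allocation. */
final class HeapSnapshot {
    /** Current heap usage as reported by the JVM. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    static String describe(String label, MemoryUsage u) {
        long mb = 1024 * 1024;
        // getMax() returns -1 when no maximum is defined, which shows up as 0 MB here.
        return String.format("%-10s used=%d MB, committed=%d MB, max=%d MB",
                label, u.getUsed() / mb, u.getCommitted() / mb, u.getMax() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe("start", heap()));

        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];   // roughly 64 MB in total
        }
        System.out.println(describe("allocated", heap()));

        blocks = null;    // drop the only references to the arrays
        System.gc();      // a hint only; the JVM decides whether and when to collect
        System.out.println(describe("after GC", heap()));
    }
}
```

If your run behaves like the one my junior saw, the used value falls after the GC while the committed value stays roughly where it was.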

Final Thoughts

I hope you enjoyed the article and got some insights as to how Java's garbage collection system works. If you have any comments or suggestions please let me know.

Until then - Happy Learning!