Part 4: Heap Analysis

In this part of the workshop, we are going to capture a heap dump from the JVM and analyze it.

Prerequisites

Java Mission Control (Oracle Java 7u40+)
Eclipse Memory Analyzer

Capture a Heap Dump

As you may be aware by now, there are many different ways of getting data out of the JVM. Here are just a few ways to capture a heap dump.

jcmd

With jcmd, you can invoke the GC.heap_dump command. You must run jcmd as the same OS user that owns the target JVM process.

jcmd <PID> GC.heap_dump my_little_heap_dump.hprof

Note: If you do not specify a full path, jcmd creates the heap dump relative to the directory from which the target JVM process was started, not the directory where you ran jcmd.

jmap

A more traditional approach is using jmap and invoking the command on the target process.

jmap -dump:format=b,file=my_little_heap_dump.hprof <PID>

Core Dump

You can also use jmap to extract a heap dump from a core dump of the process:

sudo gdb --pid=<PID>
gcore /tmp/jvm.core
detach
quit
jmap -dump:format=b,file=my_little_heap_dump.hprof /usr/bin/java /tmp/jvm.core

💡 When capturing a heap dump from a core dump on Java 8, you may run into this issue: JDK-8073606.

HotSpot Diagnostic MBean

You can capture a heap dump by invoking a diagnostic MBean which is included in the Oracle JDK. Open your favorite MBean browser (e.g. jconsole, jmc) and invoke the dumpHeap operation on the com.sun.management.HotSpotDiagnostic MBean. The arguments are:

Filename of the heap dump that will be created. 💡 Note, the file will be owned by the OS user running the target JVM process.
Flag indicating whether to dump only live objects (true dumps only live objects; false dumps all objects, including unreachable ones).
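
The same MBean can also be invoked from code inside the target JVM. The following is a minimal sketch, assuming a HotSpot-based JDK; the HeapDumper class and its tempDumpPath helper are hypothetical names used only for illustration, while HotSpotDiagnosticMXBean.dumpHeap(String, boolean) is the operation exposed by the MBean described above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the name and layout are illustrative only.
class HeapDumper {

    // Writes a heap dump of the current JVM to the given file.
    // live == true dumps only reachable objects (the second MBean argument).
    static void dump(Path file, boolean live) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            // The target file must not already exist; recent JDKs also
            // expect the file name to end in .hprof.
            bean.dumpHeap(file.toString(), live);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a not-yet-existing .hprof path inside a fresh temp directory.
    static Path tempDumpPath() {
        try {
            return Files.createTempDirectory("heap-dump")
                    .resolve("my_little_heap_dump.hprof");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = tempDumpPath();
        dump(file, true);
        System.out.println("Heap dump written to " + file);
    }
}
```

Because the dump is written by the JVM itself, the resulting file has the same ownership behavior noted for the MBean's filename argument.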

Analyze the Heap Dump

As with capturing a heap dump, there are many different tools for looking at one. We will go through three common tools for analyzing a heap dump, all of which are free. 😃

VisualVM

VisualVM is a tool which has been bundled with Oracle's JDK since Java 6 (it is no longer bundled as of JDK 9, but remains available as a standalone download). It was spawned from the NetBeans Profiler, and it has some helpful tools to inspect a heap dump.

To load the heap dump in VisualVM, go to File -> Load... and set the File Format to Heap Dumps.... Once it is loaded, a view is presented with the following sections:

Summary
Classes
Instances
OQL Console

For the purposes of this workshop, we are only going to use the OQL Console to construct queries on the heap. VisualVM has other views which can give you context about your heap, but I feel that other tools, which we will touch on later, do this better. The feature I find attractive in VisualVM is that its OQL Console lets you write JavaScript to query objects and then post-process the results.

OQL Console

From the OQL Console, we will issue a few queries, and then you can play around with the language further.

Query for Thread objects which are daemons (you can get context of Threads, since they are objects in your heap too):

select t from java.lang.Thread t where t.daemon == true

The results returned are essentially hyperlinks, which you can click to begin diving into the state of each instance.
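
For comparison, the same question can be asked of a running JVM rather than a heap dump. This is a minimal sketch, not part of the workshop code: the DaemonThreads class name is hypothetical, and it uses Thread.getAllStackTraces() only as a convenient way to enumerate live threads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class name is illustrative only.
class DaemonThreads {

    // Collects the names of all live daemon threads in the current JVM.
    static List<String> daemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : daemonThreadNames()) {
            System.out.println(name);
        }
    }
}
```

Running this in the service before capturing the dump gives you a list to compare against the OQL results.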

Query for Thread objects (similar to last query), but return multiple attributes as JSON:

select {name: t.name.toString(), thread_id: t.tid }
from java.lang.Thread t
where t.daemon == true

Query which includes a JavaScript function to filter results: it looks at all the ConferenceSession objects and only reports the ones which have tags.

function hasTags(conf) {
  return conf.tags.size > 0
}

filter(heap.objects('cchesser.javaperf.workshop.data.ConferenceSession'), hasTags)

Take that previous query and further expand it to list it in JSON with the number of tags:

function hasTags(conf) {
  return conf.tags.size > 0
}

map(filter(heap.objects('cchesser.javaperf.workshop.data.ConferenceSession'), hasTags), 
 '{ conference: it, tags: it.tags.size }')

We can further expand this to sort the results, descending by number of tags, and list the tags in the results:

function hasTags(conf) {
  return conf.tags.size > 0
}

sort(map(filter(heap.objects('cchesser.javaperf.workshop.data.ConferenceSession'), hasTags), 
 '{ conference: it, tag_count: it.tags.size, tags: toArray(it.tags.elementData) }'), 'rhs.tag_count - lhs.tag_count')
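
These queries read fields directly off the heap objects, including private ones. The names size and elementData suggest that tags is a java.util.ArrayList; that is an assumption, since the ConferenceSession source is not shown here. Under that assumption, the following sketch (with a hypothetical ArrayListLayout class) lists the instance fields ArrayList declares, which are the names an OQL query can navigate.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes 'tags' is a java.util.ArrayList.
class ArrayListLayout {

    // Returns the names of the non-static fields declared by ArrayList itself.
    static List<String> instanceFieldNames() {
        List<String> names = new ArrayList<>();
        for (Field f : ArrayList.class.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints the field names that heap queries such as it.tags.size rely on.
        System.out.println(instanceFieldNames());
    }
}
```

Because the queries depend on internal field names, they may need adjusting if the collection type or the JDK implementation differs.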

Reference: VisualVM OQL Help

💡 Note, the OQL dialect differs between this tool and others; it isn't standardized, so queries generally cannot be moved between tools without changes. VisualVM gives you the ability to leverage JavaScript, which is unique among tools supporting heap dump analysis.

Java Mission Control: JOverflow Plugin

Java Mission Control supports plugins, one of which is the JOverflow plugin. You can install it by going to Help, selecting Install New Software..., and then selecting the JOverflow Heap Analyzer.

Once this is installed, JOverflow will be loaded whenever you open a .hprof file. It also adds a Dump Heap context menu item for JVMs in the JVM Browser.

When you open a heap dump, an initial analysis is completed which lists a large range of anti-patterns and things that can be improved in relation to the memory you utilize.

Anti-patterns

Here is a listing of some of the patterns that JOverflow identifies:

Empty Unused Collections: Empty collection, where modification count == 0.
Empty Used Collections: Empty collection, where modification count != 0.
Small Sparse: Only for array-based collections, where less than half the slots are occupied. Small is considered: size <= default (e.g. 16 for HashMap).
Large Sparse: Only for array-based collections, where less than half the slots are occupied. Large is considered: size > default.
Boxed: Contains boxed Numbers (e.g. java.lang.Integer). Each boxed number carries overhead compared to its primitive counterpart, because it is a separate object (with its own header) that is reached through a reference.
Small Collections: Collections with 1 to 4 elements. Collections have fixed costs, so this data may be better hosted in an array than in a full Java collection type.
Vertical Bar Collections: Collection which is a list of lists, where the outer collection is large, and its elements are all small collections (e.g. List(1000) of List(100)).
Zero Size Arrays: Array where length == 0 (still consumes 12 - 16 bytes).
Vertical Bar Arrays: Similar to Vertical Bar Collections, but for arrays.
Sparse Arrays: Less than half of the elements are not null.
Long Zero Tail Arrays: Ends with a consecutive series of zeros, where the tail length is >= size / 2.
Empty Arrays: Only null elements.
Duplicate Arrays: Where an array's contents are the same as another array's, but they are separate array instances.
Duplicate Strings: Same as Duplicate Arrays, where string1.equals(string2) and string1 != string2.

Reference: JOverflow: Advanced Heap Analysis in Java Mission Control

Filtering

As you play with JOverflow, clicking on something will begin filtering down the content. If you wish to narrow it down to code that you own, you can apply a package filter in the ancestor section. In this example, I want to see duplicate strings for code under cchesser. Here, I can see that our workshop service has several duplicate strings (as this content is replicated on each object):

Regular Session
4-Hour Workshop
8-Hour Workshop

Although the number of instances and the overhead are extremely low here (since this is a small service), this gives you visibility into areas that you could optimize.
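
One possible way to address duplicate strings like these is to share a single instance per distinct value. The following is a minimal sketch under that assumption; the SessionTypeCache class is hypothetical and is not part of the workshop service. It first shows the condition JOverflow flags (equal contents, different instances) and then a simple canonicalizing map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache; not part of the workshop service.
class SessionTypeCache {

    private final Map<String, String> canonical = new HashMap<>();

    // Returns one shared instance for each distinct string value.
    String canonicalize(String value) {
        String existing = canonical.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    public static void main(String[] args) {
        // Two separate instances with equal contents: a "duplicate string".
        String a = new String("Regular Session");
        String b = new String("Regular Session");
        System.out.println(a.equals(b) && a != b);

        // After canonicalizing, both references point at the same instance.
        SessionTypeCache cache = new SessionTypeCache();
        System.out.println(cache.canonicalize(a) == cache.canonicalize(b));
    }
}
```

For a small, fixed set of values such as session types, an enum or String.intern() would be alternative designs; which one fits depends on how the data is loaded.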

Eclipse Memory Analyzer

Eclipse Memory Analyzer is a fairly mature tool for analyzing heap dumps. It is a rich tool that can run OQL queries on the heap, report on general anti-patterns, and present helpful common views (like the dominator tree). It is a separate install, which can run standalone or as a plugin within Eclipse. First, install the tool via its downloads page if you haven't already.

After you have it installed, go to File and then Open Heap Dump... and specify the hprof file that you wish to analyze.

💡 When Eclipse Memory Analyzer loads your hprof file, it creates many index files alongside it to assist in analyzing and navigating through your heap dump efficiently. Because of these secondary files, it may be easier to isolate your hprof file in its own directory before opening it. You can always delete these files from within Eclipse Memory Analyzer by opening the Heap Dump History window, right-clicking the hprof which was previously loaded, and selecting Delete Index Files. Example listing of the index files which get generated, by suffix:

.domIn.index
.domOut.index
.index
.o2ret.index
.a2s.index
.idx.index
.inbound.index
.o2c.index
.o2hprof.index
.outbound.index
.threads

Dominator Tree

A common first place to look at your heap within Eclipse Memory Analyzer is the dominator tree. From the dominator tree, you can sort by retained heap size, and then begin drilling down the tree to see what contributes to the largest retained sizes.

Histogram

The histogram gives you a quick listing of the top consumers by type. Typically this provides context on the large consumers in terms of their "lower-level" types. For example, char[] is a common large contributor (byte[] on Java 9 and later, where String is backed by a byte array), followed by String, a type whose retained size is predominantly the size of its backing array.

Thread Overview

The thread overview is a helpful view when you are looking at contributors based on the execution of code. For example, if you have a JVM which has thrown an OutOfMemoryError, you can tell whether some type of request caused a massive spike that exhausted memory, or whether the requests were not abnormal and there simply wasn't sufficient space to support them.

OQL

Another strong feature of Eclipse Memory Analyzer is the OQL studio, which executes queries to look up objects in your heap.

Look up our ConferenceSession class type:

SELECT *
FROM INSTANCEOF cchesser.javaperf.workshop.data.ConferenceSession

💡 To execute the query, you click on the red exclamation mark (❗). Alternatively, you can press CTRL+Enter on Windows or Command+Enter on macOS.

Because the raw objects are returned, you can expand each one to look further at its contents. If you wanted to report just each object's title attribute and its retained size, you could do this:

SELECT c.title.toString(), c.@retainedHeapSize 
FROM INSTANCEOF cchesser.javaperf.workshop.data.ConferenceSession c

You could then filter this by including a WHERE clause that matches on the title or the abstract:

SELECT c.title.toString(), c.@retainedHeapSize FROM INSTANCEOF cchesser.javaperf.workshop.data.ConferenceSession c 
WHERE c.title.toString() LIKE ".*(J|j)ava.*" OR c.abstract.toString() LIKE ".*(J|j)ava.*"
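
The LIKE operand in this query is a regular expression rather than a SQL wildcard pattern. To try out a pattern before running it against a large heap, it can be checked with java.util.regex. This is a minimal sketch: the TitleFilter class is hypothetical, and it assumes the tool applies the pattern to the whole string, which is how the leading and trailing .* in the query read.

```java
import java.util.regex.Pattern;

// Hypothetical helper for trying out the pattern used in the OQL query.
class TitleFilter {

    // Same pattern as in the WHERE clause: "Java" or "java" anywhere in the text.
    private static final Pattern JAVA = Pattern.compile(".*(J|j)ava.*");

    // True when the whole text matches the pattern (assumed LIKE semantics).
    static boolean mentionsJava(String text) {
        return text != null && JAVA.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(mentionsJava("Modern Java in Action"));
        System.out.println(mentionsJava("javascript tips"));
        System.out.println(mentionsJava("Kotlin coroutines"));
    }
}
```

Note that the pattern is case-sensitive apart from the first letter, so a title containing only "JAVA" would not match.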

Reference: Eclipse Memory Analyzer OQL Syntax
