Classloader-Related Memory Issues
Chapter: Memory Management
Sometimes I think the Java classloader is to Java what DLL hell was to Microsoft Windows. Modern enterprise Java applications often load thousands of classes, use several isolated classloaders, and generate classes on the fly. Java classloader issues lead to the same runtime problems as DLL hell (multiple versions of the same class or method), and they also lead to memory leaks and shortages that need to be addressed in any book about Java performance.
When there are memory problems, one thinks primarily of normal objects. However, in Java, classes are objects too and are managed on the heap as well. In the HotSpot JVM, classes are located in the permanent generation, or PermGen (see the section Not All JVMs Are Created Equal). This applies to HotSpot releases before Java 8; later releases replaced the PermGen with Metaspace, which lives in native memory. The PermGen is a separate memory area, and its size must be configured separately. If this area is full, no more classes can be loaded and an out-of-memory error occurs in the PermGen. Other JVMs do not have a permanent generation, but that does not solve the problem, as classes can still fill up the heap. Instead of a PermGen out-of-memory error, which at least tells us that the problem is class-related, we get a generic out-of-memory error.
Here we'll cover the most common classloader-related memory issues (plus one rare but educational one), and how to identify and solve them.
Large Classes
A class is an object and consumes memory. Depending on the number of fields and class constants, the needed memory varies. Too many large classes, and the heap is maxed out.
Quite often the root cause of large classes is too many static class constants. Although it is a good approach to keep all literals in static class constants, we should not keep them all in a single class. In one case a customer had one class per language to contain all language-specific literals. Each class was quite large and memory-hungry. Due to a coding error the application was loading not just one, but every language class during startup. Consequently, the JVM crashed.
The solution is quite simply to split large classes into several smaller ones, especially if you know that not all of the constants are needed at the same time. The same is true for class members. If you have a class with 20 members, but depending on the use case you use only a subset, it makes sense to split the class. Unused members still increase the size of the class!
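To make the splitting advice concrete, here is a minimal sketch, assuming one small holder class per language. The class names and literals are hypothetical and not taken from the customer's application. It relies on the fact that the JVM initializes a class only on first use, so holders that are never touched are never initialized (and HotSpot typically does not load them either).

```java
import java.util.ArrayList;
import java.util.List;

public class LazyLiterals {
    // Records which holder classes were initialized; for demonstration only.
    public static final List<String> INITIALIZED = new ArrayList<>();

    // Hypothetical holder for English literals.
    static class English {
        static { INITIALIZED.add("English"); }
        static String greeting() { return "Hello"; }
    }

    // Hypothetical holder for German literals.
    static class German {
        static { INITIALIZED.add("German"); }
        static String greeting() { return "Hallo"; }
    }

    // Touches only the holder that matches the requested language.
    public static String greeting(String lang) {
        return "de".equals(lang) ? German.greeting() : English.greeting();
    }

    public static void main(String[] args) {
        System.out.println(greeting("de"));   // prints "Hallo"
        System.out.println(INITIALIZED);      // prints "[German]"
    }
}
```

Note that the literals are returned from methods on purpose: a `static final String` initialized with a literal is a compile-time constant, which the compiler inlines at the use site, so reading it would not show the lazy behavior.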
Single Class in Memory Multiple Times
It is the very purpose of a classloader to load classes in isolation from each other. Application servers and OSGi containers use this feature of classloaders to load different applications or parts of applications in isolation. This makes it possible to load multiple versions of the same library for different applications. However, a configuration error can easily cause the same version of a library to be loaded multiple times. This increases the memory demand without any added value and can lead to performance problems as well.
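The isolation described above can be reproduced in a few lines. The following is a sketch under stated assumptions: the class name `demo.Shared` and the helper names are hypothetical, and the class bytes are written by hand as a minimal class file (no fields, no methods) so that the example does not depend on any JAR on disk. Two loaders define the same bytes, and the JVM ends up with two distinct classes that share a name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class IsolatedLoaders {

    /** Defines a class directly from bytes instead of asking its parent for it. */
    static final class DefiningLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Writes a minimal class file: a public class without fields or methods. */
    static byte[] minimalClassBytes(String name) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);                 // magic number
            out.writeShort(0);                        // minor version
            out.writeShort(52);                       // major version (Java 8)
            out.writeShort(5);                        // constant pool count (entries + 1)
            out.writeByte(7); out.writeShort(2);      // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(name.replace('.', '/')); // #2 Utf8
            out.writeByte(7); out.writeShort(4);      // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object");     // #4 Utf8
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                        // this_class = #1
            out.writeShort(3);                        // super_class = #3
            out.writeShort(0);                        // no interfaces
            out.writeShort(0);                        // no fields
            out.writeShort(0);                        // no methods
            out.writeShort(0);                        // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** True if two loaders produce equally named but distinct classes. */
    public static boolean sameNameDifferentClass() {
        byte[] bytes = minimalClassBytes("demo.Shared");
        Class<?> first = new DefiningLoader().define("demo.Shared", bytes);
        Class<?> second = new DefiningLoader().define("demo.Shared", bytes);
        return first.getName().equals(second.getName()) && first != second;
    }

    public static void main(String[] args) {
        System.out.println(sameNameDifferentClass()); // prints "true"
    }
}
```

Each copy occupies its own space in class memory, which is exactly what happens when a shared library is deployed once per application instead of once per server.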
A customer ran a service-oriented architecture application on Microsoft Windows 32-bit JVMs. His problem: he needed to assign 700 MB to the PermGen, but 32-bit Windows does not allow more than ~1500 MB per Java process. This did not leave him with enough room for the application itself.
Each service was loaded in a separate classloader without using the shared classes jointly. All common classes, about 90% of them, were loaded up to 20 times. The result was a PermGen out-of-memory error 45 minutes after startup.
I was able to identify this by getting a class histogram from the JVM in question (jmap -histo). If a class is loaded multiple times, it is listed once per loaded copy, and each entry has its own instance count. So if we see a single class name several times with different counters, we know that it was loaded multiple times. I subsequently requested a full heap dump and analyzed the references to the classes that were loaded multiple times. I found that the same JAR file was loaded via different classloaders!
Another symptom was that calls from one service to another would serialize and deserialize the service parameters, even if those calls happened within the same JVM (I could see this as a hot spot in the performance analysis of those service calls). Although the different applications did use the same classes, they resided in different classloaders, so the service framework had to treat them as different. The service framework solved this by passing the parameters by value and not by reference. In Java this is done by serializing and deserializing.
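Scanning a long histogram by eye is tedious, so the check can be scripted. The sketch below assumes the usual `jmap -histo` row layout (rank, instance count, byte count, class name); the exact column layout can vary between JVM versions, and the sample rows in `main` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HistogramDuplicates {
    // Assumed row layout: "  12:   1200   57600  com.example.Order"
    private static final Pattern ROW =
            Pattern.compile("^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+).*$");

    /** Returns class names that occur in more than one row, with their instance counts. */
    public static Map<String, List<Long>> duplicates(List<String> histogramLines) {
        Map<String, List<Long>> counts = new TreeMap<>();
        for (String line : histogramLines) {
            Matcher m = ROW.matcher(line);
            if (m.matches()) {
                counts.computeIfAbsent(m.group(3), k -> new ArrayList<>())
                      .add(Long.parseLong(m.group(1)));
            }
        }
        // Keep only the names that were listed at least twice.
        counts.values().removeIf(rows -> rows.size() < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical histogram rows, not real output.
        List<String> sample = Arrays.asList(
                " num     #instances         #bytes  class name",
                "   1:          1200          57600  com.example.Order",
                "   2:           800          38400  com.example.Order",
                "   3:           300           7200  java.lang.String");
        System.out.println(duplicates(sample)); // prints "{com.example.Order=[1200, 800]}"
    }
}
```

A duplicate name is only a hint: the heap dump is still needed to see which classloaders own the copies.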
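What "passing by value through serialization" costs becomes clearer when it is written out. This is a minimal sketch of the round trip, not the service framework's actual code; the class and method names are my own. A real framework would additionally resolve the classes in the receiver's classloader while reading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

public class ByValueCopy {

    /** Copies an object graph by writing it to bytes and reading it back. */
    public static Object copy(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy parameter", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> parameters = new ArrayList<>(Arrays.asList("a", "b"));
        Object received = copy(parameters);
        System.out.println(received.equals(parameters)); // prints "true": same content
        System.out.println(received == parameters);      // prints "false": different object
    }
}
```

Every call pays for a full traversal, a byte buffer, and a second object graph, which is why the round trip showed up as a hot spot.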
I remedied the problem by changing a configuration switch in the JBoss deployment files of the services. The deployment file defined which JAR files should be loaded in isolation and which should be shared. By simply setting the common JAR files to shared, the memory demand of the PermGen dropped to less than 100 MB.
Classloader Leaks
Especially in application servers and OSGi containers, there is another form of memory leak: the classloader leak. As classes are referenced by their classloaders, they get removed when the classloader is garbage-collected. That will happen only when the application gets unloaded. Consequently, there are two general forms of classloader leak:
Classloader Cannot Be Garbage-Collected
A classloader will be removed by the garbage collector only if nothing else refers to it. All classes hold a reference to their classloader, and all objects hold references to their classes. As a result, if an application gets unloaded but one of its objects is still being held (e.g., by a cache or a thread-local variable), the underlying classloader cannot be removed by the garbage collector!
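The reference chain described above (object, then class, then classloader) and the thread-local case can be sketched as follows. The `RequestContext` class and the method names are hypothetical. The point of the sketch is the `finally` block: threads in a server's pool outlive the application, so a value left in a thread-local keeps its class, and with it the application's classloader, reachable.

```java
public class ThreadLocalCleanup {

    // Hypothetical per-request state; in a server this class would be
    // loaded by the application's classloader.
    static final class RequestContext {
        final String user;
        RequestContext(String user) { this.user = user; }
    }

    private static final ThreadLocal<RequestContext> CONTEXT = new ThreadLocal<>();

    /** The loader that stays reachable for as long as the given object is referenced. */
    public static ClassLoader loaderPinnedBy(Object object) {
        return object.getClass().getClassLoader();
    }

    /** Handles one request and clears the thread-local before the thread is reused. */
    public static String handle(String user) {
        CONTEXT.set(new RequestContext(user));
        try {
            return "hello " + CONTEXT.get().user;
        } finally {
            CONTEXT.remove(); // without this, the pooled thread keeps the context alive
        }
    }

    /** True if the current thread no longer holds a context. */
    public static boolean contextCleared() {
        return CONTEXT.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(handle("ann"));      // prints "hello ann"
        System.out.println(contextCleared());   // prints "true"
    }
}
```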
This will happen only if you redeploy your application without restarting the application server. The JBoss 4.0.x series suffered from just such a classloader leak. As a result I could not redeploy our application more than twice before the JVM would run out of PermGen memory and crash.
To identify such a leak, un-deploy your application and then trigger a full heap dump (make sure to trigger a GC before that). Then check if you can find any of your application objects in the dump. If so, follow their references to their root, and you will find the cause of your classloader leak. In the case of JBoss 4.0 the only solution was to restart for every redeploy.
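A heap dump can be requested from the command line (for example with jmap) or from inside the JVM. The sketch below shows the second option and assumes a HotSpot-based JVM, since it uses the HotSpot-specific `HotSpotDiagnosticMXBean`; the file name is my own choice. Passing `true` restricts the dump to live objects, so unreachable objects are left out, which covers the "GC first" advice above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumper {

    /** Writes a dump of all live objects into the given directory and returns its path. */
    public static Path dumpLiveObjects(Path directory) {
        // The target file must not exist yet; recent JDKs also expect the .hprof suffix.
        Path target = directory.resolve("after-undeploy.hprof");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), true); // true = live objects only
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path directory = Files.createTempDirectory("heapdump");
        System.out.println("Heap dump written to " + dumpLiveObjects(directory));
    }
}
```

Open the resulting file in a heap analyzer and search for classes from the un-deployed application; any hit is a candidate for the leak.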
Leaking Class Objects
The second classloader leak version is even nastier. It first came into existence with now-popular bytecode-manipulation frameworks, like BCEL and ASM. These frameworks allow the dynamic creation of new classes. If you follow this thought you will realize that classes, just like objects, can be created and subsequently forgotten by the developer. The code might create new classes for the same purpose multiple times. You will get a nice classloader leak if either the class or its object remains referenced. The really bad news is that most heap-analyzer tools do not point out this problem; we have to analyze it manually, the hard way. This form of memory leak became famous due to an issue in an old version of Hibernate and its usage of CGLIB.
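One way to avoid creating a new class for the same purpose again and again is to remember what has already been generated. The following is a sketch only: the `generator` function stands in for whatever bytecode framework is in use, and the idea of keying by a "purpose" string is my assumption, not something prescribed by BCEL, ASM, or CGLIB.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class GeneratedClassCache {
    private final Map<String, Class<?>> byPurpose = new ConcurrentHashMap<>();
    private final Function<String, Class<?>> generator;

    /** The generator is a placeholder for the real class-generation call. */
    public GeneratedClassCache(Function<String, Class<?>> generator) {
        this.generator = generator;
    }

    /** Generates a class on the first request and reuses it afterwards. */
    public Class<?> classFor(String purpose) {
        return byPurpose.computeIfAbsent(purpose, generator);
    }

    /** Number of distinct classes generated so far. */
    public int size() {
        return byPurpose.size();
    }
}
```

The cache itself holds strong references to the classes, so it should live and die with the application that owns it. Kept in a server-wide static field, it would turn into the first kind of classloader leak.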
One way to identify such a problem is to check a full heap dump for the leaked classes. If the generated classes share a common naming pattern, you should be able to search for them. You can then check if you find multiple classes with the same pattern, where you know you should only have one. From there you should be able to find the root reference easily by traversing the references.
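Once the class names are exported from the heap dump, the search for a naming pattern can be automated. This sketch assumes CGLIB-style names of the form `Base$$EnhancerByCGLIB$$<hex>`; other frameworks use other patterns, so the regular expression would need to be adapted. The class names in the usage example are invented.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GeneratedClassGrouping {
    // Assumed suffix of generated classes; adjust to the framework in use.
    private static final Pattern SUFFIX =
            Pattern.compile("\\$\\$EnhancerByCGLIB\\$\\$[0-9a-f]+$");

    /** Counts generated classes per base class; non-generated names are ignored. */
    public static Map<String, Integer> countByBaseName(List<String> classNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : classNames) {
            Matcher m = SUFFIX.matcher(name);
            if (m.find()) {
                counts.merge(name.substring(0, m.start()), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

For example, a list containing `com.example.Order$$EnhancerByCGLIB$$1a2b3c4d` and `com.example.Order$$EnhancerByCGLIB$$5e6f7a8b` yields a count of 2 for `com.example.Order`. Any count above the number you expect is a reason to follow the references of those classes in the dump.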
Same Class Being Loaded Again and Again
Lastly I want to describe a phenomenon where the same class is loaded repeatedly without it being in memory multiple times. It is not a common phenomenon, but neatly shows why it is important to know about the behaviors of different JVMs.
Contrary to popular belief, classes will be garbage-collected! The HotSpot JVM does this only during a real major GC (see the earlier discussion of major vs. minor GCs), whereas both the IBM WebSphere JVM and the JRockit JVM might do it during every GC. If a class is used for only a short time, it might be released immediately (like every other temporary object). Loading a class is not exactly cheap and usually not optimized for concurrency. In fact, most JVMs will synchronize this, which can really kill performance!
I have seen this twice in my career so far. In one specific case, the classes of a script framework (BeanShell) were loaded and garbage-collected repeatedly while the system was under heavy load. Since multiple threads were doing this, it led to a global synchronization point that I could identify by analyzing the locking behavior of my threads (leveraging multiple thread dumps). However, the development happened exclusively on the Oracle HotSpot JVM. As mentioned, the HotSpot JVM garbage-collects classes only in a major GC, so the problem never occurred during the development cycle. The production site, on the other hand, used an IBM WebSphere JVM, and there the issue happened immediately. The lesson learned was that not all JVMs are equal.
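Whether classes are being unloaded and reloaded can be observed with the standard `ClassLoadingMXBean`. The snapshot class below is my own small wrapper, offered as a sketch: take a snapshot at regular intervals, and if both the total-loaded and the unloaded counters keep climbing while the currently-loaded count stays flat, the JVM is loading the same classes again and again.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassLoadingSnapshot {
    public final int currentlyLoaded; // classes loaded right now
    public final long unloaded;       // classes unloaded since JVM start
    public final long totalLoaded;    // classes loaded since JVM start

    private ClassLoadingSnapshot(int currentlyLoaded, long unloaded, long totalLoaded) {
        this.currentlyLoaded = currentlyLoaded;
        this.unloaded = unloaded;
        this.totalLoaded = totalLoaded;
    }

    /** Reads the current class-loading counters of this JVM. */
    public static ClassLoadingSnapshot take() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        int current = bean.getLoadedClassCount();
        long unloaded = bean.getUnloadedClassCount();
        long total = bean.getTotalLoadedClassCount();
        return new ClassLoadingSnapshot(current, unloaded, total);
    }

    @Override
    public String toString() {
        return "loaded=" + currentlyLoaded + ", unloaded=" + unloaded
                + ", totalLoaded=" + totalLoaded;
    }

    public static void main(String[] args) {
        System.out.println(take());
    }
}
```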
As a solution I simply cached the main object (the BeanShell Interpreter). By ensuring that the main object was not garbage-collected, I made sure that all required classes were kept alive and the synchronization issue was gone.
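The fix can be sketched with a small generic holder. This is not the original code: the holder name is mine, and the factory passed in stands for the creation of the BeanShell interpreter, which I leave out to keep the example free of dependencies. Whether one interpreter may be shared between threads is an assumption that has to be checked against the framework in use.

```java
import java.util.function.Supplier;

public class KeepAlive<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    /** The factory creates the expensive object, e.g. a script interpreter. */
    public KeepAlive(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Creates the object once and then always returns the same instance. */
    public T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

As long as the holder is reachable, the cached object is reachable, and so are its class and every class it depends on. None of them can be unloaded, so none of them has to be loaded again under load.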
Classloader problems are hard to spot. The reason is not really a technical one. Most developers simply never have to deal with this topic, and tool support is at its poorest in this area. Once you know what to look for, you can easily identify classloader problems with a combination of trending and full heap dumps.