Archive for November, 2007
Leak Patterns: new Thread
One of the most surprising memory leaks I found looked like this:
Thread t = new Thread();
Of course, that’s not exactly the whole story, but it illustrates the point that just knowing where the leaky object is referenced is not always enough to find the cause of the leak.
So why does new Thread() leak? It turns out that WLS puts some important (internal) information on the thread as InheritableThreadLocal instances. Some of these references may contain references to the application’s ClassLoader. Since they are inheritable, they will be passed to the new thread.
And if that new thread outlives the application ClassLoader (that is, the application is undeployed before the thread stops), then you have a leak.
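The inheritance behavior can be seen in a few lines of plain Java. This is only a minimal sketch: `InheritedContextDemo` and `CONTEXT` are made-up names standing in for whatever the container stores on the thread, not actual WLS internals.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; the names are illustrative, not WLS internals.
class InheritedContextDemo {
    // Stands in for container-internal state stored on the thread.
    static final InheritableThreadLocal<Object> CONTEXT = new InheritableThreadLocal<>();

    // Returns the value a freshly created child thread sees in CONTEXT.
    static Object valueSeenByChild(Object parentValue) {
        CONTEXT.set(parentValue);
        try {
            AtomicReference<Object> seen = new AtomicReference<>();
            // Inheritable values are copied when the Thread object is constructed.
            Thread child = new Thread(() -> seen.set(CONTEXT.get()));
            child.start();
            child.join();
            return seen.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Clearing the parent's value does not clear the child's copy.
            CONTEXT.remove();
        }
    }

    public static void main(String[] args) {
        Object appScoped = new Object(); // imagine this references the app ClassLoader
        System.out.println(valueSeenByChild(appScoped) == appScoped); // prints true
    }
}
```

The child thread never asked for the value, yet it holds a reference to it for as long as the thread lives. If that value leads back to the application's ClassLoader, the thread anchors the ClassLoader.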
Of course, a good J2EE developer will recognize that creating threads is not allowed by the specification - you should instead use something like WorkManager. And that is correct.
But I am in a somewhat different environment, where we are building what amounts to a container. So, while we are doing J2EE, we are also building the container to support J2EE components. So we sometimes walk a fine line on issues like this.
Technorati Tags: leak, memory, WebLogic
Setting up YourKit for WebLogic
Setting up YourKit for WLS is pretty easy, and reasonably clear in their documentation. Since I work with not-yet-released versions of WLS, the automatic configuration scripts for various profilers usually don’t work for me, so I usually configure things manually.
For WLS and YourKit, you just need to do 3 things:
- Set the library path
- Set the agentlib option
- Start the server with debugging off
To do this in Linux, it looks something like this:
$ export LD_LIBRARY_PATH='/opt/yjp-6.0.16/bin/linux-x86-32/'
$ export JAVA_OPTIONS='-agentlib:yjpagent'
$ ./startWebLogic.sh nodebug
Once the server starts, you can connect to it from the YourKit UI, trigger the leak, and take a memory snapshot.
I have been looking for application ClassLoader leaks, and I know that the WLS GenericClassLoader has a String called annotation that holds the application name. So with the snapshot open, I pick Memory -> Strings by Pattern and enter the application name.
I happen to know how the annotation is formatted, so I actually search for the regular expression “^myApp@$”. That will find the string, and thus the GenericClassLoader that is leaked. From there you can easily see what is holding the ClassLoader and why it is a leak.
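To show why the anchors in that pattern matter, here is a small sketch using java.util.regex. The candidate strings are invented for illustration; they are not taken from a real snapshot, and YourKit's own matching may differ in detail.

```java
import java.util.regex.Pattern;

// Hypothetical demo of the anchored pattern; "myApp" is the example application name.
class AnnotationSearchDemo {
    static final Pattern ANNOTATION = Pattern.compile("^myApp@$");

    // True only when the whole string is exactly "myApp@".
    static boolean matches(String candidate) {
        return ANNOTATION.matcher(candidate).find();
    }

    public static void main(String[] args) {
        // Made-up strings of the kind that might show up in a heap snapshot.
        String[] candidates = { "myApp@", "myApp", "myApp@extra", "deploy myApp@" };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + matches(candidate));
        }
    }
}
```

Without the ^ and $ anchors, every string that merely mentions the application name would match, and a large snapshot contains plenty of those.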
Technorati Tags: YourKit, WebLogic
Finding the root cause
Finding a leak is only the first step. Usually, the leak will be some Map or field that holds some Object or Class or whatever that eventually anchors the ClassLoader. Using a profiler like YourKit, you can find the cause of the leak. But the real solution to the problem requires you to figure out:
- Where was this anchor established (created)?
- Why is it there?
- Is it necessary, or can it go away (or be done some other way)?
- For example, can it be held weakly (WeakReference)?
- If it must remain, how should it get cleaned up?
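As an illustration of the "held weakly" option in the list above, here is a minimal sketch. `WeakLoaderHolder` is a hypothetical system-level class, not something from WLS or WLP, and whether a weak reference is appropriate depends on who else is expected to keep the object alive.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical system-level service that remembers an application's ClassLoader.
class WeakLoaderHolder {
    private final WeakReference<ClassLoader> loaderRef;

    WeakLoaderHolder(ClassLoader loader) {
        // A weak reference does not keep the loader alive on its own.
        this.loaderRef = new WeakReference<>(loader);
    }

    // Returns the loader, or null once it has been garbage collected.
    ClassLoader loaderOrNull() {
        return loaderRef.get();
    }

    public static void main(String[] args) {
        // An empty URLClassLoader stands in for the application's ClassLoader.
        ClassLoader appLoader = new URLClassLoader(new URL[0]);
        WeakLoaderHolder holder = new WeakLoaderHolder(appLoader);
        // While appLoader is strongly referenced here, the holder still sees it.
        System.out.println(holder.loaderOrNull() == appLoader);
    }
}
```

Callers must handle the null case, since the referent can disappear at any time once nothing else holds it strongly.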
Sometimes, the answer to the above is obvious. But other times, the first question (where was it established) is difficult to see.
YourKit and other profilers have the ability to record allocation stack traces (where the stack trace of every object allocation is recorded). This is one way to go, but it seriously slows down the running of the server (it is off by default in YourKit, for example). I also found that in a system as large as WebLogic Portal, I never really got adequate data because there was just way too much for the profiler to record (so it often gave up before I saw what I needed).
The next option would be to run with a debugger, assuming you have the appropriate source available in your IDE, etc.
But often the easiest way to go is to add some System.out.println and Thread.dumpStack calls in strategic places in the code. Then you can see the call stack of the one place you are interested in without having to record all allocations or figure out how to run with a debugger.
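A sketch of what such instrumentation might look like follows. `SuspectRegistry` and its `register` method are hypothetical; in practice you would add the two diagnostic lines to whatever code establishes the anchor you found in the profiler.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry suspected of anchoring application objects.
class SuspectRegistry {
    static final Map<String, Object> ENTRIES = new ConcurrentHashMap<>();

    static void register(String key, Object value) {
        // Temporary diagnostics: report what is being stored and which loader it came from.
        System.out.println("register: " + key + " -> " + value.getClass().getName()
                + " (loader=" + value.getClass().getClassLoader() + ")");
        // Writes the current call stack to System.err, showing who is registering.
        Thread.dumpStack();
        ENTRIES.put(key, value);
    }

    public static void main(String[] args) {
        register("example", new Object());
    }
}
```

Note that Thread.dumpStack writes to System.err while println writes to System.out, so the two may land in different server log files.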
Once you find that root cause, you can move on to a reproducible test case, and then a fix.
Technorati Tags: leak, memory, portal, WebLogic
Finding leaks
This leak business turned into a huge effort. The nature of these leaks was such that the application’s ClassLoader was held (directly or indirectly) after it was supposed to be let go. The ways of holding a reference to a ClassLoader are numerous, and not always obvious.
Like I mentioned in my last post, the project started with us looking to increase iterative development performance (and therefore centered on redeploy times). We quickly discovered that redeploys were destabilizing the server: you could do about 3 or 4 redeploys before Hotspot would crash with a PermGen-related OutOfMemoryError. After that, the JVM was in a terrible state.
So we switched to JRockit, which was much better - but still would slow down after about 8 or 9 redeploys. JRockit doesn’t have the same kind of PermGen that Hotspot does (JRockit loads classes in heap, where they are treated by the gc like any other object).
It became obvious that something was leaking memory, and that was causing Hotspot to crash, and JRockit to swap. And since the leak happened with redeploys, I knew that the application’s ClassLoader would be the main leak symptom.
A leak in Java happens because some object is still referenced, and so cannot be collected by the garbage collector. There is a “Path to GC Root” to that object. Such paths typically start from:
- A static field in a class loaded by a parent ClassLoader (e.g. the System ClassLoader).
- Something held by a Thread stack.
- Something else the GC has decided not to get rid of (objects not yet finalized, those in a PhantomReference queue, and possibly SoftReferences).
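The static-field case in the list above is easy to underestimate, because holding any object also holds its Class, and through that its ClassLoader. The following sketch walks that chain. `SystemLevelCache` is a hypothetical stand-in for a class loaded by a parent ClassLoader; the example runs in a single ClassLoader, so it only shows the reference chain, not an actual leak.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a class loaded by a parent (system) ClassLoader.
class SystemLevelCache {
    // Everything reachable from this static field stays reachable
    // for as long as this class itself stays loaded.
    static final List<Object> ENTRIES = new ArrayList<>();

    // Follows the implicit chain object -> Class -> ClassLoader.
    static ClassLoader loaderAnchoredBy(Object entry) {
        return entry.getClass().getClassLoader();
    }

    public static void main(String[] args) {
        // Pretend this instance was created by application code.
        Object appObject = new SystemLevelCache();
        ENTRIES.add(appObject);
        for (Object entry : ENTRIES) {
            System.out.println(entry.getClass().getName()
                    + " anchors " + loaderAnchoredBy(entry));
        }
    }
}
```

In a real container, a single application object left in such a list is enough to keep the whole application ClassLoader, and every class it loaded, from being collected.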
Obviously we need some memory profiler tool. I first tried Optimizeit, since we had licenses available. This did help me find one leak, but it took forever (at least a day), and was terribly painful. The problem was that Optimizeit incorrectly identified any static field as a path to GC root. But most of these static fields were actually held by the application’s ClassLoader, and would be GC’ed if the ClassLoader were released. So I had a load of red herrings to wade through.
I also tried JRockit’s Memory Leak detector tool, but this too was way too painful, as it is not really aimed at this sort of problem. There was this great visual tool for viewing references, but with such a large system as ours, it also presented a load of irrelevant stuff to wade through.
On the suggestion of one of the JRockit devs, I tried YourKit, and this one nailed it. It correctly reports real paths to GC root, and has a neat “Find String” search. So now (since I know what strings to look for), finding a ClassLoader leak takes me about 5 minutes (and most of that time is spent grabbing the snapshot).
Next comes the hard part: once you find the reason for a leak, you have to figure out why it happened and how to fix it. That’ll be next.
Technorati Tags: leak, memory, portal, WebLogic
Hunting J2EE application memory leaks
The last few months, I have been working on a performance project for WebLogic Portal 10.2.
It started as a project to increase performance, but we quickly determined that there was a larger roadblock that had to be cleared first. Over the last couple of releases, we have let more memory leaks back into the product. We had worked really hard to clear these out a while ago, but did not follow up with testing - so we left the door open and more leaks got in.
We have now cleared all the leaks we have found, and are in the process of establishing a test environment to keep them from coming back.
These leaks have some characteristics that I found interesting:
- The leaks we were targeting happen upon redeploy of an application.
- They result from (unintentional) retained references to the application’s ClassLoader.
- Since Hotspot holds classes in a fixed-size Permanent Generation, the leaks eventually (after a few redeploys) result in a serious and unrecoverable JVM failure.
- The leaks are surprisingly easy to trigger, with innocent-looking code.
The good news for most J2EE developers is that these leaks are generally not possible for an application to trigger (assuming you follow good J2EE coding practices). We (WLS and WLP) see them because we implement services that live in the system classpath (rather than in the application’s ClassLoader).
Over the next several posts, I’ll outline some of my findings and some tips for hunting such leaks.
Technorati Tags: memory, portal, WebLogic