Dave Landers

Dave’s thoughts (such as they are)

Leak Testing

So, here we are. We have found lots of memory leaks, and drastically improved the redeploy experience for WLP 10.2 (and the fixes are being backported to previous releases).

This is not the first time we’ve been here, but hopefully it is the last.

We are putting in place a test environment to keep these leaks plugged. Actually, to be more precise, we’re adding some instrumentation to our integration test suite that drives tests via the IDE. So this test will behave just like a developer – although it’s a rather anti-social and uninteresting developer :).

Note that this technique is not restricted to WLP or redeployment testing, so hopefully you can find it useful for other situations.

Counting ClassLoader Instances

The basis of this technique is that we know that the leaks we are interested in happen to always hold on to the application or webapp ClassLoader. Further, we also know that these ClassLoaders are instances of GenericClassLoader (for the app) or ChangeAwareClassLoader (for the webapp).

So, our instrumentation relies on being able to count the instances of these classes. Fortunately, JRockit can do this easily, without any extra messing about with the startup settings or whatnot.

JRockit comes with this handy jrcmd command. It lets you interact with a running JRockit JVM, like for example the WLS server. You can use it to do all sorts of interesting things:

  • find out the JVM’s command line (how it was started)
  • print thread dumps
  • change the verbosity of the JVM on the fly (things like garbage collector stats)
  • trigger a garbage collection
  • print a GC report
  • print diagnostics about memory usage

All sorts of cool things – and none of these require you to change anything about your running process. No command line switches or anything messy like that.

So, we are looking for a count of the instances of our ClassLoader objects. We do that by running:

jrcmd <wls server pid> heap_diagnostics

(You get the wls server pid by running just jrcmd – it will print the process ids of running JRockit instances.)

If you run that, you’ll see tons and tons of output – including a listing of every class loaded by the system and how many instances there are. So all we need to do is search for the class we are interested in:

jrcmd <wls server pid> heap_diagnostics | grep '%.*GenericClassLoader'

That will grep for the one line we are interested in (turns out it contains a percent sign; why is not important here). We’ll get output something like this:

     0.0% 10k      150    +10k weblogic/utils/classloaders/GenericClassLoader

In the above, we see that there are 150 instances of the GenericClassLoader.
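If you want just the number (say, for a script), something like the following might do. This is a minimal sketch, not what our test suite actually runs: count_instances is a made-up helper name, and it assumes the instance count is always the third column, which is inferred only from the one sample line above.

```shell
# Hypothetical helper: reads heap_diagnostics output on stdin and prints the
# instance count for one class (simple name, e.g. GenericClassLoader).
# Assumption: columns are percent, size, count, delta, class name, as in the
# sample line above.
count_instances() {
    awk -v cls="$1" '
        $1 ~ /%$/ {
            # The class name is the last field, with "/" as package separator.
            n = split($NF, parts, "/")
            if (parts[n] == cls) { print $3; exit }
        }
    '
}

# Try it on the sample line from above (no running server needed).
sample='     0.0% 10k      150    +10k weblogic/utils/classloaders/GenericClassLoader'
printf '%s\n' "$sample" | count_instances GenericClassLoader   # prints 150
```

Against a live server you would pipe the real output through it: jrcmd <wls server pid> heap_diagnostics | count_instances GenericClassLoader. Matching the last path component exactly also keeps similarly named classes out of the count.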

To get a more accurate count, we also want to trigger a full garbage collection and finalization before counting anything. We can do that with jrcmd also:

jrcmd <wls server pid> runfinalization
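The two commands can be wrapped together so that every measurement is taken the same way. Again, just a sketch: snapshot_loaders is a hypothetical name, and it uses only the two jrcmd commands shown above.

```shell
# Hypothetical helper: run finalization, then print the heap_diagnostics
# lines for both ClassLoader classes. $1 is the WLS server pid.
snapshot_loaders() {
    pid=$1
    # Run finalization first so the counts are more accurate.
    jrcmd "$pid" runfinalization >/dev/null
    # Same grep as before, extended to match either class name.
    jrcmd "$pid" heap_diagnostics |
        grep -E '%.*(GenericClassLoader|ChangeAwareClassLoader)'
}
```

You would call it as snapshot_loaders <wls server pid> at each point where you want a measurement.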

Using this information, here’s what we are doing for WLP testing:

  1. Start the server, with the application undeployed.
  2. Run finalization and count the GenericClassLoader and ChangeAwareClassLoader instances. This gives us a baseline for the number of instances used by the server itself.
  3. Deploy the application and use it (i.e. wander through the portal in a browser)
  4. Count instances again.
  5. Redeploy the application, use it, and recount.
  6. Repeat the redeploy step several times.
  7. Undeploy the application.
  8. Count instances.

All these counts (and the [re-]deploy times) are being recorded and analyzed. The most important metric is that the first and last counts are the same: after the application is undeployed, the ClassLoaders used by the application should be gone. We are collecting the intermediate results because we expect they might be useful in failure analysis.
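The pass/fail check itself is tiny. Here is a sketch of the comparison between the baseline count (step 2) and the count after undeploy (step 8). The check_counts name is hypothetical and the numbers are made up for illustration; they are not real test results.

```shell
# Hypothetical helper: compare the baseline count with the count taken after
# undeploy, for one class. Returns non-zero if the two differ.
check_counts() {
    class_name=$1
    baseline=$2
    final=$3
    if [ "$final" -eq "$baseline" ]; then
        echo "OK $class_name: $final instances, same as baseline"
    else
        echo "LEAK? $class_name: baseline $baseline, after undeploy $final"
        return 1
    fi
}

# Illustrative values only.
check_counts GenericClassLoader 150 150
check_counts ChangeAwareClassLoader 12 15 || echo "test would be marked failed"
```

A non-zero return is enough for a test harness to flag the run. The intermediate counts do not affect the result, but they are worth logging next to it.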

The one last step is that we are running the JVM instrumented for YourKit, and configured to dump a memory snapshot on exit. So if the ClassLoader counts don’t look right, a developer will have the memory snapshot for analysis, without having to rerun the whole test suite.

Hopefully some of this information can be useful if you have leaks you are hunting.
