Dave Landers

Dave’s thoughts (such as they are)

Archive for the 'Tech' Category

Time Machine Woes

Yesterday, I decided it was time to move on. I deleted my bootable Tiger backup and turned on Time Machine. I like the fact that it promises to do hourly/daily/weekly incremental backups, and the Time Machine application, while heavy on the cheese, is a pretty nice way to browse backups and recover files.

But backups aren’t backups unless you test that you can recover files (and you know how to do it without having to mess around when the time comes). So today, with a day’s worth of backup history, I decided to play with Time Machine.

I flipped back and forth thru time-cheese, and after a few seconds the world disappeared. Oh wait, it was just my MacBook Pro rebooting. Bong. Apple. Spin wheel. Login.

Nothing in any of the logs (there wasn’t time) - just a hard, fast, shutdown.

So, I did what any self-respecting software engineer would do. I tried it again. Same result. I have now hard-rebooted my machine like 10 times via Time Machine (I don’t recommend it).

If recovering a file risks a hard shutdown, and I can’t resolve this - I will just have to go back to rsync.


Leopard impressions

I’ve had Leopard installed for several days now, and mostly it is good. Here are some delights and gripes and bugs and other thoughts.

Mail

Mail still doesn’t obey the Hide checkbox when it is started at login. Just what is so hard about this? I find this really annoying. My son did figure out a rather clever “cheat” - he starts Mail in a different Space, so it is not really hidden, but just looks that way (but I’m not using Spaces).

And Mail still doesn’t grok the age-old function “go to next unread message” (the shortcut for this should be “spacebar”).

On the plus side, Mail will finally display a proper count of unread messages in the Dock icon (rather than only counting messages in the inbox).

And in the bug arena, there is something seriously wrong. I occasionally see a popup from Mail saying it can’t identify the certificate for pop.gmail.com.

When I inspect the certificate, it is the one for my work email account. So something is seriously broken in Mail’s multiple-account settings or fetching.

Mail / RSS

I have never really gotten along with RSS in the browser - it never seemed right to me. So I thought having it in Mail was a good idea - it’s really where I’ve wanted most of my feeds all along. But Apple’s implementation is just not “there” yet. It would probably work for someone with just a handful of feeds, but not for me.

First, there is just no way to import a set of feeds from anywhere but Safari. I found these instructions and got my OPML feeds imported (via Firefox to Safari to Mail - ugh), but they lost their folders. It took me a while to figure out that you can create folders for feeds (you just create a new mailbox), and then I was on the way to rebuilding my folders, and importing feeds into them. But along the way I figured out that you can’t sort or arrange anything. So I was stuck thinking I’d have to rename my folders so they’d be “aNews”, “bTech”, “cBlogs”, etc. Gack. And things were just getting crowded in the sidebar - what with my inboxes and all my mail folders, there just wasn’t room for a dozen more folders for feeds.

So I came to my senses and went back to using the most excellent NetNewsWire.

Safari

I think the best thing in the new Safari is actually the new eye candy in the Find feature. I was always having trouble seeing the default (subtle) blue highlight. Apple has now made it easy to see the find results on any web page (regardless of color scheme).

The history menu also now has two new options: Reopen Last Closed Window and Reopen All Windows from Last Session. Fantastic for those of us who sometimes accidentally close or quit.

The other two things I like come with the debug menu. First, they moved Open Page With and User Agent from the bottom to the top of the menu. And the new Web Inspector is great (kinda like Firebug, but more Apple-y). I especially like the way it rolls up CSS styles and shows Metrics (margins/borders/padding/size).

Spotlight

Spotlight just seems to work better. First, there’s new stuff contributing - most notably Web History. Spotlight searches now include your Safari cache, so you can find that site you visited last week.

Also, they’ve made the “Top Hit” automatically selected (rather than messing around with the Command key). I can type, for example, C-Space, sys, Return and get System Preferences. It’s just better.

iCal

Well, finally you can set a default alarm for new events, although there’s no way that I can see to set the default sound to anything but Basso. And the default alarm doesn’t seem to apply to notifications created from Mail. So some progress here, but still a way to go.

Stacks

I actually like this. I used to keep folders in my Dock for Applications and Downloads, but this just works better. I don’t really like the Fan view, but since my Dock is on the side, all I can use is Grid anyway.

Quick Look

This is going to be really handy, especially now that I’m starting to learn the shortcut (Command-Y). It’s way faster than launching an app just to see what’s in the file. Next, I’m going to have to try a code syntax highlighter.

Random Issues

I have my desktop background set to display photos from a folder. Sometimes, first thing in the morning, my second monitor comes up with the default Leopard background. If I bring up preferences, it does think that it’s displaying the photos. If I wait a half-hour (when the photo is scheduled to change), it gets with the program.

I have noticed that the login dialog doesn’t get focus when I wake from sleep. So I type my password and nothing happens till I click in the login dialog. The worst part is that I sometimes have found bits (or all) of my password sitting in other dialogs or documents - which means those dialogs actually had keyboard focus while they were supposed to be locked out. Scary.

I am disappointed that WiFi with WPA is still not right. I have never been able to hold a connection to our work WAN for more than about 15 minutes (it had steadily grown from 5 to about 15 over several Tiger updates). Now, it seems we’re back to 5 minutes. My home network (Linksys) is not as bad as at work, but it does drop connections occasionally, too. I have no data, but it does seem worse than with the last update of Tiger. I never had these problems with my PowerBook.


Leak Testing

So, here we are. We have found lots of memory leaks, and drastically improved the redeploy experience for WLP 10.2 (and the fixes are being backported to previous releases).

However, this is not the first time we’ve been here. But it is hopefully the last.

We are putting in place a test environment to keep these leaks plugged. Actually, to be more precise, we’re adding some instrumentation to our integration test suite that drives tests via the IDE. So this test will behave just like a developer - although it’s a rather anti-social and uninteresting developer :).

Note that this technique is not restricted to WLP or redeployment testing, so hopefully you can find it useful for other situations.

Counting ClassLoader Instances

The basis of this technique is that we know that the leaks we are interested in happen to always hold on to the application or webapp ClassLoader. Further, we also know that these ClassLoaders are instances of GenericClassLoader (for the app) or ChangeAwareClassLoader (for the webapp).

So, our instrumentation relies on being able to count the instances of these classes. Fortunately, JRockit can do this easily, without any extra messing about with the startup settings or whatnot.

JRockit comes with this handy jrcmd command. It lets you interact with a running JRockit JVM, like for example the WLS server. You can use it to do all sorts of interesting things:

  • find out the JVM’s command line (how it was started)
  • print thread dumps
  • change the verbosity of the JVM on the fly (things like garbage collector stats)
  • trigger a garbage collection
  • print a GC report
  • print diagnostics about memory usage

All sorts of cool things - and none of these require you to change anything about your running process. No command line switches or anything messy like that.

So, we are looking for a count of the instances of our ClassLoader objects. We do that by running:

jrcmd <wls server pid> heap_diagnostics

(You get the wls server pid by running just jrcmd - it will print the process ids of running JRockit instances.)

If you run that, you’ll see tons and tons of output - including a listing of every class loaded by the system and how many instances there are. So all we need to do is search for the class we are interested in:

jrcmd <wls server pid> heap_diagnostics | grep '%.*GenericClassLoader'

That will grep for the one line we are interested in (turns out it contains a percent sign; why is not important here). We’ll get output something like this:

     0.0% 10k      150    +10k weblogic/utils/classloaders/GenericClassLoader

In the above, we see that there are 150 instances of the GenericClassLoader.
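
If you want a test to read that number rather than eyeball it, something like the following might do. This is only a sketch: HeapDiagnosticsLine is a made-up class name, and it assumes the five-column layout of the sample line above (percent, size, instance count, delta, class name), which may well differ between JRockit versions.

```java
import java.util.Optional;

/** Sketch: pull the instance count out of one line of heap_diagnostics output. */
class HeapDiagnosticsLine {

    /**
     * Assumes the column layout of the sample line in the post:
     * percent, size, instance count, delta, class name.
     * Returns empty if the line does not look like that.
     */
    static Optional<Integer> instanceCount(String line, String classSuffix) {
        String[] cols = line.trim().split("\\s+");
        if (cols.length != 5 || !cols[0].endsWith("%") || !cols[4].endsWith(classSuffix)) {
            return Optional.empty();
        }
        try {
            return Optional.of(Integer.parseInt(cols[2]));
        } catch (NumberFormatException e) {
            // The count column was not a plain integer; treat the line as unparseable.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String sample = "     0.0% 10k      150    +10k weblogic/utils/classloaders/GenericClassLoader";
        System.out.println(instanceCount(sample, "GenericClassLoader")); // Optional[150]
    }
}
```

Returning an empty Optional for anything unexpected means a format change shows up as "no count" rather than as a wrong count.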

To get a more accurate count, we also want to trigger a full garbage collection and finalization before counting anything. We can do that with jrcmd also:

jrcmd <wls server pid> runfinalization

Using this information, here’s what we are doing for WLP testing:

  1. Start the server, with the application undeployed.
  2. Run finalization and count the GenericClassLoader and ChangeAwareClassLoader instances. This gives us a baseline for the number of instances used by the server itself.
  3. Deploy the application and use it (i.e. wander through the portal in a browser)
  4. Count instances again.
  5. Redeploy the application, use it, and recount.
  6. Repeat the redeploy step several times.
  7. Undeploy the application.
  8. Count instances.

These counts (and the [re-]deploy times) are all being recorded and analyzed. The most important check is that the first and last counts are the same - after the application is undeployed, the ClassLoaders used by the application should be gone. We are collecting the intermediate results because we expect they might be useful in failure analysis.
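
The bookkeeping for that check is tiny. Here is a rough sketch of the idea, not our actual test code: ClassLoaderLeakCheck and its methods are invented for illustration, and the numbers in main are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: record ClassLoader instance counts and compare baseline to final. */
class ClassLoaderLeakCheck {
    private final List<Integer> counts = new ArrayList<>();

    /** Call once per step: baseline, after each (re)deploy, after undeploy. */
    void record(int instanceCount) {
        counts.add(instanceCount);
    }

    /** Final count minus baseline; anything other than zero deserves a look. */
    int leakedInstances() {
        if (counts.size() < 2) {
            throw new IllegalStateException("need a baseline and a final count");
        }
        return counts.get(counts.size() - 1) - counts.get(0);
    }

    boolean leaked() {
        return leakedInstances() != 0;
    }

    /** Intermediate counts, kept around for failure analysis. */
    List<Integer> history() {
        return new ArrayList<>(counts);
    }

    public static void main(String[] args) {
        ClassLoaderLeakCheck check = new ClassLoaderLeakCheck();
        for (int count : new int[] {150, 152, 153, 151}) { // made-up numbers
            check.record(count);
        }
        System.out.println("leaked=" + check.leaked() + " history=" + check.history());
    }
}
```

You would keep one of these per ClassLoader class being watched (one for GenericClassLoader, one for ChangeAwareClassLoader).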

The one last step is that we are running the JVM instrumented for YourKit, and configured to dump a memory snapshot on exit. So if the ClassLoader counts don’t look right, a developer will have the memory snapshot for analysis, without having to rerun the whole test suite.

Hopefully some of this information can be useful if you have leaks you are hunting.


Leopard Install Notes

I finally got around to upgrading my MacBook Pro to Leopard.

I generally take the opportunity of an OS upgrade to clean house, and I did so again with Leopard. So I made good backups, then did a clean install (not an upgrade). I then spent a few days figuring out exactly what I need from the backups and moving that over. This has several side-effects that I like. It cleans out the junk: old applications, prefs from apps I tried but no longer use, etc. It also gives me a fresh look at the new features, since they are not hidden by my old prefs or hacks.

Backups First

The first step is the backups. I have been using a script (fired from a calendar item) for a while now to back up (via rsync) some critical pieces of my home directory to the Linux box in my office. But that just wasn’t going to cut it for an OS upgrade. I wanted a full, bootable backup of the entire machine.

I got a USB drive and formatted it the same as my main drive (Mac OS Extended Journaled). Then, open Get Info on the drive and uncheck the Ignore Ownership on this drive option (at the bottom). Then, use rsync to do a full backup. This will take quite a while, and works best if you close everything (like Mail, etc.) first.

# turn off spotlight first
sudo mdutil -i off /Volumes/BackupDrive
# backup everything
sudo rsync --one-file-system --recursive --links --perms --times --group --owner --extended-attributes --verbose --progress --delete / /Volumes/BackupDrive
# turn off spotlight again, since we just dropped new index files over there
sudo mdutil -i off /Volumes/BackupDrive

I’m not sure how necessary the spotlight thing is, but I didn’t like all the backup versions of apps and stuff showing up in Spotlight searches.

Next, you need to make the drive bootable:

sudo bless --folder /Volumes/BackupDrive/System/Library/CoreServices

Great - now for the part that most people skip: Test the backup! I shut down the machine, then rebooted holding the Option key. I booted with the USB drive, and checked that everything looked OK. So now I know that if something goes drastically wrong with the upgrade, I can at least get some work done.

Planning

The next step, since I was doing a clean install, was to make a few notes. I scratched out a list of the files I needed, another list of apps and settings that I used every day, and another list of apps, settings, and other things that I wasn’t sure about. Then I spent a few minutes making sure I had links to installers, notes containing license keys, etc.

I also exported backups of things like Address Book, iCal, Safari bookmarks, Mail, and Keychain.

Install

I only encountered one hiccup in my plan. I knew I was going to do a full new install, but I also plan to do an upgrade or archive install at home (to preserve the sanity of my wife and kids). So I was going to try these first on my laptop, then blow everything away and do my clean install.

So I waited forever for the install DVD to verify (don’t need to do that again, thankfully), then proceeded with an upgrade install. But the progress bar was telling me it would take around 4 hours. I didn’t have 4 hours to blow on this experiment, so I bailed out and just went straight to the clean install.

ReCreate

The recreation of my “stuff” went pretty smoothly. I went through my checklist and started copying files, apps, and settings. I did this one at a time, just to keep things clean. For Mail, I did migrate over my old mail folders, accounts, and preferences - but then made sure to go check out the new prefs to see what’s new and what I might want to change.

Right away, I got Mail, Adium, iCal, NetNewsWire and Eclipse set up. Couldn’t wait on those. But other apps are being moved over at a slower pace. I don’t want them just because I used to use them - I want to make sure I really need them.

Impressions

Things seem a bit faster. I’m not sure if it’s just that I have not (yet) bogged the thing down with hacks, or if Leopard is actually a bit zippier. Or maybe it’s the new eye-candy acting as a sort of endorphin or whatever.

I’ll blog other impressions of new feature “likes and gripes” later.


Leaks We Plugged

In this post is a dump of most of the leaks that we have found and plugged for our upcoming 10.2 release. I present it here not so you can see exactly what was leaking or whatever, but rather to give me a format where I can make some notes about what we encountered, and how we fixed things. Hopefully, that might prove useful to someone.

Note that many of these leaks were never released, but were only present in internal revisions. Fixes for the ones that did make it out are being back-ported to patches for 10.0 and 9.2.

I’ve also kind-of-categorized them into a few groupings, where similar patterns have emerged. Some did not fit clearly into just one category, but the breakdown should help other leak-hunters in their quests.

Clean Up After Yourself!

This class of problems is caused by creating references to objects without clearing them when you are done with them. The Thread and ThreadLocal issues I discussed in earlier posts generally fall under this category.

Timers not released: If an app created a Timer (via commonj.timers.TimerManager), and the timer did not stop before the application undeployed, it was not cleaned up (and thus leaked). The culprit was a misused WeakHashMap. The Timers were stored like this:

WeakHashMap timers = new WeakHashMap();
...
timers.put(theTimer, theTimer);

The map was just being used as a Set, but storing the timer as the map value meant that every key was strongly referenced by its own value and so could never be cleared, thus losing the entire benefit of the weak keys in the first place (see the “Implementation note” in the WeakHashMap javadoc). The fix was simple:

timers.put(theTimer, null);
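
For what it’s worth, on Java 6 and later there is another way to get a weak Set that avoids this trap altogether. This is a sketch of that alternative, not what we shipped (our fix was the one-liner above); WeakTimerRegistry and its method names are made up for illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

/** Sketch: a Set with weak elements, so registered timers can still be collected. */
class WeakTimerRegistry {
    // newSetFromMap stores Boolean.TRUE as every value, so no value
    // ever points back at its key and the weak keys stay weak.
    private final Set<Object> timers =
            Collections.newSetFromMap(new WeakHashMap<Object, Boolean>());

    void register(Object timer) {
        timers.add(timer);
    }

    boolean isRegistered(Object timer) {
        return timers.contains(timer);
    }

    /** Approximate: entries vanish whenever the collector clears a key. */
    int size() {
        return timers.size();
    }

    public static void main(String[] args) {
        WeakTimerRegistry registry = new WeakTimerRegistry();
        Object timer = new Object(); // stand-in for a real timer object
        registry.register(timer);
        System.out.println(registry.isRegistered(timer)); // true
    }
}
```

The point is the same as the fix above: nothing stored in the map may hold a strong reference to a key, or the key never goes away.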

WSEE Runtime MBeans: WSEE (the web services stack) was creating MBeans to hold the configuration of handlers. They were not cleared when the application was undeployed, which in itself represents a minor leak. Unfortunately, the MBeans also (indirectly) held a reference to the Application’s ClassLoader - turning a small problem into a rather large one.

GroupSpace use of Beehive: GroupSpace was calling Beehive’s beginContext without a matching endContext. Behind the scenes, Beehive created some ThreadLocals, but it does the right thing and removes them when (if) you call endContext. This was fixed by balancing the context usage within a request.

Portal Framework service manager: An internal class in the System ClassLoader had a Map that was holding references to services to be used. One of the services was loaded by the Application’s ClassLoader. This was fixed by unregistering services when the application was undeployed. Lesson: Keep track of what you are using and let it go when it is no longer needed.

Lesson: The Garbage Collector is not your Mom - you have to clean up after yourself around here.

Hold Me Tight!

Where the above references were necessary but simply not cleaned up, this grouping is mostly caches that were holding unnecessary or extra information.

Log Buffer: WLS Logging keeps a buffer that holds the last 50 or so log messages for efficient display by console. But the implementation was holding the application-scoped logger objects (rather than simply holding the text it needed to display). The fix was to copy information from the application-scoped objects to generic system-scoped objects for the buffer.
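
To illustrate the shape of that fix (and only the shape - this is not the WLS logging code, and RecentLogBuffer is an invented name), a buffer that copies out the text instead of keeping the object might look like this:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch: keep the last N log lines as plain Strings, never the app's own objects. */
class RecentLogBuffer {
    private final int capacity;
    private final Deque<String> lines = new ArrayDeque<>();

    RecentLogBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Copies the text out of the record; the record itself is not retained. */
    synchronized void add(Object appScopedRecord) {
        lines.addLast(String.valueOf(appScopedRecord));
        if (lines.size() > capacity) {
            lines.removeFirst(); // drop the oldest line
        }
    }

    synchronized List<String> snapshot() {
        return new ArrayList<>(lines);
    }

    public static void main(String[] args) {
        RecentLogBuffer buffer = new RecentLogBuffer(2);
        buffer.add("first");
        buffer.add("second");
        buffer.add("third");
        System.out.println(buffer.snapshot()); // [second, third]
    }
}
```

Since String is loaded by the system, holding Strings in a system-scoped buffer does not pin any application ClassLoader.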

Resource initialization optimization: The ResourceBase class that is the base class for our Entitlement resources was keeping a collection of previously created objects. It then used data from these objects to initialize fields in new instances referring to the same resource. But our Entitlement resource objects were application-scoped, and thus this collection caused leaks. We fixed this by overriding the base object with our own intermediate implementation. We kept the caches, but wrapped the objects in Weak and Soft references. This is not an ideal solution, so we have planned to revisit the whole design in a future release.

BeanELResolver cache: The javax.el.BeanELResolver (used when EL is used in a JSP) caches bean properties in a static Map. Since these bean objects are generally webapp-scoped, there is a leak. We fixed it the same way Glassfish did: by clearing the map on undeploy. Since BeanELResolver does not have a public method to clear the map or uncache entries, this had to be done via reflection on the private map field (ugh). The interesting bit is that BeanELResolver does have methods to provide the close/clear function, but they are private (and never referenced).
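
The reflection trick looks roughly like the sketch below. To be clear about what is assumed: CachingResolver is a stand-in I made up so the example runs on its own, and its field name "cache" is hypothetical. It is not the real BeanELResolver, and I am not quoting the real field name here.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for a library class with a private static cache; NOT the real BeanELResolver. */
class CachingResolver {
    private static final Map<Class<?>, String> cache = new ConcurrentHashMap<>();

    static void remember(Class<?> type) {
        cache.put(type, type.getName());
    }

    static int cachedEntries() {
        return cache.size();
    }
}

/** Sketch: empty a private static Map on another class via reflection. */
class StaticCacheCleaner {
    static void clearStaticMap(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);      // bypass the private modifier
            Object value = field.get(null); // null target because the field is static
            if (value instanceof Map) {
                ((Map<?, ?>) value).clear();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "could not clear " + owner.getName() + "." + fieldName, e);
        }
    }

    public static void main(String[] args) {
        CachingResolver.remember(String.class);
        clearStaticMap(CachingResolver.class, "cache");
        System.out.println(CachingResolver.cachedEntries()); // 0
    }
}
```

The obvious downside is that this breaks silently if the library renames the field, which is a big part of the "ugh".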

Commons Logging in System Classpath: Commons Logging holds a map of loggers, which it may load via the application or webapp ClassLoader. Thus simply having the commons logging jar in your System ClassPath will trigger a leak (if logging is used from an application - as it is with Beehive). For us, the jar was added to the system classloader inadvertently (via jar manifest entries), so the fix was to simply remove it. However, we really can’t do anything about it if a customer decides to stick commons logging in their classpath.

Glassfish JAXB: The Glassfish implementation of JAXB caches some reflection lookups. It has a WeakHashMap with Class objects as keys. Good try, however the values in that map are Constructor objects, which themselves hold a reference to the Class (which therefore means the keys are strongly held, and the Map’s weak keys can never be released). We fixed this in our local version by wrapping the Constructor in a WeakReference. We filed the issue with Sun, and they adopted our fix in version 2.1.6.
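
Here is a minimal sketch of that pattern, assuming nothing about the actual JAXB source: ConstructorCache and noArgConstructor are names I invented for the example.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.Constructor;
import java.util.Map;
import java.util.WeakHashMap;

/** Sketch: cache Constructors per Class without pinning the Class (or its ClassLoader). */
class ConstructorCache {
    // Weak keys AND weakly held values: a Constructor references its Class,
    // so a strongly held value would keep its own key alive forever.
    private final Map<Class<?>, WeakReference<Constructor<?>>> cache = new WeakHashMap<>();

    synchronized Constructor<?> noArgConstructor(Class<?> type) {
        WeakReference<Constructor<?>> ref = cache.get(type);
        Constructor<?> ctor = (ref == null) ? null : ref.get();
        if (ctor == null) {
            // First lookup, or the collector cleared the reference: look it up again.
            try {
                ctor = type.getDeclaredConstructor();
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException(type + " has no no-arg constructor", e);
            }
            cache.put(type, new WeakReference<Constructor<?>>(ctor));
        }
        return ctor;
    }

    public static void main(String[] args) {
        ConstructorCache cache = new ConstructorCache();
        System.out.println(cache.noArgConstructor(StringBuilder.class).getDeclaringClass());
    }
}
```

The trade-off is that a weakly held Constructor can be cleared at any garbage collection, so the cache may have to redo the lookup more often than a strong cache would.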

LeaseManager holding a TimerManager: The LeaseManager (a new internal class) was holding a static reference to an app-scoped commonj.timers.TimerManager. This one was fun because it required tripping a race condition in order to trigger the leak. Also, in addition to being a leak, this was an error because the static reference tied the LeaseManager to a single deployed instance of a single application, so it would not work properly with multiple applications. We fixed this by looking up the TimerManager as needed (from JNDI, like you’re supposed to do). Yet another example of premature optimization (i.e. presuming that the JNDI lookup was too slow).

Lesson: You gotta know when to hold ‘em; Know when to fold ‘em.

The Ties that Bind

I’ve covered the Thread and ThreadLocal issues previously, but here are the details to complete the list.

ThreadLocals in Portal Framework: Some of the Framework JDBC code was using ThreadLocal but not cleaning up at end of request. This was fixed by simply removing the ThreadLocal (in a finally block) at the end of the request.
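
The pattern is worth spelling out, since it is so easy to get wrong. A minimal sketch follows; RequestScopedConnection is an invented name, and a String stands in for whatever the real code kept in the ThreadLocal.

```java
import java.util.function.Supplier;

/** Sketch: a ThreadLocal that is always removed when the request ends. */
class RequestScopedConnection {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Runs the work with the value bound to this thread, then always unbinds it. */
    static <T> T withConnection(String connection, Supplier<T> work) {
        CURRENT.set(connection);
        try {
            return work.get();
        } finally {
            // Runs on success and on exception, so a pooled thread
            // never carries the value into the next request.
            CURRENT.remove();
        }
    }

    static String current() {
        return CURRENT.get();
    }

    public static void main(String[] args) {
        String seen = withConnection("conn-1", () -> current());
        System.out.println(seen + " / " + current()); // conn-1 / null
    }
}
```

Server threads are pooled and live far longer than any one application, which is why a value left behind in a ThreadLocal turns into a leak.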

new Threads in JSP compiler: See the previous post. It turned out that the thread was unnecessary: a leftover from when the code was ported to the server from a different environment. So it was just removed.

ThreadLocals in Content Management: Content was using ThreadLocal and not removing them after a request. The ThreadLocals were added as a (possibly unnecessary?) performance optimization, and the fix was simply to remove them.

Lesson: Don’t let your code get all tied up in knots - unravel those threads.

Final Thoughts

If you’ve taken the time to look at these issues, you can see that the problems spanned multiple subsystems: WLP, WLS, WLW, Apache, and Sun. It was quite a project, and quite a fun challenge to track all these down.

Next, I’ll outline how we intend to keep this from happening again, and show you some tricks for your own memory leak testing.
