Murphy goes online: disaster recovery's the target

I know you’re familiar with Murphy’s Law. When I was in the Army, we were always concerned about “ol’ Murphy.” Rest assured, the folks at American Eagle Outfitters now know more about Murphy than they ever wanted to, and their first-hand experience provides a valuable lesson for the rest of us.

American Eagle’s e-commerce website was down for eight days, according to StorefrontBacktalk.com, in what the editor called “complete website death,” while the company struggled to recover data from crashed servers.

It wasn’t like American Eagle was trying to cut corners, mind you. According to StorefrontBacktalk.com, it had outsourced much of its web operations to IBM, and it was running IBM and Oracle software and hardware. But the company apparently suffered through a “perfect storm.” Up to a point, anyway.

American Eagle’s storage drive went down, followed shortly by the secondary backup drive. The Oracle backup utility software worked, kinda: StorefrontBacktalk.com quotes a source who says the software was restoring one gigabyte per hour. Which is okay, I guess, but they needed to restore 400GB. The bigger problem was that American Eagle’s disaster recovery site wasn’t ready to go. The source is quoted as saying, “They apparently could not get the active logs rolling in the disaster recovery site. I know they were supposed to have completed it with Oracle Data Guard, but apparently it must have fallen off the priority list in the past few months and it was not there when needed.”
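To put those two numbers side by side, here is a quick back-of-the-envelope sketch. It only does the division on the figures quoted above (400GB at one gigabyte per hour); the function name is my own, and it assumes the restore rate stays constant, which a real restore may not.

```python
def restore_hours(data_gb: float, rate_gb_per_hour: float) -> float:
    """Estimate hours needed to restore data at a constant rate (an assumption)."""
    if rate_gb_per_hour <= 0:
        raise ValueError("restore rate must be positive")
    return data_gb / rate_gb_per_hour


# Figures quoted in the StorefrontBacktalk.com report: 400GB at 1GB per hour.
hours = restore_hours(400, 1)
days = hours / 24
print(f"{hours:.0f} hours (~{days:.1f} days)")  # prints "400 hours (~16.7 days)"
```

Run the same arithmetic against your own data volume and measured restore rate, and you'll know whether a restore from backup alone can meet your recovery targets.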

The lesson is painfully obvious. Always assume the worst will happen, which is exactly why “Murphy” was such a popular term among soldiers in the Army. Work with your team, whether in-house or outsourced, to figure out how you’ll get back in business when it does. Above all, test, test, and test again. If American Eagle had done this, it would have discovered the DR site wasn’t up to speed. That lack of preparation apparently cost American Eagle big time.

One other thing: it’s a commonly held precept in our business that a DR plan that sits on the shelf is potentially worse than no DR plan at all, because it instills a false sense of security. Things constantly change; servers are added or removed, and data loads grow. Keep testing. If you’re outsourcing your DR and storage to a third party, make sure that regular testing is written into the SLA.

So, American Eagle Outfitters is back up and running, and that’s a good thing. Unfortunately, the company has also learned the hard way that “if anything can go wrong, it will, at the worst possible moment.”