The New Server Virtualization Imperative for 2010: Application Consistent Recovery with Low Overhead

By: Jerome M. Wendt

Tuesday, January 5, 2010

Server virtualization was one of the hot technology trends in 2009, and there is every reason to believe it will remain so in 2010. But as this trend broadens to include the virtualization of mission-critical applications like Microsoft Exchange and SQL Server, new considerations come into play. Most notably, organizations must identify a data protection solution that can deliver application-consistent recovery points, bring applications back online quickly, and do both without degrading the performance of the physical host.

Gartner Inc. recently commented that it expects fully 50% of workloads to run inside virtual machines by 2012, which represents nearly 58 million deployed machines. As this trend accelerates, it is only logical to assume that more mission-critical applications such as Microsoft Exchange and SQL Server are bound to be virtualized.

A recent article on FedTech lends credence to this conclusion. Departments within government organizations such as the Air Force, Navy and Federal Deposit Insurance Corp. (FDIC) are already in the midst of virtualizing applications like Exchange. Reduced hardware costs, less floor space, and lower power costs, coupled with Microsoft's increased willingness to support Exchange on Microsoft Hyper-V and other virtualization platforms, mean applications like Exchange are ripe for virtualization in the next few years.

As more mission-critical applications migrate to virtual environments, rapid and reliable recoveries become more important. Virtualizing these applications may create new data protection and disaster recovery (DR) challenges that organizations did not fully take into account beforehand. Consider:

  • Most physical machines, particularly Windows and Linux servers, operate at utilization rates in the 20-35% range.
  • Once consolidated and virtualized, they can run at utilization rates that approach 85% or greater.
  • This higher utilization rate leaves fewer resources to run performance-intensive applications like backup.
  • Traditional backup approaches can consume 20% or more of the physical server's available resources.
  • This problem is compounded if backup jobs on multiple virtual machines (VMs) kick off at the same time.
  • Hypervisor-level APIs like VMware Consolidated Backup (VCB) and vStorage are intended to provide low overhead data protection approaches appropriate for virtual machine environments.
  • Hypervisor-level APIs lack an application consistency mechanism and can produce only crash-consistent snapshots.
  • Crash-consistent snapshots are not application-consistent and therefore introduce two problems: they may not lead to reliable data recovery and, even when they do, recoveries take longer than they would from application-consistent snapshots.
  • Application-consistent snapshots are desirable for protecting and recovering applications in virtual machine environments, just as they are in physical environments and for the same reason: they lead to faster, more reliable recovery.
  • Organizational expectations for near real-time application recoveries (30 minutes or less) are on the rise.
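
To make the headroom arithmetic in the list above concrete, here is a minimal Python sketch. It rests on a deliberately simple assumption that does not come from the article or from any vendor tool: backup overhead is counted in percentage points of total host capacity and adds linearly for each backup job that runs at the same time. The function name spare_capacity_pct and the concurrent_jobs parameter are hypothetical and used only for illustration.

```python
def spare_capacity_pct(host_utilization_pct, backup_overhead_pct, concurrent_jobs=1):
    """Percentage points of host capacity left once backup jobs start.

    Assumption (not from the article): overhead is additive and each
    concurrent backup job costs the same share of total host capacity.
    A negative result means the host is oversubscribed.
    """
    if concurrent_jobs < 0:
        raise ValueError("concurrent_jobs must be non-negative")
    backup_demand = backup_overhead_pct * concurrent_jobs
    return 100 - host_utilization_pct - backup_demand


if __name__ == "__main__":
    # Figures from the list above: 35% and 85% utilization, 20% backup overhead.
    print(spare_capacity_pct(35, 20))  # lightly loaded physical server
    print(spare_capacity_pct(85, 20))  # consolidated virtualization host
    print(spare_capacity_pct(85, 20, concurrent_jobs=2))  # two backups at once
```

Under these assumptions, a physical server at 35% utilization keeps 45 points of capacity spare during a backup, while a consolidated host at 85% ends up 5 points short with one backup job and 25 points short with two. Real hosts will not behave this linearly, so treat the numbers as a rough planning aid rather than a measurement.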

Organizations should not underestimate the growing intolerance that their customers, internal or external, have for outages of any length, especially when a mission-critical application like Microsoft Exchange or SQL Server is concerned. While they can certainly count on some goodwill and understanding among their end users should a disaster strike, a recent Applied Research study quantified just how much goodwill they should expect and found that only about 60% of them will tolerate an extended outage.
While the exact definition of an "extended outage" is elusive, most will agree that 24 hours qualifies, and it is safe to say that for mission-critical applications any outage over 30 minutes probably fits the definition as well. What those responsible for delivering data protection and DR services for these applications should find disconcerting is that, based on the real-world feedback this study gathered, 40% of customers who experienced an "extended outage" left their provider in favor of one that was not having the same problems.

For these reasons, organizations should seek out a solution like InMage that provides application-consistent recovery points in a manner compatible with the requirements of virtual machine environments. InMage integrates with native application snapshot APIs, using a very low overhead filter driver to drive the creation and marking of application-consistent recovery points, and works in exactly the same way across both physical and virtual machines. This simplifies data protection operations by providing a consistent set of processes to manage recovery across the entire enterprise.

Blog Services by DCIGInc.com