Frequently Asked Questions

What really differentiates the InMage solution from other DR solutions?

InMage Scout is the only combination DR, backup elimination, and application recovery solution on the market that supports WAN-efficient block-based replication. Other features we've heard customers repeatedly comment on include its non-disruptive deployment (works with what you have), low impact on production servers (our "host off-load" design), and its ability to minimize (and in some cases to completely dispense with) tape-based infrastructure and its associated problems and operator involvement.

Is InMage a multi-snapshot solution?

No, and this is a misnomer that carries significant implications with it. Multi-snapshot solutions save snapshots on a regular schedule, say one every hour. Any data between those snapshots is not available for recovery. With a multi-snapshot solution you decide ahead of time what snapshots you will save, thereby determining your recovery granularity.

What InMage does is very different. It doesn't save "snapshots"; it continuously saves all the data and metadata necessary to retroactively create any recovery point if and when that point is needed. This approach ensures that you will always have the optimum recovery point available to meet any administrative requirement, whether that is recovery, maintenance, reporting, etc. It takes up only slightly more space than multi-snapshot solutions, but provides significantly better recovery granularity. Note also that the ability to create multiple disk-based images around a particular point in time offers invaluable advantages in root cause analysis that multi-snapshot products cannot match.

What are AppShots and how do I use them?

Application-consistent recovery points are marked in the data streams that InMage collects. These points can be retroactively selected and immediately turned into disk-based copies of production data that can be mounted on recovery server targets. AppShots are most often used for recovery purposes since they result in fast, reliable recoveries, but they can also be used to time-shift administrative operations associated with data migration, business intelligence, reporting, test, and/or development activities so as to minimize impacts on business operations and optimize administrator productivity.

Can I really "roll back" to any previous point in time?

When setting up InMage, the administrator defines a retention period. This retention period defines how long the collected data streams will be retained. Longer retention periods require larger retention logs and hence more storage capacity on the recovery target(s), but because it basically retains the initial baseline plus only changes, the retention logs do not grow much over time. Once data ages out of the retention period, it is discarded and cannot be used in recovery point selection. For most of our customers, this retention period varies between several days and several weeks.
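
As a rough illustration of how a retention period bounds the available recovery points, consider the following Python sketch. It is not InMage code: the Write record, the prune function, and the seven-day window are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Write:
    """One captured write in a hypothetical retention log."""
    timestamp: datetime
    block: int
    data: bytes


def prune(log: list[Write], now: datetime, retention: timedelta) -> list[Write]:
    """Keep only the writes that are still inside the retention period."""
    cutoff = now - retention
    return [w for w in log if w.timestamp >= cutoff]


now = datetime(2024, 1, 15, 12, 0)
log = [
    Write(datetime(2024, 1, 1, 9, 0), block=7, data=b"old"),
    Write(datetime(2024, 1, 10, 9, 0), block=7, data=b"newer"),
    Write(datetime(2024, 1, 15, 11, 59), block=3, data=b"latest"),
]

# With a seven-day retention period, the 1 January write has aged out
# and can no longer be chosen as a recovery point.
kept = prune(log, now, retention=timedelta(days=7))
```

A longer retention value keeps more entries in the log, which is the storage trade-off described above.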

Any point at which a write occurred is a viable recovery point that will result in a reliable recovery (i.e. any available recovery point is crash-consistent). Some points, like AppShots, will result in faster recoveries at the application level, but both application-consistent and crash-consistent recovery points result in comparably reliable recoveries. When administrators are selecting a recovery point, the most recent point will result in the least amount of data loss, but if this is not an AppShot then it will take longer to recover. Recovering from the most recent AppShot will result in the fastest recovery, but may result in some data loss if it is not the most recent point in the data stream.
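
The trade-off between the most recent point and the most recent AppShot can be sketched as follows. This is a simplified model under stated assumptions: RecoveryPoint, latest_point, and latest_appshot are hypothetical names, and timestamps are plain seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryPoint:
    """A point in a hypothetical retention log (timestamp in seconds)."""
    timestamp: float
    app_consistent: bool  # True for an AppShot, False for crash-consistent


def latest_point(points):
    """Least data loss: the most recent point of any kind."""
    return max(points, key=lambda p: p.timestamp)


def latest_appshot(points):
    """Fastest application recovery: the most recent AppShot, if any."""
    appshots = [p for p in points if p.app_consistent]
    return max(appshots, key=lambda p: p.timestamp) if appshots else None


points = [
    RecoveryPoint(100.0, app_consistent=True),
    RecoveryPoint(160.0, app_consistent=False),
    RecoveryPoint(200.0, app_consistent=True),
    RecoveryPoint(245.0, app_consistent=False),
]

newest = latest_point(points)
appshot = latest_appshot(points)

# Writes made after the chosen AppShot are what would be given up
# in exchange for the faster application-level recovery.
exposure_seconds = newest.timestamp - appshot.timestamp
```

In this example the newest point is crash-consistent, so an administrator would weigh 45 seconds of additional data loss against a quicker application restart.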

Is InMage Scout a backup replacement?

No, InMage Scout works with any backup software to specifically address backup window, RPO, RTO, and recovery reliability issues (based on tape recoveries) while still allowing you to dump data to tape if and when you want. It optimizes recovery capabilities while eliminating backups as a discrete operation, all while allowing you to continue to use your backup software as much or as little as you want. For enterprises that may only need to retain backup data for 30-60 days, tape operations may be minimized almost to the point of non-existence, while those that have longer retention periods may want to migrate data from disk to tape (using existing backup software), after the data ages past a certain point, for more cost-effective long-term storage. InMage Scout effectively front-ends the existing tape-based backup infrastructure, handling the backup and restore tasks for which tape is not well suited (daily backups, daily restores), but allows backup sets to be migrated to tape whenever desired.

Unique Disaster Recovery Technology: Enable Recovery within Minutes

InMage uses a unique, hybrid recovery technology that provides for granular recovery capabilities that can meet the most stringent RPO/RTO requirements while completely eliminating backups as a discrete operation that impacts business operations. InMage deployments can be configured to support long distance disaster recovery (DR) requirements as well as local recovery requirements to meet daily restore requests. InMage foundation technologies include continuous data protection (CDP), asynchronous replication, application failover/failback, and WAN optimization as well as other application-aware integration points used in our Application Solutions products, and leverage disk as the underlying storage infrastructure to provide fast, reliable recovery.

What sets InMage apart from other vendors in the disk-based recovery space is the comprehensiveness of our solution. Other vendors with solutions in the DR and backup space focus on providing data recovery, leaving it up to administrators to get applications back up and running. Data recovery provides a strong foundation, but alone is not sufficient to meet comprehensive recovery needs. Data is only useful when applications that can use that data are available, and application recovery must be addressed as well to ensure continued business operations. Application recovery generally requires human intervention, further adding to infrastructure complexity, cost, and recovery risk. InMage provides a single platform that not only provides for granular data recovery in remote or local sites, but also offers automated application recovery for any application. By going this extra step, InMage covers the entire range of infrastructure recovery requirements that are associated with a particular application service.

InMage's continuous network bandwidth usage

InMage's unique architecture supports non-disruptive deployment into existing environments. Deployment requirements can be scoped out ahead of time using our I/O profiler to determine your I/O change rates per server and exactly how much bandwidth and storage will be required to meet defined recovery requirements. This leads to reliable deployments with no surprises where you know exactly what RPO/RTO requirements you can support at what cost before you deploy. At initial installation, each production server that will be protected only has to be rebooted, and there is no need to reformat or migrate existing data. The main InMage Scout component, the CX, deploys in the network as an out-of-band component whose failure and/or replacement does not impact production operations in any way. Network outages are handled transparently without requiring an expensive data resynchronization operation.
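
To give a feel for the kind of arithmetic that sizing from a measured change rate involves, here is a back-of-the-envelope Python sketch. It is not the InMage I/O profiler: the function names, the flat average over 24 hours, the compression ratio, and the worst-case assumption that every day's changes are retained in full are all assumptions made for illustration.

```python
SECONDS_PER_DAY = 86_400


def required_bandwidth_mbps(changed_gb_per_day: float,
                            compression_ratio: float = 1.0) -> float:
    """Average link rate (megabits/s) needed to keep up with daily changes."""
    bits_per_day = changed_gb_per_day * 1e9 * 8 / compression_ratio
    return bits_per_day / SECONDS_PER_DAY / 1e6


def retention_storage_gb(baseline_gb: float,
                         changed_gb_per_day: float,
                         retention_days: int) -> float:
    """Upper-bound target capacity: baseline copy plus all retained changes."""
    return baseline_gb + changed_gb_per_day * retention_days


# Hypothetical server: 500 GB protected, 50 GB of changes per day,
# 2:1 compression on the wire, 14-day retention period.
bandwidth = required_bandwidth_mbps(50, compression_ratio=2.0)
storage = retention_storage_gb(500, 50, retention_days=14)
```

A real profile would also need to account for peaks in the change rate rather than a daily average, which is why measuring per-server I/O before deployment matters.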

How The InMage Technology Works

Figure 2: InMage's continuous data protection, replication, and failover/failback

Although most InMage deployments are protecting several servers and applications, to understand how our technology works let's take a look at a basic single server configuration. At the local site is a production server (also called the "source" server) that you want to protect. Storage attached to this server can be DAS, SAN, or NAS, including any flavors of SCSI, iSCSI, and/or FC. You'll also have another server running the same operating system which will act as the target, and it can also be using any type of storage as long as it has at least as much storage on it as the source server you want to protect. InMage supports heterogeneous servers, running Windows, Linux, or Unix, heterogeneous storage (any type, any vendor), and can collect/replicate data at the block or file level, depending on how it is configured. Because of the advantages of block-based replication, most InMage deployments use it instead of file-based replication.

Recovery point selection

A basic InMage install requires that a filter driver, called a data tap, be installed on each source and target server. The data tap is a small component, much lighter weight than backup or host-based replication agents, that takes up very little CPU on the source. It asynchronously sends writes to the InMage CX as they occur. The CX is an Intel-based two-CPU server used to run the InMage Scout software, and it can be deployed on the same local network as the production server. Most of the processing that is done by conventional agents is offloaded to and performed by the CX, which gives InMage a significantly lower host impact than conventional replication and backup agents. InMage uses the CX to add compelling functionality to the solution, including not only fault management but various policies that support our granular recovery capabilities, functionality that minimizes bandwidth requirements such as TCP optimization and compression, encryption that provides security for data in-flight, and I/O profiling.

At initial installation, a baseline copy of the storage at the local site that you want to protect is created at the target location, but thereafter InMage only captures and sends changes to that data from the source to the target. This fact, coupled with other technologies mentioned above that minimize the amount of data that has to be sent to the target to enable application and data recovery operations, provides an interesting alternative to data deduplication technologies. InMage effectively achieves the same goal as deduplication technologies, which is to minimize the amount of storage capacity required to move or store a given information set. If deduplication technologies are deployed at InMage target locations, they will result in very limited data reduction ratios because there is not much redundancy in the data we are moving; we have already taken it out.

Note that InMage uses CDP technology to capture data at the local site. CDP technology captures writes in real time and labels them with different types of markers so that it is easy to refer to previous points in the captured data stream. Each write is time-stamped, but administrators can set up policies to insert all types of markers, including ones that mark application-consistent recovery points (what InMage calls AppShots), useful points in the maintenance cycle such as pre- or post-patch points, or relevant business process events such as a quarterly close. The annotated data stream is stored in the retention log (shown attached to the recovery target), and the retention period can be defined by the administrator. Administrators can refer to this data stream, retroactively select any desired point or points, and immediately create a disk-based image of the selected point to use for recovery or other administrative tasks. These images can be read only or read/write, virtual or physical images, depending on how they are defined by the administrator at creation. Image creation is completely de-coupled from production servers, yet gives rapid access to even the most recent production data sets.
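
The general idea of rebuilding an image from a baseline plus a time-stamped, annotated stream of writes can be sketched in Python. This is a generic CDP illustration, not InMage's on-disk format or API: JournalEntry, image_at, and marked_points are hypothetical names, and attaching a marker directly to a write entry is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JournalEntry:
    """One time-stamped write in a hypothetical retention log."""
    timestamp: float
    block: int
    data: bytes
    marker: Optional[str] = None  # e.g. "AppShot" or "pre-patch"


def image_at(baseline: dict, journal: list, point_in_time: float) -> dict:
    """Rebuild the volume as of point_in_time: baseline plus earlier writes."""
    image = dict(baseline)  # never modify the baseline itself
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > point_in_time:
            break
        image[entry.block] = entry.data
    return image


def marked_points(journal: list, marker: str) -> list:
    """Timestamps of all entries carrying the given marker."""
    return [e.timestamp for e in journal if e.marker == marker]


baseline = {0: b"A0", 1: b"B0"}
journal = [
    JournalEntry(10.0, block=0, data=b"A1"),
    JournalEntry(20.0, block=1, data=b"B1", marker="AppShot"),
    JournalEntry(30.0, block=0, data=b"A2"),
]

appshots = marked_points(journal, "AppShot")
image = image_at(baseline, journal, appshots[-1])
```

Because the image is built from a copy of the baseline and the log, nothing in this process touches the production server, and any timestamp within the log can be chosen after the fact.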

If a selected point in time represents an application-consistent point, then it is referred to as an AppShot. InMage uses application-specific APIs, such as Windows VSS, Oracle RMAN, etc., to mark application-consistent points in the data stream. AppShots are most often used for recovery because they support the shortest RTOs, but InMage can reliably re-create any previous point in time, regardless of whether it is application-consistent (like an AppShot) or crash-consistent (any other point in the data stream). Crash-consistent recovery points support reliable recovery, but will generally not support recovery that is as fast as that supported by AppShots. Still, crash-consistent recovery points can offer real value during root cause analysis to determine the cause of various failures.

InMage uses asynchronous, IP-based replication to move data from the CX to recovery targets. This network hop is often across a wide area network, so features like WAN optimization, bandwidth shaping and encryption become very important. Asynchronous replication supports DR configurations without distance limitations, yet can meet very stringent RPOs for remote or local site restores, minimizing data loss on recovery.

Note that this is a unique design that borrows elements of both host-based and appliance-based replication products, yet is superior to both. It is much more scalable in supporting larger numbers of servers and higher storage capacities than host-based replication, and imposes only a fraction of the overhead of those products on source servers while offering richer functionality. Like appliance-based solutions, InMage supports heterogeneous servers and storage. Unlike appliance-based solutions, InMage is a software-based solution that can be deployed on any Intel-based server, providing the flexibility to re-deploy existing equipment or the freedom to purchase any new equipment (servers or storage) without limitation. Note that InMage does not require a SAN, although it can be used in SAN as well as any type of storage environment. Note also that the CX is an out-of-band component that does not impact production operations in any way if it fails or is replaced online, so it does not have to be deployed in expensive server pairs like many appliance products.

InMage's simultaneous recovery foundation for multiple applications

The CX supports the use of local and/or remote targets. Setting up a local target allows disk-based recovery operations to occur at the local site, while the remote targets support DR requirements.

InMage supports a web browser-based management GUI that allows all management operations for both application and data recovery across different production servers and applications to be tracked and managed using a common management paradigm. A command line interface is available as well. Management capabilities are protected through the use of a multi-level security model.

Application Recovery

If application availability is a concern, you may have already looked at shared disk clustering products. Generally, these products add complexity and cost to existing configurations, not to mention additional license fees that may include a second application license. Clustering is yet another product that would have to be added to conventional replication and backup products to create the single platform solution you get with InMage.

InMage has productized failover/failback for a number of key enterprise applications, including Microsoft Exchange, SQL, and SharePoint as well as Oracle, MySQL, BlackBerry Server, and SAP. This approach is far preferable to manual recovery, which is time-consuming, very dependent upon operator expertise, and can be fraught with risk. The InMage solutions are developed, supported, and documented by InMage on an ongoing basis, and are proven reliable through testing we do with each successive release. They provide rapid, reliable recovery using AppShots (or any selected recovery point), and are comprehensive in the sense that they not only bring the application back up, but can also make recovery transparent from the client side by updating Active Directory or DNS entries during failover/failback.

There are features that make our application recovery different from conventional clustering products:

  • InMage bases its application recovery on a shared-nothing model which can easily support either remote application failover (a very complex proposition with most conventional clustering products) or local application failover
  • Application failover/failback uses known good data states for recovery, thereby resolving failures that are caused by data corruption problems and giving the flexibility (not available with conventional clustering) to recover from any recovery point within the retention period, if so desired, instead of just the one option; if AppShots are used for failover (which is the common case) we can also recover much faster than conventional clustering products which are effectively recovering from crash-consistent (not application-consistent) points
  • Application failover capabilities come as part of our base recovery solution and are managed through the centralized management interface that you can also use to manage remote (DR) and local (backup) recovery capabilities

While we have productized failover/failback capabilities for the key enterprise applications, the same application recovery template can be used by customers to provide these capabilities for literally any application.

Get A Handle on Your I/O Load

Because the CX is a central component that effectively sees the data streams generated by multiple servers, it provides a great location to monitor and track how I/O loads may change over the course of a day or over the course of time. The InMage Management GUI can be used to track I/O at the system level, which can provide excellent input for capacity planning purposes.

InMage has implemented a mechanism which can calculate the best-case RPO for a given application at any time based on this I/O tracking capability. This gives administrators direct, real time feedback concerning their recovery capabilities, helping to set expectations appropriately before a recovery starts. This is another unique capability that is not available in other replication, backup, or conventional clustering products.
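
The source does not describe how this calculation is performed. One plausible way to think about a best-case RPO, shown here purely as an assumption, is the time needed to drain the data that has been captured but not yet delivered to the target. The function name and the figures below are hypothetical.

```python
def best_case_rpo_seconds(pending_bytes: int,
                          throughput_bytes_per_s: float) -> float:
    """Time to deliver all not-yet-replicated data at the current throughput.

    Assumption for illustration: the recovery point at the target lags the
    source by however long the pending backlog takes to drain.
    """
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return pending_bytes / throughput_bytes_per_s


# Hypothetical reading: 150 MB waiting to be sent over a link that is
# currently sustaining 5 MB/s.
rpo = best_case_rpo_seconds(150_000_000, 5_000_000)
```

Under these assumed numbers the target would be about 30 seconds behind the source, which is the kind of figure an administrator could check before starting a recovery.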

How InMage Eliminates Backups

Today, backup is a discrete operation that runs at a scheduled time. It collects all the changes since the last backup and writes them to a backup target across the network. During the backup operation, the application being backed up can either be shut down (rare today, since this would impact business operations too much) or be backed up while it is running (known as online backup). Online backup can result in noticeable degradations to production application performance, making it difficult to find the right time to perform a backup. There are several key problems with this "periodic" approach to backup:

  • With 24x7 operations becoming the norm, there is less and less time available to perform backups
  • Larger applications and higher change rates complicate backups because they are trying to push more data across networks that may already be over-subscribed
  • Backup granularity (once a day) is staying the same, but with increased business operations driving high data growth, using that one recovery point means more lost data for each recovery

Using disk with conventional backup technologies improves things a little, but ultimately will run into the same problem that exists with tape: you can't get everything backed up within the available windows. The next generation of data protection solutions will move away from this notion of backup as a discrete operation and towards continuously capturing changes from applications throughout the day. This is the new paradigm that is necessary to provide data protection as data sets grow to hundreds of terabytes and beyond.

InMage customers have already moved to this new paradigm. We are dribbling writes as they occur across the network 24 hours a day, imposing no "backup" impact on production applications and using a minuscule amount of network bandwidth on a real-time basis. And because we are collecting this data using CDP technology, we improve the recovery metrics relative to conventional technologies. We offer better RPOs than discrete methods because data is available for recovery within seconds of its creation (or a few minutes at the most if you are talking about recovery at the remote site). Because we are recovering directly from disk, we offer very low RTOs, and this disk-based recovery also provides for improved recovery reliability relative to tape-based infrastructures. We are in effect eliminating backups while significantly improving recovery capabilities.