Post number 11 in a series of 12 from one of our provider partners, NTT.
Disaster Recovery (DR) is like insurance. You consider how much you really need, pay for it, and hope you never have to use it. Some organizations are willing to risk not having a DR plan. This is typically not an option if you are a publicly traded company. Others look at DR and get the bare minimum needed to keep the business up and running. Others want an exact replica of what they have with an almost immediate ability to recover everything in their environment. If you are a large organization you probably have some sort of DR plan in place. The big question then becomes how good is the plan and how much is it costing you?
In countless meetings with customers over the years, we have talked about their strategies for DR. Some have advanced, fully replicating systems that are regularly tested to ensure they work. Others have backup tapes they will restore from. Having lived in the Northeast for the past 15 years, I have witnessed both the successes and the failures of these DR plans.
When you boil it down, a good DR plan needs to consider two factors: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO is the maximum acceptable time to get the systems up and running after a failure. RPO is the amount of data, usually measured as a window of time, that is acceptable to lose if there is a failure. A third factor might be the minimum number of applications that need to be available during a disaster for the organization to function properly. The cost of your DR solution is tied directly to your RTOs and RPOs: as you approach an RTO of no downtime and an RPO of zero data loss, the cost of the solution typically increases.
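To make the two objectives concrete, here is a minimal Python sketch that scores a single incident or DR test against an RTO and an RPO. The `Objectives` class, the `evaluate_recovery` function, and the 4-hour/15-minute figures are hypothetical and chosen only for illustration; they do not come from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Objectives:
    rto: timedelta  # maximum acceptable time to restore service
    rpo: timedelta  # maximum acceptable window of lost data


def evaluate_recovery(objectives, failure_at, last_replicated_at, restored_at):
    """Return (downtime, data_loss_window, rto_met, rpo_met) for one incident."""
    downtime = restored_at - failure_at          # how long systems were down
    data_loss = failure_at - last_replicated_at  # changes that never left the primary site
    return (
        downtime,
        data_loss,
        downtime <= objectives.rto,
        data_loss <= objectives.rpo,
    )


# Hypothetical DR test: 4-hour RTO, 15-minute RPO.
objectives = Objectives(rto=timedelta(hours=4), rpo=timedelta(minutes=15))
failure = datetime(2024, 1, 10, 9, 0)
result = evaluate_recovery(
    objectives,
    failure_at=failure,
    last_replicated_at=failure - timedelta(minutes=10),
    restored_at=failure + timedelta(hours=5),
)
print(result)
```

In this made-up test the last replicated copy is 10 minutes old, so the RPO is met, but the 5-hour restore misses the 4-hour RTO. Closing that gap is where the extra spending usually goes.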
There are other factors that affect the cost of DR. First, there needs to be equipment in two different locations (preferably geographically dispersed) that roughly mirror each other. In the past it wasn’t uncommon to see exact replicas of data centers set up in different locations. Second, there needs to be a storage resource that holds an exact copy of the data being written at the primary site. Third, there needs to be sufficient network connectivity and bandwidth to handle all the changes in the data.
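As a rough way to size the third item, the sketch below estimates the average link speed needed to ship a given volume of changed data within a replication window. The function name `required_mbps`, the 70% usable-link default, and the 500 GB per day figure are assumptions for illustration, not measurements from any environment.

```python
def required_mbps(changed_gb: float, window_hours: float, link_efficiency: float = 0.7) -> float:
    """Average link speed (Mbit/s) needed to replicate changed_gb within window_hours.

    link_efficiency is an assumed fraction of the raw link that is usable
    after protocol overhead and competing traffic.
    """
    if window_hours <= 0 or not 0 < link_efficiency <= 1:
        raise ValueError("window_hours must be > 0 and link_efficiency in (0, 1]")
    megabits = changed_gb * 8 * 1000  # decimal gigabytes -> megabits
    seconds = window_hours * 3600
    return megabits / seconds / link_efficiency


# Hypothetical workload: 500 GB of changed data per day.
print(f"{required_mbps(500, 24):.1f} Mbit/s")
```

This is an average only. Bursty change rates, compression, and deduplication can move the real number considerably in either direction.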
Over the years there have been different ways of replicating data from one site to another. The main types of replication use either host-based or storage-based technologies, and each is applicable to different DR scenarios. The amount of data that can be lost, the distance between the DR sites, and the amount of bandwidth available between sites determine the type of replication that should be used. Data that needs to match exactly between sites requires synchronous replication. The main problem with synchronous replication is distance and network latency: the application waits for a response from the remote site before it continues with transaction processing, which may cause application timeouts. Data that can be a little off can use asynchronous replication, which acknowledges the write as soon as it is committed to local storage and sends the change to the remote site afterward.
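The trade-off between the two modes can be sketched in a few lines of Python. The helper names and thresholds below are hypothetical. The only physical input is the common rule of thumb that light travels roughly 200 km per millisecond in optical fiber, so the result is a lower bound that ignores switching, protocol, and storage overhead.

```python
FIBER_KM_PER_MS = 200.0  # rule of thumb: light covers ~200 km per ms in fiber


def round_trip_ms(distance_km: float) -> float:
    """Best-case network round trip between two sites, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_MS


def choose_replication(rpo_seconds: float, distance_km: float, max_added_write_ms: float) -> str:
    """Pick a replication mode from the RPO and the latency the application tolerates."""
    if rpo_seconds > 0:
        # Some data loss is acceptable, so writes need not wait on the remote site.
        return "asynchronous"
    if round_trip_ms(distance_km) <= max_added_write_ms:
        # Zero data loss is required and the sites are close enough to wait on every write.
        return "synchronous"
    # Zero data loss is required but the round trip alone exceeds the tolerance.
    return "synchronous-infeasible"


print(choose_replication(rpo_seconds=0, distance_km=50, max_added_write_ms=2.0))
print(choose_replication(rpo_seconds=300, distance_km=2000, max_added_write_ms=2.0))
print(choose_replication(rpo_seconds=0, distance_km=2000, max_added_write_ms=2.0))
```

The last case is the one that forces a conversation: a zero RPO across 2,000 km means either relaxing the RPO or moving the DR site closer.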
There are countless articles out there about what you need to do for DR and what makes a good plan. The process of creating and testing DR plans has become almost its own industry. IBM Global Services and SunGard have been providing “cloud-like” DR services for years. In fact, you could almost say that disaster recovery was one of the first cloud solutions. DR is a must-have for large organizations, especially with the capacity to set up hybrid clouds between service providers and corporations.
You have probably already heard thousands of DR pitches over the last ten years. Since “cloud” is the big thing right now, the question is: is cloud a good option for DR? If we revisit my comparison of DR to insurance, I would say that DR is an ideal use of the cloud, mainly because it is costly and time-consuming to maintain a DR environment you hope never to use. Cloud also provides some other good options:
- Geographical dispersion of data – if you are replicating data from your site into the cloud or to a disk in the cloud, you have moved the data to an offsite location. The infrastructure to support the environment is already available for you with the cloud.
- Huge network pipes – most of the cloud providers have set up their operations in locations that have been built out with large amounts of fiber connectivity. All a company needs to do is patch into it. This may be easier than getting this capability into a corporation’s own sites.
- Access to additional storage and backup capabilities on the fly – This isn’t necessarily a DR thing but it can help move the tedious job of maintaining and managing backups to a third party with specific SLAs.
- Highly redundant infrastructure – The infrastructure to support the environment is already available for you. No new generators or diesel fuel to stock. No electrical or AC concerns. Someone else is in charge of watching that all the time.
- Pay-per-use models – This has been highlighted in other places, but you basically only need to pay an ongoing charge for the storage space you use, and would only have to pay for the server resources if you spun them up (say for DR testing).
- Remote recovery – Cloud providers can provide infrastructure without the need to hop on a plane and make sure everything is up and running. In the past, the typical methodology for DR was to hop on a plane or into a car and start the recovery of your resources. This is no longer the case. If you can get network connectivity during a disaster, then all of your infrastructure can be stood up remotely.
As you explore cloud providers for DR, make sure that you evaluate your environment to see what you have in place. Ask questions about how your provider would handle replication into their cloud. It may also be helpful to choose a provider that can provide colo space for applications that have special disaster recovery requirements. Cloud computing may not be a panacea for all of your DR needs, but it is like a hammer. You may not use it for everything, but it sure is a nice tool to have in your toolbox.
Next Post: Moving Enterprises to the Public or Hybrid Cloud Part 12 – Conclusion
Contact StrataCore to learn more about NTT America Cloud services (206) 686-3211 or stratacore.com
About the author: Ladd Wimmer
Ladd Wimmer is a valuable member of the NTT Communications team. He has over 15 years of experience implementing, architecting, and supporting enterprise servers, storage and virtualization solutions in a variety of IT computing and service provider environments. He worked as Systems Engineer/Solution Architect in the Data Center Solutions practice of Dimension Data, most recently serving as technical lead for the roll out of Cisco’s UCS and VCE vBlock platforms across the Dimension Data customer base. Ladd has also run two IBM partner lab environments and worked for an early SaaS provider that created lab environments for Sales, QA testing and Training.