AutoNation spent years trying to establish a disaster recovery plan that inspired confidence. It went through multiple iterations, including failed attempts at a full on-premises solution and a solution completely in the cloud. The Fort Lauderdale, Fla.-based auto retailer, which operates 300 locations across 16 states, finally found what it needed with a hybrid model featuring disaster recovery as a service.
“Both the on-premises and public cloud disaster recovery models were expensive, not tested often or thoroughly enough, and were true planning and implementation disasters that left us open to risk,” says Adam Rasner, AutoNation’s vice president of IT and operations, who was brought on two years ago in part to revamp the disaster recovery plan. The public cloud approach sported a hefty price tag: an estimated $3 million if it were needed in the wake of a three-month catastrophic outage. “We were probably a little bit too early in the adoption of disaster recovery in the cloud,” Rasner says, noting that the cloud providers have matured substantially in recent years.
AutoNation, which also owns collision centers and auction houses and launched its own precision parts line in 2018, has a new disaster recovery plan that blends colocation-based and as-a-service-based disaster recovery, with 75% of applications targeted to recover from a Denver colocation facility and 25% from Amazon Web Services. The environments are orchestrated by DRaaS provider Cohesity and its secondary data management platform, which backs up and replicates virtual servers, applications and data to the colocation facility and to AWS. Cohesity also manages failover and recovery.
“The ability in a disaster to flip a switch and automatically spin up VMs off-premises lets me sleep better at night,” Rasner says.
The DRaaS market is a complex scene. There are hundreds of DRaaS providers, all with different approaches and capabilities for replicating and hosting servers and data. Some DRaaS services focus on virtual servers, while others also back up physical servers; some rely on on-site backup appliances, others don’t. It’s a growing market, as enterprises look to third-party providers to handle failover in the event of natural disasters or service disruptions. Market research firm Technavio predicts the global DRaaS market will expand at a compound annual growth rate of nearly 36% between 2018 and 2022.
For Ken Adams, CIO of Miles & Stockbridge in Baltimore, DRaaS is a way to fully embrace the cloud but still address compliance demands for the 480-employee law firm. ISO standards require law firms to preserve data in three different locations. As an early adopter of the cloud, Adams embraced as-a-service early on and saw the opportunity to use it for disaster recovery.
Miles & Stockbridge uses ClearSky Data’s on-demand platform and appliances to access and store virtual servers and data locally and in a colocation facility in Virginia, and to send data out to a third location: a virtual cache server on AWS, which Adams calls his “insurance policy.”
“ClearSky was originally just a storage platform for us, and then we decided to try putting our servers on the appliances, which have solid-state drives. We had no performance hit on the servers and we got that extra protection from having the servers – not just the data – ready in multiple locations,” he says.
While the appliance in Virginia is updated almost in real time, the AWS version of data is a little older, saving on traffic. Disaster recovery, he says, is now easy. “You just push a button in the ClearSky console that works with VMware and fails over from one environment to the other.”
Adams has dedicated fiber lines from two different ISPs connecting the ClearSky appliances so they can easily handle the heavy demands of applications such as litigation support. However, he says the burden on them is not as great as it might be because some applications such as the firm’s document management solution are already accessed as SaaS, giving them built-in disaster recovery.
Spencer Suderman, principal consultant for tech research and advisory firm ISG in Stamford, Conn., says as interest grows in DRaaS and more players enter the market, IT teams have to consider the needs of their servers and data. While some servers and applications might port easily to a cloud-based “as a service” disaster recovery environment, others might be resistant because they are proprietary or are highly interdependent with other applications.
If IT thought getting applications to the cloud in the first place was difficult, DRaaS adds another layer of complexity, Suderman says. For instance, containerized applications in virtual servers might not be able to fail over or recover properly. “A virtualized server still has dependencies,” he says. And even if the application works, the data transport might cause issues. “Let’s just say you have a recovery time objective of six hours. If you have a terabyte of data on a 100Mbit/sec link, it will take you 23 hours to download all that data. You won’t be able to meet your RTO,” he says.
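Suderman’s arithmetic can be checked with a few lines of Python. The following is a minimal sketch, not a sizing tool: the function names and the `efficiency` parameter are hypothetical, a terabyte is assumed to mean 10^12 bytes, and the link is assumed to be dedicated entirely to the restore. At full line rate, a terabyte over a 100Mbit/sec link works out to roughly 22 hours; protocol overhead in practice would push the figure higher, in line with the 23 hours he cites.

```python
def transfer_hours(data_bytes: float, link_bits_per_sec: float,
                   efficiency: float = 1.0) -> float:
    """Estimate hours needed to move data_bytes over a link.

    efficiency is an assumed fraction (0-1] of the nominal link rate
    that is actually usable after protocol overhead and contention.
    """
    if link_bits_per_sec <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link rate must be positive and efficiency in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / 3600


def meets_rto(data_bytes: float, link_bits_per_sec: float,
              rto_hours: float, efficiency: float = 1.0) -> bool:
    """Return True if the transfer alone fits inside the RTO window."""
    return transfer_hours(data_bytes, link_bits_per_sec, efficiency) <= rto_hours


def required_link_bps(data_bytes: float, rto_hours: float) -> float:
    """Minimum sustained bit rate needed to move data_bytes within rto_hours."""
    if rto_hours <= 0:
        raise ValueError("RTO must be positive")
    return data_bytes * 8 / (rto_hours * 3600)


if __name__ == "__main__":
    one_tb = 10**12          # assumption: decimal terabyte
    link = 100 * 10**6       # 100 Mbit/sec
    print(f"Transfer time: {transfer_hours(one_tb, link):.1f} hours")
    print(f"Meets 6-hour RTO: {meets_rto(one_tb, link, 6)}")
    print(f"Link needed for 6-hour RTO: {required_link_bps(one_tb, 6) / 1e6:.0f} Mbit/sec")
```

The estimate covers only the data transfer; time to boot servers, validate applications and redirect users would come on top of it, so a real plan would need headroom beyond what this calculation suggests.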
AutoNation’s Rasner finds that the scope of applications suitable for DRaaS is limited in the automotive industry, where legacy applications, such as AutoNation’s 13-year-old CRM system, are often custom-built or have a lot of tentacles into other applications. AWS, Rasner says, is best suited to off-the-shelf and stand-alone applications such as AutoNation’s equity mining tool, which helps service teams determine if customers would find better value completing an expensive repair or buying a new car. AWS also houses backups older than 40 days. As legacy applications are refreshed or refactored, Rasner says they will be added to the AWS disaster recovery environment.
ISG’s Suderman recommends intensive planning and monthly, bi-monthly, or quarterly drills with the DRaaS provider. “Disaster recovery is probably one of the most under-planned services,” he says, and he anticipates that DRaaS, where you’re handing off some responsibilities to a provider, will only make it worse. “Everyone talks a good game about disaster recovery, but what’s the breadth and depth of the planning you’ve done for a real disaster? DRaaS drills will tell you how portable your environment really is.”
Some considerations: Are all your applications in one place and on virtual machines that can be quickly spun up? Is your data fresh? How long can your organization stand to be down, and does your provider know the priority of your applications and data?
Perhaps the most important question, if you are in a highly regulated industry: Do you have visibility into your disaster recovery sites? “You might not be able to tell where your application is running if you’re using cloud-based infrastructure,” Suderman says.
Vishal “Steve” Mathur, senior IT manager at Baltimore-based food manufacturer TIC Gums, is at the start of his company’s DRaaS journey. His first step was to redo the company’s WAN infrastructure, which had relied on a single MPLS line out to the company’s three sites. “When our MPLS line went down, all three sites were shut down because we couldn’t get to the Internet for Office 365 or Salesforce,” he says.
Now TIC Gums has built-in redundancy with three lines from three separate ISPs and independent firewalls at each site that provide high availability to sustain cloud-based backup, storage, and disaster recovery. “With the infrastructure we had, it would have taken days, if not weeks to bring the business back up,” Mathur says.
Although the company initially thought it would implement disaster recovery on a platform like AWS or Microsoft Azure, Mathur charted out a score card that put Expedient’s DRaaS offering ahead of the others. “The biggest question we always returned to was: ‘What kind of service and support would we get from the big players?’ Over the long term, we wanted the more personal relationship and support,” he says.
The company has worked closely with Expedient to identify the core stack of applications that would need to be recovered, and the work to redesign those applications is 80% complete. “This year, we will be migrating that pod of applications over to Expedient’s data center,” Mathur says. TIC Gums’ DRaaS RTO is less than two hours.
“We will be able to initiate disaster recovery based on standard operating procedures and will be able to bring everything back up from one phone call to Expedient,” he says.
Mathur has already set goals to test the DRaaS twice a year and to adjust standard operating procedures accordingly. Servers will be moved from tier to tier (each tier denotes how many hours the server can be down) based on the findings of the drills, which are done in partnership with Expedient. Mathur only has to dedicate one system administrator from his team: “95% of disaster recovery is left to the provider,” he says.
AutoNation’s Rasner warns fellow IT professionals not to get complacent. “You still have to push the button and declare a disaster. Then there are things that need to be tested, validated, and, in some cases, manually intervened with,” he says.
In addition, he says, “DRaaS is not one size fits all.” Each application and bit of infrastructure needs to be evaluated, and companies need to consider the appropriateness of capital expenditures versus operational expenditures. He justifies the as-a-service model this way: “All you’re doing in disaster recovery is replicating and replicating, and you can do that via DRaaS without incurring the cost of all the heavy infrastructure depreciating and not doing anything value-added.”
Overall, Rasner is happy with his DRaaS experience: “We’ve tested it, and it is rock solid. Even though it was painful to get here, our disaster recovery is in a much better place than it was.”