Why Traditional IT Practices Are Becoming Extinct

2011-10-19 by Michael Crandell

Earlier this year the IT industry was reminded that cloud computing infrastructures are vulnerable to the same genetic IT flaw that plagues traditional data center operations: everything fails sooner or later.

 

In March, Japan experienced a magnitude 9.0 earthquake and subsequent tsunami that caused widespread disruptions to power supplies and to network connectivity at data centers, prompting Japanese companies to rethink their traditional disaster recovery strategies. Several weeks later, the EBS system in one of Amazon’s EC2 data centers in the Eastern U.S. failed due to a faulty router upgrade and a cascade of resulting events, sending hundreds of customers scrambling to resume services. And in September, demand for Target’s new Missoni line produced an influx of web traffic that crashed the retailer’s website and led to several hours of downtime.

 

Protecting your organization from unplanned downtime depends largely on building redundancy and diversity directly into your disaster recovery and business continuity systems. Business systems need to be able to run on a number of different infrastructures, whether public clouds such as Amazon and Rackspace or private clouds using traditional on-premise hardware, and to fail over between them quickly and efficiently as necessary.

 

Despite the recent outages, public clouds now give organizations an impressively wide array of options for implementing business continuity at a level of affordability that simply did not exist a few years ago. The solution is not a simple one, but it starts from a single principle: design your infrastructures for the possibility of failure.

 

While there is no easy fix, there is a general approach that does work – combining redundancy in design with automation in the cloud management layer. 

 

The first step requires architecting a solution from components that can withstand failures of individual nodes, whether those are servers, storage volumes, or entire data centers. Each component (e.g. at the web layer, application layer, or data layer) needs to be considered independently and designed with the realities of data center infrastructure, Internet bandwidth, cost, and performance in mind. Solutions for resilient design are almost as numerous and varied as the software components they use. Databases alone, for example, span a wide range of approaches and resiliency characteristics, including SQL and NoSQL stores, replication, and caching technologies.

 

But the secret sauce really comes in how your architecture is operated. Which parts of the system can respond automatically to failure, which can respond nearly automatically, and which cannot respond at all? For example, if a given cloud resource goes down (be it a disk drive, a server, a network switch, a SAN, or an entire geographical region), how seamlessly can you launch or fail over to another and keep operations running? Ideally, of course, the more that is automated (or nearly so), the better.
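
As a rough illustration of the failover question above, the sketch below picks the first healthy infrastructure from a priority-ordered list. Everything in it is an assumption made for the example: `Infrastructure`, `choose_target`, and the cloud labels are hypothetical names rather than part of any real cloud API, and the `healthy` flag stands in for whatever monitoring a real system would use.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Infrastructure:
    """A place the deployment can run (hypothetical model, not a real API)."""
    name: str
    healthy: bool  # stand-in for the result of a real health check


def choose_target(candidates: Sequence[Infrastructure]) -> Optional[Infrastructure]:
    """Return the first healthy infrastructure in priority order, or None."""
    for infra in candidates:
        if infra.healthy:
            return infra
    return None


# Priority-ordered list: primary first, then fallbacks.
clouds = [
    Infrastructure("public-cloud-east", healthy=False),  # simulated outage
    Infrastructure("public-cloud-west", healthy=True),
    Infrastructure("private-cloud", healthy=True),
]

target = choose_target(clouds)
print(target.name if target else "no healthy infrastructure available")
```

The point of the sketch is only that the decision is made by code rather than by a person paging through consoles during an outage; a production system would also need to handle data replication and traffic redirection.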

 

Achieving that level of automation requires that your system design and configuration be easily replicable. Servers, for instance, need to be quickly re-deployable in a predictable fashion across different cloud infrastructures. It’s this automation that gives organizations the life-saving flexibility they need when crisis strikes. Our own RightScale ServerTemplate™ methodology provides this re-deployment capability, allowing a server brought down by an outage to be launched in another cloud in a matter of minutes.
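
To make the idea of a replicable configuration concrete, here is a minimal sketch of a cloud-agnostic server description that is rendered into launch parameters for whichever cloud is chosen. It is not the ServerTemplate format or any RightScale API; `ServerSpec`, `render_launch_request`, the cloud names, the image ids, and the script names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ServerSpec:
    """Cloud-agnostic description of a server (hypothetical, for illustration)."""
    role: str
    boot_scripts: Tuple[str, ...]
    images: Dict[str, str]  # cloud name -> machine image id for that cloud


def render_launch_request(spec: ServerSpec, cloud: str) -> dict:
    """Build the launch parameters for one cloud from the shared spec."""
    if cloud not in spec.images:
        raise ValueError(f"no image registered for cloud {cloud!r}")
    return {
        "cloud": cloud,
        "image": spec.images[cloud],
        "role": spec.role,
        "boot_scripts": list(spec.boot_scripts),
    }


web_server = ServerSpec(
    role="web",
    boot_scripts=("install_web_server", "deploy_application"),
    images={"cloud-a": "image-a-001", "cloud-b": "image-b-001"},
)

# The same spec yields an equivalent server on either cloud.
primary = render_launch_request(web_server, "cloud-a")
fallback = render_launch_request(web_server, "cloud-b")
print(primary)
print(fallback)
```

Because only the image id differs between clouds, the role and boot sequence stay identical wherever the server is launched, which is what makes the re-deployment predictable.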

 

The right cloud management solution should simplify the process of launching entire deployments through customizable best practices. It should also provide complete visibility into all infrastructures through a central management dashboard, a ‘single pane of glass’ through which administrators can monitor performance and make capacity changes based on real-time needs. The same automation and control that lets organizations scale up or down across multiple servers as demand changes also allows them to migrate entire server deployments to a new infrastructure when disaster strikes.
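
The capacity decision described above can be sketched as a simple rule: divide the observed load by what one server can handle, then clamp the result between a floor and a ceiling. The function name, parameters, and default limits below are assumptions chosen for the example, not values from any particular management product.

```python
import math


def desired_server_count(
    requests_per_sec: float,
    capacity_per_server: float,
    min_servers: int = 2,
    max_servers: int = 20,
) -> int:
    """Servers needed for the current load, clamped to [min_servers, max_servers]."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_server)
    return max(min_servers, min(max_servers, needed))


# A quiet period, a normal load, and a traffic spike (hypothetical numbers).
for load in (50, 950, 5000):
    print(load, desired_server_count(load, capacity_per_server=100))
```

Keeping a minimum of two servers reflects the redundancy argument made earlier: even at low load, a single server would be a single point of failure.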

 

Some may have thought that the cloud was the answer to all their IT problems. It’s not, and that’s actually good news. By recognizing one of the founding principles of cloud architectures, that everything fails at some point, businesses are now in a position to design and build services that are more resilient than in the past, at a fraction of the cost. With the right architecture and management layer, cloud-based services can provide unparalleled disaster protection and business continuity.

Author

Michael Crandell

RightScale

Michael Crandell is CEO of RightScale, the leader in cloud computing management.
