Storage - The Backbone Of The Cloud. To Centralize Or Distribute?

2011-11-03 by Andrew Rouchotas

Cloud giants like Amazon and Google are making a massive push toward distributed, replicated storage solutions for cloud computing. Their assertion is that the storage backbone for the cloud should be built by replicating data across commodity local storage hardware: essentially, network-based RAID across multiple local storage servers. Amid their claims that distributed storage will one day make SAN solutions obsolete, I am strangely reminded of similar claims by Intel and others about the future of mainframe computers when 1U pizza boxes hit the market. At the time, it was postulated that the distributed processing offered by these Intel-based machines would be preferable to the centralized compute model of mainframes and would inevitably spell the end of mainframes and of consolidated computing altogether. Clearly, distributed processing did not put an end to consolidated mainframe computing; many enterprises today use a hybrid approach. Rest assured, the distributed storage model being so heavily promoted by several cloud giants today will not spell the end of consolidated SAN storage platforms either. That said, each approach has clear benefits and disadvantages that must be considered when building a storage backbone for your cloud.

The majority of service providers offering public cloud solutions today have built on a consolidated storage platform, and the reasons are well understood. The hypervisor-plus-consolidated-SAN model is natively supported by the most common hypervisors (VMware, Xen, KVM) as a mechanism for achieving high availability and elasticity, and existing applications and platforms run on it without modification. Unfortunately, without proper architecture and planning, a consolidated storage platform can become a large single point of failure or a severe network and disk I/O bottleneck. That sort of poor planning and poor architecture has tarnished the reputation of consolidated storage platforms, and with it the notion of public clouds as production-ready for mission-critical applications. Expensive, proprietary storage platforms (EMC, NetApp, EqualLogic) are certainly mature and capable solutions, but for the average service provider the entry-level sticker price made it virtually impossible to offer effective replication and redundancy at the SAN level, which led to several significant outages and even data loss. When workloads increased and providers could not effectively raise total throughput or add caching layers to absorb the load, poor performance crept into the equation as well. Fundamentally, though, the issues were poor architecture and a lack of redundancy. To address these concerns, service providers are now circling back and investing in redundant architecture, in 10G and bonded 10G networking, and in extensive caching layers that seamlessly absorb the massive, unpredictable workloads typical of large public cloud deployments. For those that have invested in upgrading and evolving their infrastructure, the results have been excellent. A properly built consolidated cloud storage backbone can be redundant several times over in every aspect, and with adequate, elastic SSD caching, reads and writes can be delivered at SSD speeds regardless of what drives and what IOPS sit behind the cache.
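To make the caching claim concrete, here is a rough model of effective read latency as a hit-rate-weighted average of SSD and spinning-disk service times. The latency figures and hit rates below are illustrative assumptions, not measurements from any particular platform:

```python
# Back-of-the-envelope model: an SSD caching tier masks slow backing
# disks, so effective read latency is a weighted average of SSD and
# spinning-disk service times. All figures are assumed, not measured.

SSD_LATENCY_MS = 0.1   # assumed SSD read service time
HDD_LATENCY_MS = 8.0   # assumed 7200 RPM disk read service time

def effective_latency_ms(cache_hit_rate: float) -> float:
    """Average read latency for a given cache hit rate (0.0 to 1.0)."""
    return cache_hit_rate * SSD_LATENCY_MS + (1.0 - cache_hit_rate) * HDD_LATENCY_MS

for hit_rate in (0.50, 0.90, 0.99):
    print(f"hit rate {hit_rate:.0%}: ~{effective_latency_ms(hit_rate):.2f} ms per read")

# At a 99% hit rate the array behaves almost like all-SSD storage,
# which is exactly the effect the caching layer is meant to deliver.
```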

The distributed storage model is an interesting one. Lacking any real support from the common hypervisors (VMware, Xen, KVM), this type of storage backbone has not seen the same adoption. AppLogic by 3Tera (now owned by CA after a $100M purchase a couple of years back) is the only real solution that enables this sort of architecture out of the box. Cloud giants like Amazon and Google boast this sort of architecture as well, but they have not opened their technology and infrastructure up widely enough to give us a consolidated view of what they are actually doing and how. Additionally, large and extended outages of these proprietary platforms are certainly enough to scare potential cloud service providers away from this model and toward the more conservative, supported, and well-understood centralized storage model. The real issue with the distributed approach is that it typically requires applications to be written specifically for the architecture, rather than natively supporting applications already running on the web. The other issue is that even though data is replicated across commodity local storage hardware, there are limitations on which hardware can be used and, more specifically, on how hardware can be pooled. Having said this, local replicated storage offers a significant baseline I/O advantage simply because I/O is handled by the local server's backplane rather than by network-based storage. With network-based SAN storage, there will always be a certain amount of IOWAIT incurred reading and writing over a network compared to a local server's backplane.
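As a rough illustration of that IOWAIT argument, the sketch below adds an assumed network round trip to every read and shows how the penalty compounds across a serial, latency-bound workload. The figures are illustrative assumptions, not benchmarks of any real SAN or backplane:

```python
# Rough comparison of local-backplane vs network-attached reads: a
# networked SAN adds at least one network round trip to every I/O.
# All figures are illustrative assumptions, not benchmarks.

DISK_SERVICE_MS = 0.5   # assumed storage service time per read
NETWORK_RTT_MS = 0.2    # assumed 10G Ethernet round trip, one switch hop

def total_latency_ms(n_reads: int, networked: bool) -> float:
    """Total latency for n serial reads, with or without a network hop."""
    per_read = DISK_SERVICE_MS + (NETWORK_RTT_MS if networked else 0.0)
    return n_reads * per_read

for label, networked in (("local backplane", False), ("network SAN", True)):
    print(f"{label}: {total_latency_ms(10_000, networked):,.0f} ms for 10,000 serial reads")

# The per-read penalty looks tiny, but it compounds across serial,
# latency-bound workloads; that is the IOWAIT gap described above.
```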

So what exactly does the future hold for cloud providers? We are at the point of maturity where providers are starting to build out version 2 and version 3 of their public clouds. Many are choosing to stick with their enterprise SAN partners, adding redundancy where it was previously lacking and adding caching layers wherever possible. Some are moving to an AppLogic approach, shifting their clouds to a distributed, local storage model on commodity hardware pools. Others are adopting a hybrid approach: a centralized storage model built on commodity hardware, with multiple and continual replication across pools of commodity hardware using open standards and open solutions. Service providers opting for centralized storage on commodity hardware can achieve complete redundancy and scalability affordably, and this approach also represents a point of evolution between centralized SAN storage on proprietary hardware and distributed local storage across commodity hardware. Common open source technologies used to build centralized, redundant commodity storage that is also replicated across commodity local storage pools include ZFS and Gluster.
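As one hedged sketch of what "continual replication across pools" can look like with these tools, the script below drives ZFS incremental snapshot replication from a commodity pool to a standby host. The dataset name, standby host, and five-minute schedule are hypothetical, and error handling and snapshot pruning are omitted for brevity:

```python
# Minimal sketch of continual replication between commodity storage
# pools using ZFS incremental snapshots. Dataset, host, and schedule
# are hypothetical; run as a user with zfs privileges and SSH access.
import subprocess
import time

DATASET = "tank/cloud"   # hypothetical source dataset
STANDBY = "standby1"     # hypothetical replica host, reachable over SSH

def replicate(prev_snap: str, cur_snap: str) -> None:
    """Snapshot the dataset, then stream the incremental diff to the standby."""
    subprocess.run(["zfs", "snapshot", f"{DATASET}@{cur_snap}"], check=True)
    send = subprocess.Popen(
        ["zfs", "send", "-i", f"{DATASET}@{prev_snap}", f"{DATASET}@{cur_snap}"],
        stdout=subprocess.PIPE,
    )
    subprocess.run(
        ["ssh", STANDBY, "zfs", "receive", "-F", DATASET],
        stdin=send.stdout,
        check=True,
    )
    send.wait()

# Seed an initial snapshot, then replicate every five minutes, chaining
# snapshots so each send only carries the incremental changes.
prev = "repl-0"
subprocess.run(["zfs", "snapshot", f"{DATASET}@{prev}"], check=True)
while True:
    time.sleep(300)
    cur = f"repl-{int(time.time())}"
    replicate(prev, cur)
    prev = cur
```

Gluster covers the other half of the story: a replicated Gluster volume spreads each write synchronously across bricks on multiple commodity nodes, so the pool itself survives the loss of a node.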

Ultimately, however, cloud computing is continually evolving, as are the storage backbones that power it. I would theorize that the next stage of evolution for most providers will be a hybrid approach: public clouds that combine consolidated and distributed storage models and, either automatically or via client selection, use both types of storage backbone to deliver their services.
