
Running Databases On Cloud Platforms

2011-10-03 by ScaleBase

Recently I heard a customer say that “cloud is bad for databases”. Frankly, he had every reason to say it. There are many reports online about I/O performance issues, machine limitations and other problems, all of which hurt database performance on the cloud.

What to do? The truth is that every part of the application suffers from the same problems. But other application layers, such as the web server, application server and caching mechanisms, are “share-nothing” (they don’t store any information, they just run code) and can therefore scale out, so they don’t suffer as much as databases do. Databases are anything but “share-nothing”. They need to save data for your application. They need to isolate transactions and maintain consistency, operations that require synchronization between the many different users who access the database.

So it’s common to see applications that scale very well – but their database is holding them back. Unfortunately, existing scaling solutions for databases don’t work very well on the cloud.

Take, for instance, what many consider to be the best scaling solution in the database world, Oracle RAC. Oracle RAC lets you run multiple instances of the Oracle database, all working on the same storage and synchronizing activities between themselves. This is a great idea for local installations, but terrible for the cloud. Because it requires specialized hardware and a high-throughput network, it just doesn’t scale very well on public cloud environments. (At the time of writing, Oracle RAC is not supported on Amazon EC2, but that will likely change.) Other database clustering solutions behave in much the same way.

So what can you do? The ideal solution would be to have many small databases working together to provide the application with both performance and scalability. Is there a way to do that, to scale out databases on the cloud? In one word, “Yes”, and in two words, “Definitely Yes”.

The idea is to have each database handle only a part of the overall application data. If each database is small enough, it is far less affected by the cloud’s performance limits, and it receives only a limited number of transactions per second. This technique is called sharding, and most companies that run big databases on the cloud use it.
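
To make the idea concrete, here is a minimal sketch of how a request could be routed to the database that owns its data. It is only an illustration: the shard names, the `shard_for` function and the choice of an MD5-based hash are all assumptions made for this example, not a description of any particular product.

```python
import hashlib

# Hypothetical shard names; in a real system these would be connection settings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(key: str, shards=SHARDS) -> str:
    """Return the shard responsible for a key.

    A stable hash is used instead of Python's built-in hash(), which is
    salted per process and would route the same key differently after a restart.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# The same key always lands on the same shard.
print(shard_for("customer-42"))
print(shard_for("customer-43"))
```

Each database sees only the keys that hash to it, so it holds roughly a quarter of the data and receives roughly a quarter of the traffic in this four-shard setup.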

Sharding is great because:

  1. Smaller databases run faster on the cloud. There are many reasons for this: smaller databases have smaller indexes, and can store more indexes and data in RAM, saving on slow cloud I/O.
  2. Sharding allows for cloud elasticity. Because you can dynamically add and remove database instances, the database itself becomes an elastic resource. This allows for better capacity planning and cost optimization, the goals of migrating applications to the cloud in the first place.

The downside of sharding is that it’s a pain to develop. It’s incredibly hard to write the code that splits the data and joins it again. Luckily, there’s ScaleBase.
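
To give a feel for the splitting and joining involved, here is a small sketch under stated assumptions: three in-memory SQLite databases stand in for separate database servers, and the `orders` table, the modulo routing rule and all function names are hypothetical. A real application would also have to deal with cross-shard transactions, joins and rebalancing, which this example ignores.

```python
import sqlite3

# Three in-memory SQLite databases stand in for three separate servers.
shards = [sqlite3.connect(":memory:") for _ in range(3)]
for db in shards:
    db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")

def shard_of(customer_id: int) -> sqlite3.Connection:
    """Splitting rule: a customer's rows always live on one shard."""
    return shards[customer_id % len(shards)]

def insert_order(customer_id: int, amount: float) -> None:
    shard_of(customer_id).execute(
        "INSERT INTO orders VALUES (?, ?)", (customer_id, amount)
    )

def total_for_customer(customer_id: int) -> float:
    """Single-shard query: only the owning database is asked."""
    row = shard_of(customer_id).execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]

def total_all() -> float:
    """Cross-shard query: ask every shard, then join the partial results."""
    return sum(
        db.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()[0]
        for db in shards
    )

for customer_id, amount in [(1, 10.0), (2, 5.0), (4, 2.5), (1, 7.5)]:
    insert_order(customer_id, amount)

print(total_for_customer(1))  # answered by one shard
print(total_all())            # merged from all shards
```

Even in this toy version, every query has to know whether it touches one shard or all of them, which is the kind of logic that otherwise ends up scattered through application code.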

ScaleBase does just that: it allows you to run multiple databases on the cloud and trick the application into thinking that only one database is running. No code changes are needed, and better still, all the tools usually used with databases, like reporting and management tools, can continue working the same way.

ScaleBase is a software solution, supported on the Amazon and Rackspace clouds, which can be downloaded for free evaluation at http://www.scalebase.com.
