EXPERT RESPONSE
If I understand your question correctly, you want to know if Amazon's
EC2 Web service is a viable solution to scale out one or a series of
database-intensive applications. I think you may be looking for a
magic bullet that does not exist. A lot of people are under the
misconception that virtualization somehow provides automatic
scale-out capabilities. While it is true that you can easily scale
out to more virtual server instances, you are still responsible for
ensuring that those instances know how to operate in tandem.
As you pointed out, unless your application is designed so that it
can parallelize database operations across multiple servers, it will
not reap the benefits of EC2. This is because at the moment EC2 only
allows you to scale out -- adding more instances booted from Amazon
Machine Images (AMIs) to the mix, not scale up -- increasing the
resources of a single instance. There are two solutions to this
problem: 1) design your applications so that they can use multiple
database servers at once, or 2) as you said, build a load balancer
that hides the multiple database servers from the application and
handles the parallelization automatically.
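
To make the second option a little more concrete, here is a minimal
sketch of the routing decision such a load balancer has to make. It
assumes a setup your question did not specify: one primary server
that takes all writes and a set of read replicas that share the
reads in round-robin order. The class name, the host names, and the
list of write verbs are all hypothetical, and a real balancer would
also have to deal with replication lag, transactions, and failed
servers.

```python
import itertools


class DatabaseRouter:
    """Hypothetical router: writes go to one primary, reads rotate over replicas."""

    # Statements starting with one of these verbs are treated as writes.
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE",
                   "CREATE", "ALTER", "DROP")

    def __init__(self, primary, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        # cycle() yields the replicas in order, forever (round robin).
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        """Return the host name that should execute this statement."""
        stripped = sql.lstrip()
        verb = stripped.split(None, 1)[0].upper() if stripped else ""
        if verb in self.WRITE_VERBS:
            return self.primary
        return next(self._replicas)


# Host names below are placeholders, not real servers.
router = DatabaseRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM orders"))        # a replica
print(router.route("UPDATE orders SET paid = 1"))  # the primary
print(router.route("SELECT * FROM users"))         # the next replica
```

The application only ever talks to the router, so replicas can be
added or removed without touching application code.
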
EC2 is a very interesting offering from Amazon, and I expect to see
more projects like this crop up as open source virtualization
solutions such as Xen (the software EC2 uses), KVM, and OpenVZ
mature. As more for-rent compute farms like EC2 become available, the
competition will drive prices down, but for now the competition is
small, so the current sellers can dictate the price you pay for this
service. That said, the cost of EC2 is very reasonable: only $0.10
per compute hour, or $72 for one instance running continuously for
30 days. There are also charges for bandwidth consumed outside the
cloud ($0.20/GB) and for backing storage ($0.15/GB on Amazon's S3
storage model). You can create and configure 100 AMIs (although the
number is limited to 20 in the beta) and only pay for those that are
running. This allows you to configure several hundred database
servers and only boot 3 initially. You can monitor your application,
and if database performance starts to degrade, the monitor could
react by booting more of your database AMIs.
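
As a rough illustration of the arithmetic and of the monitoring idea,
the sketch below estimates a 30-day bill from the rates quoted above
and shows one possible rule for deciding when to boot another
instance. The rates come from this answer; everything else is an
assumption of mine: the function names, the 200 ms latency threshold,
the choice to add one instance at a time, and treating the S3 charge
as a flat per-GB monthly fee. It does not call any Amazon API.

```python
HOURLY_RATE = 0.10     # dollars per instance-hour
BANDWIDTH_RATE = 0.20  # dollars per GB transferred outside the cloud
STORAGE_RATE = 0.15    # dollars per GB on S3 (assumed to be per month)


def monthly_cost(instances, days=30, bandwidth_gb=0.0, storage_gb=0.0):
    """Estimate the bill for instances that run around the clock."""
    compute = instances * 24 * days * HOURLY_RATE
    transfer = bandwidth_gb * BANDWIDTH_RATE
    storage = storage_gb * STORAGE_RATE
    return round(compute + transfer + storage, 2)


def instances_to_boot(avg_query_ms, running, threshold_ms=200.0, limit=20):
    """Hypothetical policy: add one instance when queries are too slow.

    limit defaults to 20, the beta cap mentioned in the text.
    """
    if avg_query_ms > threshold_ms and running < limit:
        return 1
    return 0


print(monthly_cost(1))  # 72.0, the figure quoted above
print(monthly_cost(3, bandwidth_gb=50, storage_gb=100))
print(instances_to_boot(avg_query_ms=350.0, running=3))
```
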
How does EC2 compare to virtualizing all of this yourself? Well,
how cheaply can you do it? And do you care if someone else is
hosting your data, or would you rather keep it in house? Those are
really the two questions that you need to answer. If the answers are
"not very" and "I do not care," then by all means, use EC2.
Database servers are not always prime candidates for virtualization
because of the high disk I/O they produce. However, depending on how
Amazon has its AMIs configured on the back end, much of this lost
I/O performance can be recovered. An alternative would be to find a
service like EC2 that, instead of offering blank VMs, offers a
load-balanced database that your application could take advantage
of. As you pointed out, you could build this yourself with EC2,
MySQL, and your own load balancer. Is it cost effective? Do you
have the time? These are questions I cannot answer because I do not
know your business, but hopefully I have given you enough information
that you can now answer them yourself with confidence.
Hope this helps!