Geeks With Blogs
Cloud9 Azure and Cloud Services, WCF, WF, Dublin, Geneva and Federated Security, Oslo

Say you deploy 10 instances of a web role, e.g. your website on 10 servers,
for scale and availability purposes.
You wouldn't want to put all ten on one rack, just in case that rack gets hosed.
You'd want to follow some best practice and maybe host them across two racks:
5 on one rack in location X and 5 on another rack in location Y.
If one server goes down on the rack in location Y you are good: there are 4 others still working on that rack, and the Azure fabric will bring up the 5th as soon as it can.
If the whole rack goes down, you're still good: the 5 servers in location X keep serving while the Azure fabric fires up instances of your website on another rack in another location.
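
To make the rack math concrete, here is a minimal Python sketch of spreading instances across racks and counting what survives a rack failure. It is only an illustration: the function names and the round-robin placement rule are my own assumptions, not how the Azure fabric actually places instances.

```python
from collections import Counter


def assign_racks(instance_count, rack_count):
    """Hypothetical round-robin placement: instance i goes to rack i % rack_count."""
    return [i % rack_count for i in range(instance_count)]


def survivors_after_rack_failure(placement, failed_rack):
    """Count the instances that are NOT on the failed rack."""
    return sum(1 for rack in placement if rack != failed_rack)


placement = assign_racks(instance_count=10, rack_count=2)
per_rack = Counter(placement)  # 5 instances on rack 0, 5 on rack 1
still_up = survivors_after_rack_failure(placement, failed_rack=1)

print(dict(per_rack))
print(still_up)
```

With 10 instances on 2 racks, losing either rack leaves 5 instances serving requests.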

But what if you want to update your site, or update the operating system with patches, etc.?
Where do you start upgrading?
You don't want to upgrade all the servers in location X at once, because what if location Y goes down? Your customers will start getting "web site not available" errors.

Maybe you start by upgrading the 4th and 5th ones in each location.
So during the upgrade process you'll always have at least 6 servers up across both your racks.
If location Y goes down while you are doing this upgrade, you still have the other 3 servers in location X to limp along with
until Azure gets location Y up again, and then you can continue your upgrade process 2 at a time across both locations.
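
The arithmetic of that rolling upgrade can be sketched in a few lines of Python. This is a toy model under my own assumptions (the function name, equal-sized racks, and taking the same number of servers offline on every rack at each step), not an Azure API.

```python
def rolling_upgrade_steps(per_rack, racks, batch_per_rack):
    """For each upgrade step return (live servers, live servers if one rack is also lost)."""
    steps = []
    remaining = per_rack  # servers per rack that still need the upgrade
    while remaining > 0:
        offline = min(batch_per_rack, remaining)  # taken down on each rack this step
        live_per_rack = per_rack - offline
        steps.append((live_per_rack * racks, live_per_rack * (racks - 1)))
        remaining -= offline
    return steps


steps = rolling_upgrade_steps(per_rack=5, racks=2, batch_per_rack=2)
min_live = min(live for live, _ in steps)
worst_case = min(live_one_rack_lost for _, live_one_rack_lost in steps)

print(steps)
print(min_live, worst_case)
```

For 5 servers per rack, 2 racks, and 2 servers per rack offline at a time, the lowest point is 6 live servers, or 3 if a whole rack is lost mid-upgrade.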

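These two strategies, spreading instances across racks (fault domains) and upgrading them in small groups (upgrade domains), are orthogonal and help you keep those availability numbers high.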

You might say to yourself... ohhh no, I can't afford to ever have only 3 servers servicing my clients.
That's when the elasticity of the cloud comes into play. You could dynamically scale up for a moment to enough servers
that you still have 10 live servers servicing requests even if you lose a rack AND are doing a rolling upgrade.
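
How many servers is "enough"? Here is a rough back-of-the-envelope calculation in Python. The function name and the model are assumptions for illustration: equal-sized racks, one whole rack lost, and a fixed number of servers per rack offline for the upgrade at the same moment.

```python
import math


def instances_needed(required_live, racks, batch_per_rack):
    """Total instances so required_live stay up with one rack lost mid-upgrade."""
    surviving_racks = racks - 1
    # Each surviving rack must carry its share of the load
    # on top of the servers it has offline for the upgrade.
    per_rack = math.ceil(required_live / surviving_racks) + batch_per_rack
    return per_rack * racks


two_racks = instances_needed(required_live=10, racks=2, batch_per_rack=2)
three_racks = instances_needed(required_live=10, racks=3, batch_per_rack=2)

print(two_racks, three_racks)
```

Under these assumptions, two racks need 24 instances during the upgrade window and three racks need 21, which is exactly the kind of capacity you only want to pay for while you need it.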

How much money would you have to spend to address that scenario?
Money for machines you don't need until that moment.
Money for smart systems administrators who can execute that strategy.
Money lost if you get it wrong.
Staying late at work trying to figure out some weird error that happened during the process.

Nahhh man I'd rather go home and have a nice dinner with the wife.

Save yourself some dough and some grey hairs and start learning how to use the cloud today :).

Posted on Friday, February 13, 2009 9:33 PM | Azure

Comments on this post: fault domains and upgrade domains


Copyright © Juan Suero