Configuring Disks in Azure for More Performance

In an ongoing effort to get the most performance out of Azure, I’ve run several tests that led me to this next tip, which is both very helpful and extremely easy to implement. Remember, you’re paying for the time you rent the VM. So, the more you get out of each VM, the more you save.

How do I get more disk performance in Azure IaaS?

Basically, my recommendation comes down to this: Think of disks in Azure as network resources, not as local disks. That type of reasoning brought me to test out different NTFS cluster sizes (or “allocation units”).

As a side point – Exercise #4 of the “Windows Azure Platform Training Course” says

leave the default Allocation unit size

It doesn’t give a reason… and the reason is probably because it’s “the default”. However, I recommend that you max out the size to 64K (instead of the default 4K). In my (very expensive) testing, I see that this yields a 20% reduction in write time. And that improvement carried directly over into SQL Server disk performance as well!

As they say, a picture is worth a bunch of words – so here is what my testing revealed.

Explanation of the Performance Test

To explain the above: I mounted 16 disks to an Extra Large Azure IaaS VM. I then created a script that would stripe the disks together and format them as NTFS with a 4K cluster size. After that, the script would write a 1GB file to the disk. The test was run 30 times, and only the time to write the data was measured, not the time to create or delete the file.

Next, the script would reformat the striped volume as NTFS with a 64K cluster size. The 1GB-file-test would then commence.

This process was repeated many times, yielding 300 results for 4K and 300 results for 64K. Alternating between the two cluster sizes also ruled out time of day as a factor.
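For anyone who wants to reproduce the measurement, here is a minimal sketch of the timing part only, written in Python. It is not my original script: the function names, the chunk size, and the file naming are made up for illustration, and the striping and reformatting steps are left out. The point it shows is that the file is created before the clock starts and deleted after it stops.

```python
import os
import time

def time_write(path, total_bytes, chunk_bytes=1024 * 1024):
    """Return the seconds spent writing total_bytes to path.

    File creation happens before the timer starts, so only the
    write itself (plus the flush to disk) is measured.
    """
    chunk = b"\0" * chunk_bytes
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)  # create the file outside the timed region
    try:
        remaining = total_bytes
        start = time.perf_counter()
        while remaining > 0:
            n = min(remaining, chunk_bytes)
            remaining -= os.write(fd, chunk[:n])
        os.fsync(fd)  # make sure the data actually left the OS cache
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed

def run_trials(directory, total_bytes, trials):
    """Run several timed writes; deletion is outside the timed region."""
    results = []
    for i in range(trials):
        path = os.path.join(directory, f"hammer_{i}.bin")  # hypothetical name
        results.append(time_write(path, total_bytes))
        os.remove(path)
    return results
```

Calling `run_trials(r"X:\test", 1024**3, 30)` would correspond to one batch of 30 one-gigabyte writes, where `X:\test` stands in for a folder on the striped volume.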

The average time it took to write the 1GB file with a 4K cluster size was 2.5 seconds. The average time using the 64K cluster size was 2.0 seconds. That is 20% less time per write, or about 25% more throughput. It was also interesting to note that the 4K cluster size suffered from “spikes” more often than the 64K size did. You can download the spreadsheet of data here: AzureDiskHammer_log_20120827.csv
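As a quick check on those averages, the arithmetic can be written out in a few lines of Python. The only inputs are the two average times and the 1GB file size quoted above; the variable names are just for illustration.

```python
file_gb = 1.0   # size of the test file, in GB
t_4k = 2.5      # average write time with 4K clusters, in seconds
t_64k = 2.0     # average write time with 64K clusters, in seconds

# Fraction of write time saved by moving from 4K to 64K clusters.
time_reduction = (t_4k - t_64k) / t_4k

# Throughput in GB/s for each cluster size, and the relative gain.
throughput_4k = file_gb / t_4k
throughput_64k = file_gb / t_64k
throughput_gain = throughput_64k / throughput_4k - 1

print(f"time saved:      {time_reduction:.0%}")
print(f"throughput gain: {throughput_gain:.0%}")
```

The same 0.5 second difference reads as 20% when measured against the write time and as 25% when measured against throughput, so it is worth saying which one you mean.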

Hacking Azure for More Disk Performance

So, this is going to be a quick post. I’ve been doing a lot of work in trying to hit the limits of Azure – really getting a feel for the beast. As I’ve already posted about, one of the biggest limitations that I’ve run into has been the disk performance.

We’ve been in close communication with Microsoft on a lot of things pertaining to Azure, IaaS, SQL and the like – none of which I can talk about here. But, what I can talk about is a side project that I did to … guesstimate … what the performance of IaaS can be in the future.

Why the “Hack”?

Here are the “truths” that you likely already know about IaaS / disk performance in Azure.

  1. Azure IaaS uses a single storage account to host your VHD.
  2. Azure storage accounts are limited to 5,000 transactions per second.

Like I’ve mentioned before, that comes out to about two or three 15k spindle hard drives striped together. Here are a couple of things you likely don’t know (or haven’t thought of in the context of this conversation).

  1. Azure has had the ability to persist hard drives for over 2 years now.
  2. Unlike IaaS (which is limited to 1 storage account), PaaS can mount 16 different storage accounts on one box!

The above statements refer to the CloudDrive class, which you have to mount in code. Because it relies on “RoleEnvironment.IsAvailable” being true, this feature only works in PaaS. I consider that bad code, but I understand why this happened.

Therefore, the “hack” is to make a single worker role in PaaS that mounts multiple storage accounts into a single striped volume. Then I remote into that machine, install SQL Server and run my previous test to see what the performance of IaaS could be one day (if Microsoft decided that multiple storage accounts were the way to get more IOPS in IaaS).
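To see why striping across storage accounts should help, here is a small Python sketch of how a striped volume spreads a logical offset across its member disks. This is a generic illustration of striping, not the code from the worker role. The 64 KB stripe unit is an assumption, and the function and constant names are made up. The last line multiplies out the per-account limit quoted above to get a theoretical ceiling, which assumes the load spreads perfectly evenly and that nothing else becomes the bottleneck.

```python
ACCOUNTS = 16            # storage accounts (one drive each) in the stripe
PER_ACCOUNT_TPS = 5000   # per-account transaction limit quoted above

def locate(offset, disks=ACCOUNTS, stripe_bytes=64 * 1024):
    """Map a logical byte offset on the striped volume to
    (disk_index, byte_offset_on_that_disk).

    stripe_bytes is an assumed stripe unit, not a measured value.
    """
    stripe_no, within = divmod(offset, stripe_bytes)
    row, disk = divmod(stripe_no, disks)
    return disk, row * stripe_bytes + within

# Consecutive stripe units land on consecutive disks, so a large
# sequential write is spread across every storage account.
first_units = [locate(i * 64 * 1024)[0] for i in range(ACCOUNTS)]

# Upper bound only: assumes perfectly even distribution across accounts.
theoretical_ceiling_tps = ACCOUNTS * PER_ACCOUNT_TPS
```

With these assumptions, a write larger than 1 MB (16 stripe units of 64 KB) touches all 16 accounts at least once, which is what lets the per-account limits add up instead of capping the whole volume.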

Disclaimer and Code

I hope that I don’t have to tell you that this is obviously not recommended for a production solution. The whole point of the exercise was to see how far Azure would stretch, and to give an idea of the (potential) future. My goals were happily met. I achieved more than double the performance of my previous test. And I’m sure with the right setup, I could have gotten even more.

Here is the code (that will automatically create a multi-account striped drive) in case you want to do your own performance tests with your app in the cloud:

You’ll need to take the following steps for it to work:

  1. Allow Remote Desktop connections (so you can log in and test your app).
  2. Edit the App.Config with real storage account credentials.