Configuring Disks in Azure for More Performance

In an ongoing effort to get the most performance out of Azure, I’ve run several test that lead me to this next tip that is both very helpful and extremely easy implement. Remember, you’re paying for the time you rent the VM. So, the more you get out of each VM, the more you save.

How do I get more disk performance in Azure IaaS?

Basically, my recommendation comes down to this: Think of disks in Azure as a network resources, and not as a disk. That type of reasoning brought me to test out different NTFS cluster sizes (or “allocation units”).

As a side point – Exercise #4 of the “Windows Azure Platform Training Course” says

leave the default Allocation unit size

It doesn’t give a reason… and the reason is probably because it’s “the default”. However, I recommend that you max out the size to 64K (instead of the default 4K). In my (very expensive) testing, I see that this yields a 20% increase in disk performance. And that increase followed directly in to SQL Server disk performance as well!

As they say, a picture is worth a bunch of words – so here is what my testing revealed.

Explanation of the Performance Test

To explain the above – I mounted 16 disks to an Extra Large Azure IaaS VM. I then created a script that would stripe the disks together and format them NTFS with a 4K cluster size. After that, the script would write a 1GB file to the disk. The test was run 30 times, and the measurement was only measuring the time to write the data – not the time to create or delete the file.

Next, the script would reformat the striped volume as NTFS with a 64K cluster size. The 1GB-file-test would then commence.

This process was repeated many times – yielding 300 results for 4K and 300 results for 64K. Also, the method of going back and forth ruled out any thought of a particular time of day being the issue.

The average time it took to write the 1GB file with a 4K cluster size was 2.5 seconds. The average time using the 64K cluster size was 2.0 seconds. This is a 20% increase in disk performance. It was also interesting to note that the 4K cluster size suffered from “spikes” more often than the 64K size did. You can download the spreadsheet of data here: AzureDiskHammer_log_20120827.csv

How To Setup Peer-to-Peer Replication in Azure IaaS (SQL Server 2012)

Recently at my company we have evaluated several options for replicating data in-between Azure data-centers, as well as on-premises. One of the most compelling choices (due to simplicity, and a proven track record) was just using SQL Server’s built-in replication features.

This post will answer the following questions:

  • How do I setup Peer-to-Peer replication in SQL Server 2012?
  • What extra steps do I need to take in a Windows Azure IaaS environment?
  • What is the latency of replication from two different Azure data-centers.

How do I setup Peer-to-Peer replication in SQL Server 2012?

While I’m not a guru on SQL Server, I have been using it pretty heavily for years in distributed, “n-tier” environments. I’ve used synchronous and asynchronous replication, database mirroring, and fail-over partnering with a witness server. Of all of these different options, I have to say that Peer-to-Peer replication is my favorite (and I think the easiest) thing to setup. I also believe it gives tremendous value as you get multi-master, high-availability and scaled-up read performance.

The basic steps for setting up p2p replication in SQL Server are as follows. As a side note, this can all be done in 2008 as well, but 2012 makes it a little easier by removing one step (of right-clicking, going to properties and “enabling” p2p). So here are the steps:

  1. Enable / configure your server to act as a “distributor”.
    1. This means that the server will be able to “push” updates to other members of the replication group.
  2. Create a “publication”.
    1. This basically means that you pick the database and tables to replicate.
  3. Configure the “topology”.
    1. AKA – add other servers to the group.

While it may seem like an over-simplification, those really are the steps. Here they are in more detail with screen-shots. For the sake of this article, we will assume that we already created a database called “AzureReplicationDB”, and that there is a table named “Customers”

1. Enable and Configure Your Distributor

The first step is to enable distribution on your SQL Server. You’ll should do this step on all of the servers that will participate in the replication topology. Simply right-click on the “Replication” node in “Object Explorer” and choose the “Configure Distribution…” option.

SQL Server 2012 - Configure Distribution

You can just keep clicking “next” for all of the default options – especially “‘XyzServer’ will act as its own Distributor”.

2. Create a Publication

Next, expand the “Replication” node and right-click on “Local Publications” and choose the “New Publication…” option.

SQL Server 2012 - New Publication

You can mostly click “next” through this dialog as well, except that you need to make sure to select “Peer-to-Peer publication”, choose your tables that you want to replicate and configure the security (login/password). For simplicity’s sake, I created a single user on all of the servers that had sysadmin rights :) – never do that in a production environment!

SQL Server 2012 - Select Peer-to-Peer publication

SQL Server 2012 - Choose what tables to replicate

Finally, name your publication. For this example, I named it “Demo_Publication”. The name should be the same across all of the servers in the replication topology.

3. Configure the Replication Topology

The final step here is to make all of the servers aware of each other. The nice thing is that you can do this from one location, and all of the servers will be setup. A very important note here is that all of the servers need to already be created, configured for distribution and (should) have the blank database (with the same schema) across the board. This will make your world a lot easier.

So, right-click on the publication and choose the “Configure Peer-to-Peer Topology…” option. Then follow the wizard to 1) choose the publication and 2) add all of the peer nodes. Again, you will have to configure the security account that they all run under.

SQL Server 2012 - Configure Peer-to-Peer Topology

And that’s it! After about 30 seconds, you should be able to insert a row into one of the servers, and then go SELECT * FROM dbo.Customers in the other server to see the results. You can now insert, update, delete (and oddly enough, modify the schema of the Customers table) in any of the database servers that are connected… and all of the transactions will be replicated to each of the other nodes.

What extra steps do I need to take in a Windows Azure IaaS environment?

Because Windows Azure IaaS is just what its name suggests – an “Infrustructure as a Service” – there isn’t a lot of special work that you need to do to get SQL Server replication working between Azure data-centers, or on-prem. Actually, the only thing you need to do is punch a hole in the firewall – port 1433 by default. This shouldn’t come as much of a surprise to you, because it’s what you would have to do anyway in your own data-center.

Something else that I had to do, which may have nothing to do with Windows Azure, but may simply be a SQL Replication thing, is to edit the “hosts” file in Windows so that I can refer to the other SQL Server by name, instead of by it’s DNS name. What I mean is, SQL Server seems to want to connect to “Other-SQL-Box” … and not “Other-SQL-Box.cloudapp.net”.

What is the latency of replication from two different Azure data-centers.

This question is a tricky one. Largely because I am not a SQL DBA, and therefore I don’t know all of the tricks of gaining performance out of SQL Server Peer-to-Peer replication. I did, however, make a few modifications to the Distributor Properties by changing the transaction “CommitBatchSize” to 1,000 (instead of 100), “MaxBcpThreads” to 16 (instead of 1) and the “PollingInterval” to 1 second (instead of 5).

I don’t know if increasing the threads helped at all, because I believe transactional replication can’t use multiple threads. But I do know that pumping up the batch size did increase throughput significantly.

All this being said – I was able to insert 1,000 records in a small table on the Azure East data-center, and they were replicated to the Azure West data-center in 1 second. If I inserted 10,000 records, it took about 10 seconds. This is pretty promising when you consider all that you’re getting here.

As a side point, I may have been able to get more performance, and you might too, if you follow some of the guidelines that I mentioned in this previous article about SQL IOps in Azure IaaS: http://tk.azurewebsites.net/2012/06/18/windows-azure-iaas-performance-sql-and-iops/