Recently at my company we have evaluated several options for replicating data in-between Azure data-centers, as well as on-premises. One of the most compelling choices (due to simplicity, and a proven track record) was just using SQL Server’s built-in replication features.
This post will answer the following questions:
- How do I setup Peer-to-Peer replication in SQL Server 2012?
- What extra steps do I need to take in a Windows Azure IaaS environment?
- What is the latency of replication from two different Azure data-centers.
How do I setup Peer-to-Peer replication in SQL Server 2012?
While I’m not a guru on SQL Server, I have been using it pretty heavily for years in distributed, “n-tier” environments. I’ve used synchronous and asynchronous replication, database mirroring, and fail-over partnering with a witness server. Of all of these different options, I have to say that Peer-to-Peer replication is my favorite (and I think the easiest) thing to setup. I also believe it gives tremendous value as you get multi-master, high-availability and scaled-up read performance.
The basic steps for setting up p2p replication in SQL Server are as follows. As a side note, this can all be done in 2008 as well, but 2012 makes it a little easier by removing one step (of right-clicking, going to properties and “enabling” p2p). So here are the steps:
- Enable / configure your server to act as a “distributor”.
- This means that the server will be able to “push” updates to other members of the replication group.
- Create a “publication”.
- This basically means that you pick the database and tables to replicate.
- Configure the “topology”.
- AKA – add other servers to the group.
While it may seem like an over-simplification, those really are the steps. Here they are in more detail with screen-shots. For the sake of this article, we will assume that we already created a database called “AzureReplicationDB”, and that there is a table named “Customers”
1. Enable and Configure Your Distributor
The first step is to enable distribution on your SQL Server. You’ll should do this step on all of the servers that will participate in the replication topology. Simply right-click on the “Replication” node in “Object Explorer” and choose the “Configure Distribution…” option.
You can just keep clicking “next” for all of the default options – especially “‘XyzServer’ will act as its own Distributor”.
2. Create a Publication
Next, expand the “Replication” node and right-click on “Local Publications” and choose the “New Publication…” option.
You can mostly click “next” through this dialog as well, except that you need to make sure to select “Peer-to-Peer publication”, choose your tables that you want to replicate and configure the security (login/password). For simplicity’s sake, I created a single user on all of the servers that had sysadmin rights :) – never do that in a production environment!
Finally, name your publication. For this example, I named it “Demo_Publication”. The name should be the same across all of the servers in the replication topology.
3. Configure the Replication Topology
The final step here is to make all of the servers aware of each other. The nice thing is that you can do this from one location, and all of the servers will be setup. A very important note here is that all of the servers need to already be created, configured for distribution and (should) have the blank database (with the same schema) across the board. This will make your world a lot easier.
So, right-click on the publication and choose the “Configure Peer-to-Peer Topology…” option. Then follow the wizard to 1) choose the publication and 2) add all of the peer nodes. Again, you will have to configure the security account that they all run under.
And that’s it! After about 30 seconds, you should be able to insert a row into one of the servers, and then go SELECT * FROM dbo.Customers in the other server to see the results. You can now insert, update, delete (and oddly enough, modify the schema of the Customers table) in any of the database servers that are connected… and all of the transactions will be replicated to each of the other nodes.
What extra steps do I need to take in a Windows Azure IaaS environment?
Because Windows Azure IaaS is just what its name suggests – an “Infrustructure as a Service” – there isn’t a lot of special work that you need to do to get SQL Server replication working between Azure data-centers, or on-prem. Actually, the only thing you need to do is punch a hole in the firewall – port 1433 by default. This shouldn’t come as much of a surprise to you, because it’s what you would have to do anyway in your own data-center.
Something else that I had to do, which may have nothing to do with Windows Azure, but may simply be a SQL Replication thing, is to edit the “hosts” file in Windows so that I can refer to the other SQL Server by name, instead of by it’s DNS name. What I mean is, SQL Server seems to want to connect to “Other-SQL-Box” … and not “Other-SQL-Box.cloudapp.net”.
What is the latency of replication from two different Azure data-centers.
This question is a tricky one. Largely because I am not a SQL DBA, and therefore I don’t know all of the tricks of gaining performance out of SQL Server Peer-to-Peer replication. I did, however, make a few modifications to the Distributor Properties by changing the transaction “CommitBatchSize” to 1,000 (instead of 100), “MaxBcpThreads” to 16 (instead of 1) and the “PollingInterval” to 1 second (instead of 5).
I don’t know if increasing the threads helped at all, because I believe transactional replication can’t use multiple threads. But I do know that pumping up the batch size did increase throughput significantly.
All this being said – I was able to insert 1,000 records in a small table on the Azure East data-center, and they were replicated to the Azure West data-center in 1 second. If I inserted 10,000 records, it took about 10 seconds. This is pretty promising when you consider all that you’re getting here.
As a side point, I may have been able to get more performance, and you might too, if you follow some of the guidelines that I mentioned in this previous article about SQL IOps in Azure IaaS: http://tk.azurewebsites.net/2012/06/18/windows-azure-iaas-performance-sql-and-iops/