TruckFuelNet operates 24 hours a day, with transactions occurring continuously throughout the day and night. The risk to the business from infrastructure or application downtime is therefore high. To contain costs while still meeting acceptable recovery objectives, a warm fail-over configuration was chosen.
Warm fail-over – A manually triggered procedure: the application support team assesses any downtime and, in conjunction with the business, decides whether to initiate the fail-over process.
Recovery Time Objective – less than 1 hour of downtime
Recovery Point Objective – less than 5 minutes of transaction data loss
A cloud-based disaster recovery solution makes the most sense from a cost perspective, as the resources can be scaled up on demand and are hosted in an independent data center. The production system for TruckFuelNet is hosted at Internet Solutions.
The server-side TFN system is composed of a SQL Server 2012 database, IIS-hosted web applications and a Windows service component.
Foremost in the DR plan for TFN is ensuring minimum data loss. We chose to ship database transaction logs to Azure Blob storage, with a log file created and shipped every minute, which keeps potential data loss within the recovery point objective. Full weekly backups are also made to Azure Blob storage.
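The relationship between the shipping interval and the recovery point objective can be sketched as simple arithmetic. The following is a minimal illustration, not part of the TFN tooling: the function names and the upload delay figure are assumptions, and it takes the worst case to be a failure just before the next log file finishes uploading.

```python
from datetime import timedelta

# Values taken from the text: logs ship every minute, RPO is under 5 minutes.
LOG_SHIPPING_INTERVAL = timedelta(minutes=1)
RPO_TARGET = timedelta(minutes=5)


def worst_case_data_loss(shipping_interval, upload_delay):
    """Upper bound on lost transaction time.

    If production fails just before a log file finishes uploading, everything
    since the previous shipped log is lost: one full interval plus the time
    the upload would have taken (upload_delay is an assumed figure).
    """
    return shipping_interval + upload_delay


def meets_rpo(shipping_interval, upload_delay, target=RPO_TARGET):
    """True when the worst-case loss is strictly below the RPO target."""
    return worst_case_data_loss(shipping_interval, upload_delay) < target


if __name__ == "__main__":
    assumed_delay = timedelta(seconds=30)  # hypothetical upload time
    print(worst_case_data_loss(LOG_SHIPPING_INTERVAL, assumed_delay))
    print(meets_rpo(LOG_SHIPPING_INTERVAL, assumed_delay))
```

Under these assumptions, a one-minute interval leaves several minutes of headroom for slow uploads before the objective is at risk.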
The warm infrastructure hosted in Azure consists of an Azure VM scaled back to the minimum level that still allows us to apply the database log files and deploy updated code each time a release is made to the live system.
We continuously apply transaction logs to the database on the Azure fail-over VM; a check for new log files is done every minute.
The DevOps procedures in place deploy any new code to the fail-over VM as well as to the live environment on each live release.
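The polling step might look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual TFN job: the file naming scheme (a sortable timestamp in a `.trn` file name), and the `list_logs` and `apply_log` callables that stand in for the Blob storage listing and the database restore, are all hypothetical.

```python
import time


def pending_logs(available, last_applied):
    """Return log files newer than last_applied, oldest first.

    Assumes names such as 'TFN_20240101_120000.trn' whose embedded timestamp
    makes alphabetical order match chronological order.
    """
    logs = sorted(name for name in available if name.endswith(".trn"))
    if last_applied is None:
        return logs
    return [name for name in logs if name > last_applied]


def poll_and_apply(list_logs, apply_log, last_applied=None,
                   interval_seconds=60, max_cycles=None):
    """Check for new logs every interval_seconds and apply them in order.

    list_logs and apply_log are hypothetical callables supplied by the caller.
    max_cycles limits the loop for testing; None means run indefinitely.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for name in pending_logs(list_logs(), last_applied):
            apply_log(name)          # logs must be applied strictly in sequence
            last_applied = name
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval_seconds)
    return last_applied
```

Tracking the last applied file name keeps each cycle idempotent: a log that has already been restored is never applied twice, and a late-arriving file is picked up on the next check.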
Once the fail-over process has been initiated, the first step is to scale the VM up to production specification level, so that applying the final database logs can be done as quickly as possible and the application is ready to take on production loads.
Next, we confirm that the fail-over database has applied all available logs, after which the database can be brought online. The final step of the fail-over is updating the DNS entries for the application. This is achieved by manually updating the records in Azure DNS Zones to point to the VM's IP address.
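The order of the fail-over steps can be summarised as a small runbook sketch. This is a hedged illustration only: every callable passed in is a hypothetical placeholder for a manual or scripted action, and in the process described here the DNS update is performed by hand, so `update_dns` could simply prompt the operator.

```python
def run_failover(scale_up_vm, pending_log_count, bring_database_online,
                 update_dns, vm_ip):
    """Run the fail-over steps in order and return the steps completed.

    All callables are hypothetical stand-ins supplied by the caller.
    """
    completed = []

    # Step 1: scale the VM to production specification first, so the final
    # logs apply quickly and the application can take production load.
    scale_up_vm()
    completed.append("scale_up_vm")

    # Step 2: do not bring the database online until every shipped log
    # has been applied, otherwise the remaining logs can no longer be restored.
    remaining = pending_log_count()
    if remaining != 0:
        raise RuntimeError(f"{remaining} log file(s) still to apply")
    completed.append("logs_confirmed")

    # Step 3: bring the database online (on SQL Server this could be, for
    # example, a RESTORE DATABASE ... WITH RECOVERY statement).
    bring_database_online()
    completed.append("database_online")

    # Step 4: point the application's DNS records at the fail-over VM.
    update_dns(vm_ip)
    completed.append("dns_updated")

    return completed
```

Raising an error when logs are still pending keeps the sequence safe to rerun: the operator applies the outstanding logs and starts the procedure again without having brought a stale database online.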