Low-cost automatic failover and disaster recovery for Microsoft file services without expensive SAN replication technology.

Note before you start: If you plan to implement this solution in a production environment, please read “Information about Microsoft support policy for a DFS-R and DFS-N deployment scenario” (http://support.microsoft.com/kb/2533009) and make sure you understand the Microsoft support policy for this solution.

The objective of this post is to provide a low-cost, high-availability disaster recovery solution for Microsoft file services. Generally, hosting the file service on a Microsoft failover cluster is sufficient to provide high availability of user data. However, for an organization whose data availability is business critical, or which uses a VDI solution where user profiles and data are stored on file servers, the file service should remain available even if the cluster itself is offline due to a disaster in the datacenter.

The following design keeps the file service available through the disaster recovery site without using any expensive SAN storage replication technique.

In this scenario, we need a two-node Windows failover cluster at the production site and another at the disaster recovery (DR) site to host the file service. Each cluster is connected to the local SAN storage within its own site.

FileCluster

A shared folder configured on a Client Access Point will be used as a target folder for the DFS namespace. Since the Client Access Point can withstand a cluster node failure, the shared folder remains available even when one of the cluster nodes is offline for maintenance.

DFS01

We need to take this scenario a stage further, so that the service remains available even when the whole production datacenter is down. A multi-site cluster (geo-cluster) using SAN replication would be very expensive in terms of license cost and complex to implement. By using the built-in replication feature (DFS-R) of the Windows Server operating system, this requirement can be met without any additional cost.

Step 1: Configure a domain-based DFS namespace and add both the production and the DR servers as namespace servers.

DFS02

Step 2: Create a shared folder on the production Client Access Point and link it to the DFS namespace created above.

Step 3: Create a shared folder on the DR Client Access Point and add it as a secondary folder target alongside the production shared folder.

DFS04

Step 4: Run through the “New Replicated Folders Wizard” to configure the shared folder at the DR site as a replica in a full mesh topology.

DFS03

Step 5: Set the target priority by configuring the referral order, and then disable the DR target share.

If both targets are enabled, users may end up writing to different locations despite the target priority. The DFS Replication service then encounters conflicting data and sharing violations.

DFS05

Disabling one of the folder targets leaves only one target enabled in the namespace, so users always reach that folder target. The other folder target will not have any SMB sessions established by end users.

DFS06
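The disable action in Step 5 can also be scripted instead of being done by hand in DFS Management. The following is a minimal Python sketch that only assembles the dfsutil.exe command line. It assumes that `dfsutil.exe property state` accepts `offline` in the same position as the `online` form used later in this post; the function names and the `dry_run` flag are illustrative, not part of any Microsoft tool.

```python
import subprocess

def build_state_command(namespace_folder, target_share, state):
    """Return the dfsutil.exe argument list that sets a folder target's referral state."""
    if state not in ("online", "offline"):
        raise ValueError("state must be 'online' or 'offline'")
    # Assumption: 'offline' is accepted in the same position as 'online'.
    return ["dfsutil.exe", "property", "state", state,
            namespace_folder, target_share]

def set_target_state(namespace_folder, target_share, state, dry_run=True):
    """Build the command; execute it only when dry_run is False."""
    cmd = build_state_command(namespace_folder, target_share, state)
    if not dry_run:
        # Must run on a Windows machine that has the DFS management tools.
        subprocess.run(cmd, check=True)
    return cmd

# Dry run: show the command that would disable the DR target from this post.
print(set_target_state(r"\\acme.com\UserData\Data",
                       r"\\ACME-CAP01-DR\Data", "offline"))
```

With `dry_run=True` nothing is changed in the namespace, which makes it easy to review the exact arguments before running them for real.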

Data availability during disaster:

Under normal operating conditions, users are always connected to the target share at the production site, and the data is replicated to the shared folder at the DR site by DFS Replication. If the production datacenter is down, the target share at the DR site needs to be enabled; from then on, whenever users access the folder in the DFS namespace, they are redirected to the active target share at the DR site.

Enabling the standby target share can be automated with the File Services Management Pack for Operations Manager, giving a smooth, automatic failover of the namespace folder. The management pack monitors the status of the production target share and, through a remediation task, enables the standby target share automatically.

The following command sets the folder target's referral status to “Enabled”:

dfsutil.exe property state online "<UNC of DFS Namespace>" "<UNC of Shared folder>"

Example: dfsutil.exe property state online "\\acme.com\UserData\Data" "\\ACME-CAP01-DR\Data"
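For readers without Operations Manager, the remediation logic described above could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the management pack's actual implementation: the health probe is simply "can this machine see the production share", the production share name used in the test call is hypothetical, and the `exists` and `run` parameters exist only so the logic can be exercised without touching a real namespace.

```python
import os
import subprocess

def production_is_reachable(prod_share, exists=os.path.exists):
    """Crude health probe: can this machine see the production share at all?"""
    try:
        return bool(exists(prod_share))
    except OSError:
        return False

def remediate(namespace_folder, prod_share, dr_share,
              exists=os.path.exists, run=subprocess.run):
    """Enable the DR folder target only if the production share is unreachable.

    Returns the dfsutil.exe command that was run, or None if nothing was done.
    """
    if production_is_reachable(prod_share, exists):
        return None
    cmd = ["dfsutil.exe", "property", "state", "online",
           namespace_folder, dr_share]
    run(cmd, check=True)  # raises CalledProcessError if dfsutil fails
    return cmd

# Simulated outage: the probe reports "unreachable" and the command is only recorded.
# ACME-CAP01-PROD is a hypothetical name for the production Client Access Point.
issued = []
remediate(r"\\acme.com\UserData\Data", r"\\ACME-CAP01-PROD\Data",
          r"\\ACME-CAP01-DR\Data",
          exists=lambda path: False,
          run=lambda cmd, check: issued.append(cmd))
print(issued)
```

A single failed probe does not necessarily mean the datacenter is down, so a real remediation task would probably require several consecutive failures before switching targets.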

Converting the unsupported scenario to a supported one:

According to the Microsoft support policy for DFS-R and DFS-N deployments mentioned above, configuring one namespace folder with multiple folder targets is not supported, even if only one folder target is enabled at a time. In that case, delete the secondary folder target (do not delete the replication configuration) and, when the DR target is needed, use the dfsutil.exe target add command to create the link to the secondary folder target.

The following command adds a folder target to the namespace:

dfsutil.exe target add "<UNC of DFS Namespace>" "<UNC of Shared folder>"

Example: dfsutil.exe target add "\\acme.com\UserData\Data" "\\ACME-CAP01-DR\Data"
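In the supported variant, the automated action at failover is `target add` rather than `property state online`. Below is a small Python sketch of that step, with a basic sanity check that both arguments look like UNC paths. The function names are illustrative, and the check is an added safeguard rather than something dfsutil.exe itself requires.

```python
import subprocess

def build_target_add_command(namespace_folder, dr_share):
    """Return the dfsutil.exe argument list that links dr_share to the namespace folder."""
    for path in (namespace_folder, dr_share):
        if not path.startswith("\\\\"):
            raise ValueError(f"not a UNC path: {path!r}")
    return ["dfsutil.exe", "target", "add", namespace_folder, dr_share]

def add_dr_target(namespace_folder, dr_share, run=subprocess.run):
    """Run 'dfsutil.exe target add' and return the command that was executed."""
    cmd = build_target_add_command(namespace_folder, dr_share)
    run(cmd, check=True)
    return cmd

# Show the command for the example paths used in this post, without running it.
print(build_target_add_command(r"\\acme.com\UserData\Data",
                               r"\\ACME-CAP01-DR\Data"))
```

Passing the arguments as a list means no shell quoting is involved, which avoids the problems that typographic quotes cause when commands are copied from a web page.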

As explained earlier, you can use any system monitoring solution to automate the above process by configuring an auto-remediation task.