Overview
XReplicator v1.3.0 supports disaster recovery (DR) for Azure workloads that recover into Azure. The web UI orchestrates the DR flow: staging disks are kept in sync with the source disks, precheck verifies readiness, and a failover blueprint defines how the recovered Azure VM is created in the target landing zone.
In observed Azure-to-Azure failover drills, the recovered server started in under 2 minutes when the staging disks were already healthy and synced before failover. Actual RTO depends on Azure control-plane latency, VM size, OS boot time, networking, and application startup.
When to Use This
Use this flow when:
- The protected workload runs on Azure.
- The DR landing zone is also Azure.
- You want XReplicator to manage DR readiness and failover orchestration.
- You can keep staging disks attached to the DR-side environment and updated by XReplicator.
For AWS or GCP, use the cloud-specific cold DR runbooks until provider-level orchestration is added.
Architecture
The v1.3.0 Azure-to-Azure flow uses these components:
- Source VM agents back up OS and data disks.
- Backup server stores snapshots and coordinates DR state.
- DR manager keeps target staging disks synced to the desired snapshot.
- Web UI manages sources, cloud targets, blueprints, precheck, and failover.
- Azure target config stores the target subscription, region, resource group, network, and service principal details.
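To make the last component concrete, the target details could be modeled as a small record. This is only a sketch: the class name `AzureTargetConfig`, its field names, and the `missing_fields` helper are hypothetical and do not reflect XReplicator's actual schema. It shows the fields named above and a simple completeness check that could run before a target is saved.

```python
from dataclasses import dataclass, field, fields


@dataclass(frozen=True)
class AzureTargetConfig:
    """Hypothetical shape of an Azure cloud target (not XReplicator's real schema)."""
    tenant_id: str
    client_id: str
    client_secret: str = field(repr=False)  # keep the secret out of logs and reprs
    subscription_id: str = ""
    region: str = ""
    resource_group: str = ""
    vnet: str = ""
    subnet: str = ""


def missing_fields(config: AzureTargetConfig) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(config) if not getattr(config, f.name).strip()]
```

A target with an empty subnet would be reported as `["subnet"]`, which gives the operator a specific item to fix rather than a generic "unhealthy" status.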
Setup Flow
- Install and start the XReplicator backup server in the Azure DR landing zone.
- Install agents on the Azure source VMs.
- Configure backup schedules and confirm snapshots are completing.
- Open the web UI and go to DR.
- Enable DR for each source disk that must be protected.
- Wait for each source to become healthy.
- Add an Azure cloud target with tenant, client, secret, subscription, and region.
- Create a DR blueprint for the workload.
- Select the source OS and data disks.
- Map the target resource group, VNet, subnet, VM size, and disk strategy.
- Run precheck.
- Trigger failover only when precheck is clean.
Readiness Gate
Precheck should pass only when all of the following are true:
- DR is enabled for every selected source disk.
- Source status is healthy.
- The applied snapshot matches the desired snapshot.
- The target Azure config is healthy.
- Blueprint rows include region, resource group, VNet, subnet, VM size, OS disk strategy, and data disk strategy.
Do not trigger failover while a source is pending, backfilling, verifying, or degraded.
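The gate above can be expressed as a single function that returns every blocking reason. This is a minimal sketch, not XReplicator's precheck implementation: the `SourceDisk` record, the status strings, and the blueprint key names are assumptions chosen to mirror the conditions listed in this section.

```python
from dataclasses import dataclass

# Hypothetical key names for the blueprint row fields listed above.
REQUIRED_BLUEPRINT_FIELDS = (
    "region", "resource_group", "vnet", "subnet",
    "vm_size", "os_disk_strategy", "data_disk_strategy",
)


@dataclass
class SourceDisk:
    name: str
    dr_enabled: bool
    status: str            # assumed values: healthy, pending, backfilling, verifying, degraded
    applied_snapshot: str
    desired_snapshot: str


def precheck(disks, target_healthy, blueprint_row):
    """Return a list of blocking reasons; an empty list means the gate passes."""
    problems = []
    for d in disks:
        if not d.dr_enabled:
            problems.append(f"{d.name}: DR not enabled")
        if d.status != "healthy":
            problems.append(f"{d.name}: status is {d.status}")
        if d.applied_snapshot != d.desired_snapshot:
            problems.append(f"{d.name}: applied snapshot does not match desired snapshot")
    if not target_healthy:
        problems.append("target Azure config is not healthy")
    for key in REQUIRED_BLUEPRINT_FIELDS:
        if not blueprint_row.get(key):
            problems.append(f"blueprint: missing {key}")
    return problems
```

Collecting all reasons instead of stopping at the first failure lets an operator fix everything in one pass before re-running precheck.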
Failover Behavior
During failover, XReplicator uses the blueprint to:
- Resolve the protected OS and data disks.
- Create Azure snapshots or managed disks based on the selected strategy.
- Create a network interface in the configured VNet/subnet.
- Create the recovered Azure VM with the selected size.
- Attach recovered data disks.
- Record operation history and row-level execution logs.
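The ordering of these steps can be illustrated with a small planner that turns a blueprint into a step list. It is a sketch under stated assumptions: the function name `plan_failover`, the blueprint keys, and the action labels are hypothetical, and no Azure API is called. It only shows the sequence described above, with data disks attached after the VM exists.

```python
def plan_failover(blueprint):
    """Build an ordered list of (action, detail) steps from a blueprint dict."""
    data_disks = blueprint.get("data_disks", [])
    all_disks = [blueprint["os_disk"], *data_disks]

    steps = [("resolve_disks", ", ".join(all_disks))]
    # Each disk is materialized according to its configured strategy.
    steps.append(("materialize_os_disk",
                  f'{blueprint["os_disk"]} via {blueprint["os_disk_strategy"]}'))
    for disk in data_disks:
        steps.append(("materialize_data_disk",
                      f'{disk} via {blueprint["data_disk_strategy"]}'))
    steps.append(("create_nic", f'{blueprint["vnet"]}/{blueprint["subnet"]}'))
    steps.append(("create_vm", blueprint["vm_size"]))
    # Data disks are attached only once the VM has been created.
    for disk in data_disks:
        steps.append(("attach_data_disk", disk))
    steps.append(("record_history", "operation history and row-level logs"))
    return steps
```

A plan like this can also be shown to the operator as a dry run before failover is triggered, so the expected resources are reviewed in advance.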
RTO Notes
The under-2-minute RTO observation applies to an Azure-to-Azure drill where DR staging disks were already synced before failover. For production planning, measure your own RTO under realistic conditions and account for:
- VM size and disk count.
- OS boot time.
- Azure API response time in the selected region.
- Network security rules and route propagation.
- Application startup and dependency checks.
- Manual approval or DNS/load-balancer cutover steps.
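One way to see where drill time goes is to record a timestamp at each milestone and compute per-phase durations. The sketch below assumes ISO-8601 timestamps collected by hand or from logs; the function name and milestone labels are hypothetical, and any values passed in are the caller's own measurements.

```python
from datetime import datetime


def phase_durations(marks):
    """marks: ordered list of (label, ISO-8601 timestamp).

    Returns (durations, total_seconds), where durations maps each milestone
    after the first to the seconds elapsed since the previous milestone.
    """
    times = [(label, datetime.fromisoformat(ts)) for label, ts in marks]
    durations = {}
    for (_, start), (label, end) in zip(times, times[1:]):
        durations[label] = (end - start).total_seconds()
    total = (times[-1][1] - times[0][1]).total_seconds()
    return durations, total
```

For example, milestones such as `failover_triggered`, `vm_started`, `app_ready`, and `dns_cutover` separate the VM start time from application startup and cutover, so the end-to-end RTO is not confused with the VM start time alone.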
Operational Checklist
- Run at least one failover drill per protected workload.
- Record the VM boot time and application-ready time separately.
- Keep Azure service principal credentials rotated and scoped to the DR resource groups they manage.
- Confirm staging disks are not mounted by other workloads.
- Keep a written cutover plan for DNS, load balancers, and application-specific validation.
- Treat failback as a separate controlled reverse-replication operation.
Current Scope
Supported in v1.3.0:
- Azure-to-Azure DR setup.
- DR source enablement and staging sync.
- Strict precheck.
- Azure target configuration.
- Blueprint-driven Azure failover.
- Operation history and row-level logs.
Not yet an orchestrated path in v1.3.0:
- AWS/GCP failover orchestration.
- Automated failback.
- Application-specific quiesce hooks.
Use the manual cloud runbooks for provider flows that are not yet orchestrated.