Daily Operations
Quick Health Check
# Check all service statuses
sudo systemctl status backup-*
# Check disk space
df -h /var/lib/backup/repo
# View recent snapshots
xreplicator snapshots --server localhost:50051

Monitor these daily:
- Service status for all components
- Logs for errors or warnings
- Disk space on the backup server
- Confirmation that backups are completing
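
The daily checks above could be gathered into one script. The following is a minimal sketch, not something shipped with the product. It assumes the services log to the systemd journal, and the function names (`disk_usage_pct`, `failed_backup_units`, `daily_health_check`) are illustrative. The `backup-*` unit pattern, the repository path, and the `xreplicator snapshots` call are taken from the commands above.

```shell
# Repository path from this guide; override via the environment if yours differs.
REPO_PATH="${REPO_PATH:-/var/lib/backup/repo}"

# Print the usage percentage (digits only) of the filesystem that holds a path.
disk_usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Print the number of failed systemd units matching backup-*.
failed_backup_units() {
    systemctl list-units 'backup-*' --state=failed --no-legend --plain | wc -l
}

# Run all four daily checks and print a short report.
daily_health_check() {
    echo "Failed backup units: $(failed_backup_units)"
    echo "Repository disk usage: $(disk_usage_pct "$REPO_PATH")%"
    # Warnings and errors from the last 24 hours (assumes journald logging).
    journalctl -u 'backup-*' -p warning --since "24 hours ago" --no-pager | tail -n 20
    # The snapshot listing confirms that backups are completing.
    xreplicator snapshots --server localhost:50051
}
```

Source the file and call `daily_health_check`, or invoke it from a scheduled job and mail the output to whoever is on call.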
Backup Strategies
Full Backup Frequency
| Frequency | Recommended For |
|---|---|
| Daily | Critical systems with high change rates |
| Weekly | Most production systems (recommended) |
| Monthly | Archival or low-change systems |
Incremental Backup Frequency
| Frequency | Recommended For |
|---|---|
| Hourly | Critical databases, high-change systems |
| Every 2–4 hours | Typical production systems |
| Daily | Low-change systems |
Retention Policy Guidelines
- Full backups: Keep 4–12 (1–3 months of history at the weekly frequency)
- Incremental backups: Keep 24–168 (1–7 days of hourly snapshots)
- Adjust based on storage capacity and recovery time objectives
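
The history figures in these guidelines follow from multiplying the number of backups kept by the interval between them. As a rough sketch of that arithmetic (the `retention_days` helper is hypothetical, and the full-backup lines assume the weekly schedule):

```shell
# Days of history covered by keeping COUNT backups taken every INTERVAL_HOURS hours.
# Uses integer division, so partial days are dropped.
retention_days() {
    count=$1
    interval_hours=$2
    echo $(( count * interval_hours / 24 ))
}

retention_days 24 1     # hourly incrementals, keep 24  -> 1
retention_days 168 1    # hourly incrementals, keep 168 -> 7
retention_days 4 168    # weekly fulls, keep 4          -> 28
retention_days 12 168   # weekly fulls, keep 12         -> 84
```

Running the same calculation with your own interval shows how far back you can restore before you commit to a retention count.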
Maintenance Schedule
Weekly
- Review backup logs for errors
- Check repository disk space
- Verify cloud sync status (if configured)
- Test a restore from a recent snapshot
Monthly
- Review and adjust retention policies
- Run compaction (if not automated)
- Verify license expiry date
- Review and update configurations
Quarterly
- Perform a full disaster recovery test
- Review and update backup strategies
- Audit access and permissions
- Update software packages
Key Metrics to Monitor
| Metric | Alert Threshold |
|---|---|
| Backup success/failure rate | Any failure |
| Repository disk usage | 80% full |
| License expiry | 30 days before expiry |
| Agent connectivity | On disconnection |
| Cloud sync status | On failure (if configured) |
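
Two of these thresholds are simple numeric comparisons and are easy to script. The helpers below are a hedged sketch with hypothetical names. How you obtain the current disk usage and the license expiry timestamp depends on your setup and is not shown.

```shell
# Succeed (exit 0) when disk usage has reached the alert threshold (default 80%).
disk_alert() {
    usage_pct=$1
    threshold_pct=${2:-80}
    [ "$usage_pct" -ge "$threshold_pct" ]
}

# Whole days between two Unix timestamps (now, expiry), truncated toward zero.
days_until() {
    now_epoch=$1
    expiry_epoch=$2
    echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# Succeed when the license expires within the warning window (default 30 days).
license_alert() {
    days_left=$1
    window_days=${2:-30}
    [ "$days_left" -le "$window_days" ]
}

# Example: 85% usage trips the 80% threshold.
if disk_alert 85; then
    echo "ALERT: repository disk usage at or above 80%"
fi
```

Because the functions report through their exit status, they can gate whatever notification command your monitoring stack already uses.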
Best Practices
Configuration
- Use a consistent `fixed_block_size_mb` across all agents and the server
- Keep `chunk_size_avg_kb` consistent to preserve deduplication
- Enable TLS for all production gRPC connections
- Store cloud credentials securely — avoid hardcoding in config files
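
One way to catch drift in `fixed_block_size_mb` or `chunk_size_avg_kb` is to compare the values across collected config files, for example copies kept in version control. The sketch below is illustrative only. It assumes each setting sits on its own line as `key: value` or `key = value`, which may not match the real config format, and the function names are hypothetical.

```shell
# Print the distinct values found for KEY across the given config files.
# Assumes one "key: value" or "key = value" setting per line (an assumption).
distinct_values() {
    key=$1
    shift
    grep -hE "^[[:space:]]*${key}[[:space:]]*[:=]" "$@" \
        | sed -E 's/^[^:=]*[:=][[:space:]]*//; s/[[:space:]]*(#.*)?$//' \
        | sort -u
}

# Succeed only when every file agrees on exactly one value for KEY.
check_consistent() {
    n=$(distinct_values "$@" | wc -l | tr -d ' ')
    [ "$n" -eq 1 ]
}
```

For example, `check_consistent fixed_block_size_mb configs/*.conf || echo "block size mismatch"` would flag a mismatch, where `configs/*.conf` is a placeholder for wherever the collected files live.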
Security
- Restrict network access to the backup server (port 50051)
- Use TLS encryption for gRPC communication
- Use strong, unique credentials for cloud storage
- Rotate credentials on a regular schedule
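
As an illustration of restricting access to port 50051, the sketch below prints `ufw` commands for review instead of running them. It assumes the host uses `ufw` as its firewall front end and that agents connect from a single subnet. The `print_firewall_rules` helper and the `10.0.0.0/24` subnet are placeholders, not values from this guide.

```shell
# Print (do not run) ufw commands that admit one subnet to the gRPC port
# and deny all other sources. ufw applies rules in the order they are added,
# so the allow rule must come first.
print_firewall_rules() {
    allowed_cidr=$1
    port=${2:-50051}
    echo "ufw allow from ${allowed_cidr} to any port ${port} proto tcp"
    echo "ufw deny ${port}/tcp"
}

print_firewall_rules 10.0.0.0/24   # placeholder agent subnet
```

Review the printed commands, then run them with `sudo`. On hosts where the default incoming policy is already deny, the second rule is redundant.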
Performance
- Match pipeline settings (`workers`, `batch_size`, `max_pipeline_memory_mb`) to available resources
- Use compression for network-backed storage
- Enable eBPF change tracking for faster incremental backups
- Monitor and tune batch sizes based on actual network throughput
Reliability
- Test restores regularly — a backup you haven’t restored is an untested backup
- Maintain multiple full backups
- Use cloud sync for offsite/disaster recovery backups
- Document your recovery procedures and keep them up to date
Documentation
- Maintain a list of all backed-up systems and their schedules
- Document the restore procedure for each system type
- Keep copies of configuration files in version control
- Record any configuration changes with the reason and date