DR vs. business continuity
Two related terms. Disaster recovery (DR) is the IT-focused subset: restoring systems, applications, and data. Business continuity (BC) is the broader plan: how the whole business keeps operating during a disruption — alternate locations, manual workarounds, customer communication, payroll. DR is a component of BC. A small business usually combines them into one document; this article focuses on the DR (IT) portion, which is the part that determines whether you recover from ransomware in hours or weeks.
The seven sections of a small-business DR plan
1. Scope and assumptions
State what the plan covers (which systems, which locations, which data) and which scenarios it addresses (ransomware, hardware failure, site loss, extended ISP/cloud outage). Also record the assumptions the plan rests on: where the backups live, who has authority to declare a disaster, and what "recovered" means.
2. RTO and RPO targets, per system
Not every system needs the same recovery speed. List the critical systems and set two targets for each:
- RTO (Recovery Time Objective) — max acceptable downtime. The EHR might be 4 hours; the marketing file share might be 3 days.
- RPO (Recovery Point Objective) — max acceptable data loss. The transaction database might be 15 minutes; the file share might be 24 hours.
These targets drive the backup design (frequency, immutability, local-vs-cloud). See offsite vs cloud backup for how the targets map to architecture.
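As a rough illustration of how these targets can drive backup design, the Python sketch below records per-system targets and flags any system whose backup interval is longer than its RPO. The EHR RTO, the file share figures, and the transaction database RPO come from the examples above. The EHR RPO, the transaction database RTO, the backup schedule, and the names RecoveryTarget and rpo_gaps are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTarget:
    system: str
    rto: timedelta  # maximum acceptable downtime
    rpo: timedelta  # maximum acceptable data loss


TARGETS = [
    # EHR RPO is an assumed value; only its RTO appears in the text.
    RecoveryTarget("EHR", rto=timedelta(hours=4), rpo=timedelta(hours=1)),
    # Transaction database RTO is an assumed value; only its RPO appears in the text.
    RecoveryTarget("transaction database", rto=timedelta(hours=4), rpo=timedelta(minutes=15)),
    RecoveryTarget("marketing file share", rto=timedelta(days=3), rpo=timedelta(hours=24)),
]


def rpo_gaps(targets, backup_intervals):
    """Return the systems whose backup interval is missing or longer than the RPO."""
    gaps = []
    for target in targets:
        interval = backup_intervals.get(target.system)
        # Worst case, a failure hits just before the next backup runs,
        # so the data lost is roughly one full interval.
        if interval is None or interval > target.rpo:
            gaps.append(target.system)
    return gaps


# Hypothetical schedule: hourly backups for the clinical systems, nightly for files.
schedule = {
    "EHR": timedelta(hours=1),
    "transaction database": timedelta(hours=1),
    "marketing file share": timedelta(hours=24),
}
print(rpo_gaps(TARGETS, schedule))
```

With this schedule, only the transaction database is flagged: an hourly backup cannot meet a 15-minute RPO, which is the kind of mismatch that should change the backup design rather than the target.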
3. The recovery runbook
The step-by-step technical procedure for each scenario, in priority order. For a ransomware scenario:
- Isolate — disconnect affected systems, preserve forensic evidence.
- Assess — scope of encryption, what backups are clean, what the last good restore point is.
- Notify — cyber-insurance carrier (within the policy window), affected parties per regulatory requirements.
- Recover — restore systems in priority order from the last clean immutable backup.
- Validate — confirm each system works before reconnecting it.
- Reconnect — bring the network back online in stages.
- Post-incident — root-cause analysis, control improvements, plan update.
The runbook names specific systems, specific backup locations, specific recovery tools. "Restore from backup" is not a runbook step; "restore the SQL server from the immutable Datto cloud copy using the appliance bare-metal-restore wizard" is.
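One way to keep runbook steps specific is to store each restore step as structured data and reject any step that leaves a field blank. The sketch below is a minimal Python illustration, not a prescribed format: RestoreStep, is_specific, restore_order, and the file server step are hypothetical, while the SQL server step mirrors the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestoreStep:
    system: str    # what gets restored
    source: str    # which backup copy it comes from
    tool: str      # how the restore is performed
    priority: int  # 1 = restore first


def is_specific(step):
    """A step is usable only if it names the system, the source, and the tool."""
    return all(field.strip() for field in (step.system, step.source, step.tool))


def restore_order(steps):
    """Return the steps sorted by priority, refusing any vague step."""
    vague = [step for step in steps if not is_specific(step)]
    if vague:
        raise ValueError(f"{len(vague)} step(s) lack a system, source, or tool")
    return sorted(steps, key=lambda step: step.priority)


steps = [
    # Hypothetical second-priority step, for illustration only.
    RestoreStep("file server", "local backup appliance", "file-level restore", priority=2),
    RestoreStep("SQL server", "immutable Datto cloud copy",
                "appliance bare-metal-restore wizard", priority=1),
]
for step in restore_order(steps):
    print(step.priority, step.system, "<-", step.source, "via", step.tool)
```

A step like RestoreStep("SQL server", "", "", 1), the structured equivalent of "restore from backup", raises an error instead of making it into the runbook.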
4. The contact tree
Who to call, in what order, with what information. The IT contact (internal or MSP), the cyber-insurance broker and carrier hotline, the leadership decision-makers, the key vendors (ISP, M365 support, LOB software vendor), legal counsel, and — for regulated entities — the regulator notification contacts. Crucially, this list lives somewhere accessible when the network is down: printed, in a phone, in a separate cloud doc, not only on the file server that's currently encrypted.
5. Backup inventory and recovery sources
Document exactly where every recoverable copy lives, how to authenticate to it, and how current it is: the local appliance, the immutable cloud copy, the SaaS backup of Microsoft 365, the recovery keys. Store the credentials securely but accessibly, in a password manager that does not depend on the down systems, backed by a break-glass procedure.
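The "how current it is" part of the inventory is easier to trust when it is checked automatically. The following Python sketch assumes each copy records the time of its last successful backup and a maximum acceptable age. BackupCopy, stale_copies, the locations, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class BackupCopy:
    name: str
    location: str
    last_success: datetime  # time of the last successful backup (UTC)
    max_age: timedelta      # anything older than this counts as stale


def stale_copies(inventory, now):
    """Return the names of copies whose last successful backup is too old."""
    return [entry.name for entry in inventory
            if now - entry.last_success > entry.max_age]


# Fixed "current time" so the example is reproducible.
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
inventory = [
    BackupCopy("local appliance", "server room",
               datetime(2024, 6, 1, 11, 0, tzinfo=timezone.utc), timedelta(hours=4)),
    BackupCopy("immutable cloud copy", "cloud storage",
               datetime(2024, 5, 30, 2, 0, tzinfo=timezone.utc), timedelta(hours=24)),
    BackupCopy("Microsoft 365 SaaS backup", "SaaS backup provider",
               datetime(2024, 6, 1, 1, 0, tzinfo=timezone.utc), timedelta(hours=24)),
]
print(stale_copies(inventory, now))
```

In this made-up inventory the cloud copy last succeeded more than two days ago, so it is the only entry reported as stale.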
6. Roles and authority
Who declares a disaster (the trigger that activates the plan). Who leads recovery. Who talks to customers. Who talks to the press if it comes to that. Who authorizes spending (emergency hardware, incident-response retainer). In a small business these may be one or two people — but write down who, so there's no hesitation at hour one.
7. The test record
When the plan was last tested, what was tested, what failed, what got fixed. This section is what separates a plan from a binder. More on this below.
The test is the whole point
Most DR plans fail not because they're badly written but because they've never been run. The test reveals what the document hides:
- The recovery credentials expired six months ago.
- A critical system nobody documented isn't in the backup scope.
- The restore that the plan assumed would take 2 hours actually takes 9.
- The one person who knows how to restore the LOB database left the company.
- The "offsite" backup is replicating to a location that the same ransomware reached.
Two kinds of test, both at least annual:
- Tabletop exercise — the team walks through a scenario verbally. Cheap, fast, catches process and contact-tree gaps.
- Technical restore test — actually restore a system from backup to a sandbox and confirm it works. Catches the technical gaps. Backups should get restore-tested quarterly regardless (see backup is the answer; restore is the test).
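The cadence above can be tracked with a small check against the test record. This Python sketch assumes a tabletop exercise at least every 365 days and a restore test roughly every quarter (92 days here). MAX_INTERVAL, overdue_tests, and the dates are hypothetical.

```python
from datetime import date, timedelta

# Assumed maximum gap between runs of each kind of test.
MAX_INTERVAL = {
    "tabletop": timedelta(days=365),  # at least annual
    "restore": timedelta(days=92),    # roughly quarterly
}


def overdue_tests(last_run, today):
    """Return the kinds of test that were never run or are past their interval."""
    overdue = []
    for kind, limit in MAX_INTERVAL.items():
        last = last_run.get(kind)
        if last is None or today - last > limit:
            overdue.append(kind)
    return overdue


# Hypothetical test record: the date each kind of test was last run.
last_run = {
    "tabletop": date(2024, 3, 1),
    "restore": date(2024, 1, 10),
}
print(overdue_tests(last_run, date(2024, 6, 1)))
```

Here the restore test is flagged because more than a quarter has passed since January 10, while the March tabletop is still within its annual window. A test that has never been run is treated as overdue.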
The compliance angle
A written, tested DR / contingency plan isn't just good practice — it's increasingly required:
- HIPAA requires a contingency plan (Security Rule §164.308(a)(7)), including a data-backup plan, a disaster-recovery plan, and an emergency-mode operation plan.
- FTC Safeguards Rule requires financial institutions to have a written incident-response and recovery plan.
- The CJIS Security Policy requires an incident-response plan with recovery procedures for any IT environment that handles criminal justice information, which includes municipal law-enforcement IT.
- Cyber-insurance carriers ask whether you have a tested DR plan on the application — and may adjust the premium or the coverage based on the answer.
How a Micro-IT plan handles DR
Every Micro-IT client gets a written DR runbook as part of the managed engagement, built around their specific RTO/RPO targets, backed by image-level local backup replicated to immutable cloud storage. Restores are tested quarterly; the full DR plan gets a tabletop exercise annually. The incident-response contact tree and the cyber-insurance notification flow are documented and kept current. For regulated clients (HIPAA, GLBA, CJIS), the DR plan maps to the specific regulatory contingency-plan requirements. See the security page for the incident-response posture, or get a quote scoped to your environment.
