Recovery & Resiliency Manager (Infrastructure & Production)
Today
Fort Mill, South Carolina, United States
Job Description
The Recovery & Resiliency Manager at E-Solutions in Fort Mill, SC, is responsible for ensuring the resilience and rapid recovery of critical infrastructure and production systems. Key duties include developing disaster recovery plans, leading incident resolution, and promoting observability for proactive monitoring. The role requires 5-10+ years of experience in IT disaster recovery and infrastructure operations, alongside strong communication and crisis management skills.
Fort Mill, SC (Onsite)
Position Summary
The Recovery & Resiliency Manager is responsible for ensuring the availability, resilience, and rapid recovery of critical infrastructure and production systems. This role bridges infrastructure engineering and production support to drive "always-on" capabilities. The manager will define, test, and maintain disaster recovery (DR) plans, implement observability to proactively detect potential outages, lead major incident resolution, and conduct root cause analysis (RCA) to continuously improve service reliability.
Key Responsibilities
1. Resiliency Planning & Disaster Recovery (DR)
• Develop, maintain, and test comprehensive DR plans, runbooks, and Business Impact Analyses (BIA) for hybrid/cloud infrastructure.
• Define RTO (Recovery Time Objective) and RPO (Recovery Point Objective) targets, ensuring infrastructure design meets these requirements.
• Lead regular disaster recovery tests, simulation exercises, and tabletop drills, documenting outcomes and tracking remediation actions to closure.
• Apply infrastructure-as-code (IaC) principles to automate recovery processes.
2. Production Support & Incident Management
• Serve as a primary point of contact (POC) for major infrastructure incidents and high-profile disruptions.
• Coordinate technical recovery efforts across cross-functional teams (network, server, storage, database, cloud) during incidents.
• Lead Root Cause Analysis (RCA) and post-mortem investigations to identify and deploy countermeasures, ensuring incidents do not recur.
• Monitor production system performance and availability, optimizing for high availability (HA).
3. Observability & Monitoring
• Develop and promote a company-wide observability platform (e.g., Splunk, Datadog, Prometheus, Grafana) for real-time monitoring of infrastructure health.
• Establish and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
• Implement proactive monitoring, alerting, and automated healing, ensuring fast incident detection and recovery.
4. Leadership & Governance
• Provide executive-level reporting on resilience posture, test results, and material risks.
• Manage relationships with third-party vendors, partners, and service providers regarding service SLAs.
• Ensure adherence to industry frameworks and compliance requirements (e.g., NIST, ISO 22301, ITIL).
Required Skills & Qualifications
• Experience: 5-10+ years in IT disaster recovery, business continuity, production support, or infrastructure operations.
• Infrastructure: In-depth knowledge of on-premises (VMware, SAN/NAS, Linux/Windows) and Cloud (AWS, Azure) environments.
• Tools: Proficient in monitoring/observability tools (e.g., Datadog, Splunk, Dynatrace) and backup/replication technologies (e.g., Rubrik, Cohesity, Zerto).
• Methodology: Strong understanding of ITIL, DevOps practices, and incident management frameworks.
• Soft Skills: Excellent communication skills, crisis management abilities, and the ability to work under pressure.
E-Solutions