Location: Remote
Duration: 12+ Months
Must have:
• Incident management
• Stakeholder management
• Executive communication
• Troubleshooting
Role Summary:
The Incident Commander is responsible for leading the end-to-end
management of major production incidents across Telecom digital
platforms, including Digital Commerce, Order Management, Payments,
Mobile/Web Applications, and Customer Data systems. This role operates
within a 24×7 centralized operations model and serves as the single
point of command during high-severity incidents, ensuring rapid
stabilization, clear decision-making, and effective stakeholder
coordination. The Incident Commander combines strong technical depth
with exceptional communication skills to lead large, cross-functional
teams under pressure, minimize customer and business impact, and
restore services within defined SLAs.
Responsibilities:
Major Incident Management & Command:
• Act as the Incident Commander for Sev-1 and Sev-2 incidents across
Telecom digital platforms.
• Own the incident lifecycle from detection through stabilization,
resolution, and post-incident review.
• Lead incident bridge calls with a large number of technical, business,
and executive stakeholders.
• Establish command-and-control during incidents, driving focus,
accountability, and rapid decision-making.
• Ensure accurate impact assessment and prioritization based on
customer, revenue, and regulatory impact.
Telecom Digital Platform Expertise:
• Lead incident response across:
• Digital Commerce platforms (customer acquisition, checkout,
promotions)
• Order Management and fulfillment systems
• Payments, billing integrations, and financial transaction flows
• Mobile and web applications
• Customer Information and data management platforms
• Quickly understand complex, distributed system interactions and
failure modes.
• Provide technical direction and guidance during root cause
identification and remediation.
Centralized Operations & 24×7 Support:
• Operate within a centralized 24×7 operations model supporting
mission-critical digital platforms.
• Coordinate across global onshore and offshore support teams, SREs,
engineering, infrastructure, and vendors.
• Ensure adherence to incident response SLAs, escalation paths, and
operational runbooks.
• Drive continuous improvement of incident response processes and
tooling.
Stakeholder & Executive Communication:
• Serve as the single, authoritative voice during incidents for internal
and external stakeholders.
• Communicate incident status, impact, mitigation steps, and ETAs
clearly and concisely.
• Manage executive-level updates and ensure consistent messaging across
all forums.
• Handle high-pressure situations with confidence, clarity, and
professionalism.
Technical Leadership & Problem Solving:
• Lead technical troubleshooting efforts without necessarily being
hands-on in code.
• Challenge assumptions, validate hypotheses, and drive teams toward
data-driven resolution paths.
• Ensure effective use of monitoring, logging, and observability tools.
• Balance speed of recovery with risk, customer impact, and system
integrity.
Post-Incident Review & Prevention:
• Facilitate post-incident reviews (PIRs / RCAs) with engineering and
operations teams.
• Ensure root causes are clearly identified and corrective actions are
defined and tracked.
• Identify systemic issues and recommend long-term preventive measures.
• Drive improvements in platform resilience, monitoring, automation, and
operational readiness.
Thanks
Jaydeep Singh
Technical Recruiter | Empower Professionals
jay@empowerprofessionals.com