Mainframe Site Reliability Engineering (SRE) Lead – Columbus, OH Onsite

 

Role Summary

The Mainframe SRE Lead is responsible for ensuring the reliability, availability, performance, and scalability of enterprise mainframe platforms. This role blends traditional mainframe engineering with modern SRE principles, focusing on automation, observability, incident management, and continuous improvement. The lead will guide a team of engineers while partnering closely with application, infrastructure, and operations teams.

 

Key Responsibilities

  • Lead the Mainframe SRE team, providing technical direction, mentoring, and performance guidance
  • Own the reliability, availability, and resilience of mainframe environments (z/OS and related subsystems)
  • Define and implement SRE practices such as SLIs, SLOs, SLAs, error budgets, and reliability metrics
  • Drive automation to reduce manual operations, improve recovery time, and enhance system stability
  • Oversee monitoring, alerting, and observability for mainframe systems using modern and legacy tools
  • Lead incident management, root cause analysis (RCA), and post-incident reviews
  • Partner with application development teams to improve reliability, performance, and deployment practices
  • Plan and execute capacity management, performance tuning, and workload optimization
  • Ensure compliance with security, regulatory, and audit requirements
  • Lead disaster recovery (DR) planning, testing, and high-availability strategies
  • Champion continuous improvement, DevOps, and SRE culture within mainframe operations

 

Required Qualifications

  • 10+ years of experience in mainframe systems engineering or operations
  • Strong hands-on expertise with IBM z/OS
  • Experience with core mainframe components such as:
    • CICS, IMS, DB2
    • JES2/JES3
    • MQ, SMF, SDSF
  • Solid understanding of mainframe performance tuning and capacity planning
  • Experience leading production support and managing major incidents
  • Strong scripting and automation skills (REXX, JCL, CLIST, Python, or equivalent)
  • Familiarity with monitoring and scheduling tools (e.g., OMEGAMON, CA/BMC tools, Control-M)

 

 

Preferred Qualifications

  • Experience applying SRE principles in a mainframe or hybrid (mainframe + distributed) environment
  • Exposure to DevOps, CI/CD, and automation frameworks
  • Knowledge of Linux on Z and cloud integration patterns
  • Experience with resilience engineering, chaos testing, or fault injection concepts
  • Prior people-lead or technical-lead experience

 

Looking forward to working with you!

 

Riya | Manager – Talent Acquisition

United IT Solutions Inc | Email: riya@uniteditinc.com

LinkedIn: https://www.linkedin.com/in/rajeshwari-r-riya-81848921a/

www.uniteditinc.com | 1212 Corporate Dr, Suite 555, Irving, TX – 75038

*United IT Solutions, Inc. is celebrating 16 years in the IT industry*

 

 

 
 
 
