Site Reliability Engineer (SRE)
Pittsburgh (PA) / Lake Mary (FL) (onsite)
Long Term Contract
Role Description
Requirements:
– Experience: 5+ years of experience in system administration, engineering, or a related field.
– Technical Skills: Proficiency in:
  – Linux/Unix operating systems.
  – Scripting languages (e.g. Python, Ruby, Bash).
  – Monitoring and alerting tools (e.g. Prometheus, Grafana, PagerDuty).
  – Cloud platforms (e.g. AWS, GCP, Azure).
– System Design: Strong understanding of system design principles, including scalability, reliability, and performance.
– Troubleshooting: Excellent troubleshooting skills, with the ability to analyze complex system issues and identify root causes.
Nice to Have:
– Certifications: Relevant certifications, such as AWS Certified DevOps Engineer or Google Cloud Certified – Professional Cloud Developer, are desirable.
– Experience with Other Tools: Familiarity with configuration management tools (e.g. Puppet, Chef) or logging and analytics tools (e.g. ELK Stack).
Responsibilities and Expectations:
– System Reliability: Ensure systems are designed and implemented with reliability and scalability in mind.
– System Maintenance: Perform regular system maintenance to ensure high availability and performance.
– Incident Response: Respond to and resolve system incidents in a timely and effective manner.
– Collaboration: Collaborate with development teams to ensure systems meet business requirements and are designed with reliability and scalability in mind.
Skills and Qualifications:
– Technical Skills: Proficiency in Linux/Unix operating systems, scripting languages, monitoring and alerting tools, and cloud platforms.
– System Design: Strong understanding of system design principles, including scalability, reliability, and performance.
– Troubleshooting: Excellent troubleshooting skills, with the ability to analyze complex system issues and identify root causes.
– Collaboration: Ability to collaborate with development teams to ensure systems meet business requirements and are designed with reliability and scalability in mind.