Hi,

This is Sreekar from Codeforce360. One of our prominent clients is looking for a profile like yours. If you are available for a contract role, please review the JD below and share your updated resume and expected hourly rate.

Job Title: SRE – Site Reliability Engineer
Location: Wilmington, DE
Duration: 6+ Months

Job Description

Primary Responsibilities:
- 9+ years of technical experience.
- Troubleshoots incidents, conducts blameless post-mortems and ensures permanent closure of incidents.
- Engages with development team throughout the life cycle to help develop software for reliability.
- Applies analytics on historic data, such as incidents and usage patterns, to predict issues and take proactive action.
- Drives adoption of self-healing and resiliency patterns such as circuit breaker, bulkhead, etc.
- Designs and conducts performance tests, identifies bottlenecks and opportunities for optimization.
- Defines and drives adoption of best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting.
- Designs, develops, tests and delivers software to automate manual operational work.
- Deploys software and product upgrades.
- Facilitates maximum speed of delivery by objectively adhering to the service's error budgets.
- Manages the effort split between manual operational work and engineering work.
- Coaches other team members and manages teams as needed.

Required Skills:
- Excellent debugging and troubleshooting skills.
- Expert in performance monitoring and capacity management of large systems using various tools.
- Expert in at least one technology stack (Java/J2EE/Python) with designing, coding, testing, and delivering software.
- Expert in at least one relational database (SQL Server, Oracle, DB2, etc.).
- Hands-on experience with cloud technologies (Cloud Foundry, Kubernetes, AWS).
- Hands-on experience with big data services (Hadoop, HDFS, Hive, YARN, HBase, Kafka, ZooKeeper).
- Working knowledge of Groovy, batch scripting, PowerShell or shell scripting.
- Experience developing, deploying and debugging distributed systems in Linux and Hadoop environments.
- Experience with monitoring tools such as AppDynamics (AppD), Splunk, ELK, Geneos.
- Analysis of SLI metrics and performance data, interpreting and correlating them with SLOs and SLAs.
- Experience with deployment automation, CI/CD, DevOps, Jenkins, Git, Bitbucket.
- Experience with cloud/container environments, big data, analytical tools (Tableau, Alteryx).
- Expert practitioner in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm.
- Working knowledge of infrastructure components like routers, load balancers and networks.
- Comfortable working in Agile mode and proficient in continuous integration and continuous delivery.
- Solid understanding of microservice design methodologies.
- Attention to detail and time-management skills.