Role: Splunk Subject Matter Expert (SME) and Enterprise Monitoring Engineer
Location: Atlanta, GA & Frisco, TX
Contract
Mandatory Skills: Splunk Enterprise, Splunk, Dashboard Design, Monitoring Systems
Job Summary:
We are looking for a highly skilled Splunk Subject Matter Expert (SME) and Enterprise Monitoring Engineer to lead the design, implementation, and optimization of our monitoring and observability ecosystem. The ideal candidate will be an expert in Splunk, with a strong background in enterprise IT infrastructure, system performance monitoring, and log analytics. You will play a pivotal role in ensuring end-to-end visibility across our systems, applications, and services.
Key Responsibilities:
Splunk Administration & Engineering
• Serve as the SME for Splunk architecture, deployment, and configuration across the enterprise.
• Maintain and optimize Splunk infrastructure, including indexers, forwarders, search heads, and clusters.
• Develop and manage custom dashboards, alerts, saved searches, and visualizations.
• Ensure high availability, scalability, and performance of the Splunk environment.
• Create dashboards, reports, alerts, advanced Splunk searches, visualizations, log parsing, and external table lookups.
• Expertise with SPL (Search Processing Language) and understanding of Splunk architecture, including configuration files.
• Extensive experience monitoring and troubleshooting applications with tools such as AppDynamics, Splunk, Grafana, Argos, and OTEL to build observability for large-scale microservice deployments.
• Create dashboards for various applications to monitor health and network issues, and configure alerts.
• Excellent problem-solving, triaging, and debugging skills in large-scale distributed systems.
• Establish and document runbooks and guidelines for using the multi-cloud infrastructure and microservices platform.
• Experience optimizing search queries using summary indexing.
• Solid knowledge and experience in monitoring the Splunk infrastructure.
• Develop a long-term strategy and roadmap for AI/ML tooling to support the AI capabilities across the Splunk portfolio.
• Diagnose and resolve network-related issues affecting CI/CD pipelines; debug DNS, firewall, proxy, and SSL/TLS problems; and use tools like tcpdump, curl, and netstat for proactive maintenance.
Enterprise Monitoring & Observability
• Design and implement holistic enterprise monitoring solutions integrating Splunk with tools like AppDynamics, Dynatrace, Prometheus, Grafana, SolarWinds, or others.
• Collaborate with application, infrastructure, and security teams to define monitoring KPIs, SLAs, and alert thresholds.
• Build end-to-end visibility into application performance, system health, and user experience.
• Integrate Splunk with ITSM platforms (e.g., ServiceNow) for event and incident management automation.
Operations, Troubleshooting & Optimization
• Perform data onboarding, parsing, and field extraction for structured and unstructured data sources.
• Support incident response and root cause analysis using Splunk for troubleshooting and forensics.
• Regularly audit and optimize search performance, data retention policies, and index lifecycle management.
• Create runbooks, documentation, and SOPs for Splunk and monitoring tool usage.
Required Qualifications:
• 5+ years of experience in IT infrastructure, DevOps, or monitoring roles.
• 3+ years of hands-on experience with Splunk Enterprise as an admin, architect, or engineer.
• Experience designing and managing large-scale, multi-site Splunk deployments.
• Strong skills in SPL (Search Processing Language), dashboard design, and alerting strategies.
• Familiarity with Linux systems, scripting (e.g., Bash, Python), and APIs.
• Experience with enterprise monitoring tools and their integration with Splunk (e.g., AppDynamics, Dynatrace, Nagios, Zabbix).
• Understanding of logging, metrics, and tracing in modern environments (on-prem and cloud).
• Strong understanding of network protocols, system logs, and application telemetry.
Preferred Qualifications:
• Splunk certifications (e.g., Splunk Certified Power User, Admin, Architect).
• Experience with Splunk ITSI, Enterprise Security, or Observability Suite.
• Knowledge of cloud-native environments (AWS, Azure, or GCP) and cloud monitoring integrations.
• Experience with log aggregation, security event monitoring, or compliance (e.g., PCI, HIPAA, SOX).
• Familiarity with CI/CD pipelines and GitOps practices.