Location: San Diego – Day 1 Onsite – Local Only
Exp: 12 years
Skills: AWS, Azure, or GCP; Kafka, Databricks, PySpark, Python, and SQL
Job Description:
Role Summary
We are seeking a highly skilled Lead Data Engineer with deep, hands-on experience in building large-scale data ingestion pipelines, real-time streaming solutions, and high-performance data processing systems. The ideal candidate excels at writing clean, efficient code, refactoring complex systems, improving scalability and performance, addressing production issues, and delivering reliable data solutions across cloud platforms such as AWS, Azure, or GCP.
Key Responsibilities
- Design, build, and maintain high-volume data ingestion and processing pipelines for batch and real-time workloads.
- Implement and optimize real-time streaming pipelines using platforms such as Kafka.
- Develop scalable data solutions using Databricks, PySpark, Python, and SQL.
- Refactor core components to modernize and optimize existing pipelines and data services.
- Build robust, fault-tolerant pipelines capable of large-scale, high-throughput data processing.
- Write unit tests, automate validation, and ensure high code quality and reliability.
- Integrate pipelines into CI/CD workflows to streamline and automate deployment processes.
- Identify, troubleshoot, and fix production issues, ensuring system reliability and stability.
- Address performance bottlenecks and implement improvements for scalability, throughput, and efficiency.
- Work extensively across AWS, Azure, or GCP cloud environments and cloud-native data services.
- Design and orchestrate end-to-end pipelines using workflow and orchestration tools.
- Collaborate with Data Scientists and BI Engineers to deliver clean, analytics-ready datasets.
- Communicate complex technical topics clearly to non-technical stakeholders.
Required Qualifications
- 8+ years of hands-on data engineering experience building and maintaining large-scale data systems.
- Proven experience with high-volume data ingestion, ETL/ELT, and real-time data processing.
- Strong expertise with Kafka or similar streaming technologies.
- Advanced proficiency in Databricks, PySpark, Python, and SQL.
- Experience refactoring core systems to improve code maintainability and performance.
- Demonstrated ability to design and build scalable, low-latency data pipelines.
- Strong skills in debugging, performance optimization, and pipeline tuning.
- Hands-on experience with at least one major cloud platform: AWS, Azure, or GCP.
- Experience writing unit tests and integrating solutions with CI/CD pipelines.
- Strong problem-solving skills and excellent communication abilities.
Mandatory Skills: Enterprise Platform Engineering FrontEnd.
Warm Regards,
Prema
Reveille Technologies, Inc.
prema@reveilletechnologies.com
Desk Number: (704) 444-0697 Ext 829