Job Title: Senior Data Engineer with Talend Cloud (8.0)
Location: Penske, PA (Hybrid; local candidates only)
Duration: Contract to Hire
Experience: 12+ years
The Team
You will be part of a high-performing Data Engineering & Analytics team responsible for building scalable, cloud-native data platforms on AWS. The team partners closely with product, analytics, and business stakeholders to deliver reliable data pipelines that support advanced analytics, reporting, and data-driven decision-making.
This team emphasizes modern data engineering practices, reusable frameworks, performance optimization, and production-grade data solutions.
The Role
As a Senior Data Engineer, you will design, build, and maintain end-to-end data pipelines leveraging Talend Cloud (8.0), PySpark, and AWS services. You will play a key role in ingesting, transforming, and optimizing large-scale structured and unstructured datasets while ensuring scalability, performance, and data quality across the platform.
Key responsibilities include:
- Designing and developing ETL/ELT workflows using Talend Cloud 8.0
- Integrating data from APIs, flat files, and streaming sources
- Ingesting and managing data in AWS S3–based data lakes
- Developing PySpark jobs for large-scale data processing and transformations
- Implementing Spark SQL for complex transformations and schema management
- Building and supporting cloud-native data pipelines using AWS services such as Glue, Lambda, Athena, and EMR
- Applying performance tuning and optimization techniques for Spark workloads
What You Will Bring
- 10+ years of experience in data engineering, big data, or ETL development roles
- Strong hands-on expertise with Talend ETL (Talend Cloud / Talend 8.x)
- Advanced experience in PySpark and Spark SQL for large-scale data processing
- Proficiency in Python, including building reusable data transformation modules
- Solid experience with AWS data services, including:
  - S3 for data lake storage and lifecycle management
  - Glue for ETL/ELT orchestration
  - Lambda for event-driven processing
  - Athena for serverless analytics
  - EMR for Spark/PySpark workloads
- Strong understanding of ETL/ELT patterns, data lakes, and distributed systems
- Ability to optimize performance, ensure data quality, and build production-ready pipelines
- Excellent collaboration skills and experience working with cross-functional teams
Regards,
Sangeetha