Location: Atlanta, GA / Frisco, TX / Bellevue, WA / Seattle, WA (locals only)
Job Responsibilities
- Develop and maintain batch and streaming data pipelines using Azure Data Factory (ADF), Databricks, and Spark that ingest customer data from dozens of source systems into the Customer Data Platform (CDP)
- Build data ingestion frameworks from APIs, event streams (Event Hub/Kafka), and enterprise databases, ensuring every customer touchpoint feeds into the unified profile (see the first sketch after this list)
- Perform data transformation, cleansing, and enrichment using Python and Spark to maintain the accuracy and completeness that an authoritative source of truth demands
- Implement data validation and quality checks at every pipeline stage, catching anomalies, duplicates, and schema drift before they erode trust in customer profiles (see the second sketch after this list)
- Work with ADLS, Snowflake, SQL Server, and Cosmos DB to store and process customer data across bronze, silver, and gold layers
- Support data modeling and schema design under guidance from senior engineers, ensuring alignment with CDP's unified customer profile standards
- Build curated data layers that serve analytics, reporting, personalization, and AI/ML use cases with production-ready, well-documented datasets
- Ensure data pipelines meet SLA requirements for timeliness — customer data must be current and available when contextual experiences demand it
- Support data pipelines used in identity resolution and AI/ML workflows by ensuring data readiness, consistency, and lineage traceability
- Participate in job orchestration (Control-M), CI/CD pipelines, and monitoring — maintaining the operational discipline required for a platform serving the entire US adult population
- Troubleshoot pipeline failures and resolve production issues with urgency, understanding that data delays directly impact customer experiences
- Contribute to documentation, data dictionaries, and team knowledge sharing to strengthen the platform's institutional knowledge
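
For illustration only, a minimal sketch of the kind of streaming ingestion described above, using Spark Structured Streaming against an Event Hub's Kafka-compatible endpoint. The namespace, topic, schema, and storage paths are hypothetical placeholders, not the team's actual configuration:

```python
# Illustrative sketch, not production code: ingest customer events from an
# Event Hub (via its Kafka-compatible endpoint) into a bronze Delta table.
# Namespace, topic, schema, and paths are hypothetical; auth (SASL) options omitted.
from pyspark.sql import SparkSession, functions as F, types as T

spark = SparkSession.builder.appName("customer-events-ingest").getOrCreate()

# Expected shape of each event payload (hypothetical).
event_schema = T.StructType([
    T.StructField("customer_id", T.StringType()),
    T.StructField("event_type", T.StringType()),
    T.StructField("event_ts", T.TimestampType()),
])

# Read the raw event stream.
raw = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "my-namespace.servicebus.windows.net:9093")
    .option("subscribe", "customer-events")
    .load())

# Parse the JSON payload into typed columns.
events = (raw
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*"))

# Land parsed events in the bronze layer, with checkpointing so the stream can recover.
query = (events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/adls/checkpoints/customer_events_bronze")
    .outputMode("append")
    .start("/mnt/adls/bronze/customer_events"))
```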
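
And a second minimal sketch of the bronze-to-silver validation and deduplication checks mentioned above. Column names, thresholds, and paths are again hypothetical and only meant to show the general pattern:

```python
# Illustrative sketch: basic validation, deduplication, and an anomaly guard
# when promoting customer events from bronze to silver. Paths, columns, and
# thresholds are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("customer-profile-quality-checks").getOrCreate()

# Read raw customer events from the (hypothetical) bronze layer.
bronze = spark.read.format("delta").load("/mnt/adls/bronze/customer_events")

# Schema check: required fields must exist before processing.
required_cols = ["customer_id", "event_type", "event_ts"]
missing = [c for c in required_cols if c not in bronze.columns]
if missing:
    raise ValueError(f"Schema drift detected, missing columns: {missing}")

# Validation: drop records missing key identifiers.
validated = bronze.filter(
    F.col("customer_id").isNotNull() & F.col("event_ts").isNotNull()
)

# Deduplication: keep only the latest record per customer_id and event_type.
w = Window.partitionBy("customer_id", "event_type").orderBy(F.col("event_ts").desc())
deduped = (validated
    .withColumn("_rn", F.row_number().over(w))
    .filter(F.col("_rn") == 1)
    .drop("_rn"))

# Anomaly guard: halt the load if more than half the batch failed validation.
if deduped.count() < 0.5 * bronze.count():
    raise ValueError("More than half of the batch failed validation; halting load")

# Write the cleansed data to the (hypothetical) silver layer.
deduped.write.format("delta").mode("append").save("/mnt/adls/silver/customer_events")
```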
Education and Work Experience
- Bachelor's degree in Computer Science, Engineering, or related field
- 3+ years of experience in data engineering or related roles
- Experience working in cloud-based data platforms (Azure preferred)
- Experience with Agile development methodologies
Technical Skills
- Programming: Python (required), SQL, basic Spark (PySpark)
- Azure stack: ADF, ADLS, Databricks (working knowledge)
- Databases: Snowflake, SQL Server, Cosmos DB (basic to intermediate)
- Solid understanding of ETL/ELT concepts and data pipeline design patterns
- Hands-on experience with APIs and data integration patterns for diverse source systems
- Experience with data quality checks, logging, monitoring, and alerting
- Understanding of data lineage, metadata management, and how data supports downstream AI/ML workflows
- Knowledge of CI/CD and orchestration tools like Control-M
Knowledge, Skills, and Abilities
- Strong analytical and problem-solving skills with attention to data accuracy
- Ability to work with large datasets and distributed systems at population scale
- Eagerness to learn and adapt to emerging AI and data technologies
- Understanding of why data quality and timeliness matter for customer-facing experiences
- Good collaboration and communication skills across engineering and business teams
—
Best Regards,
Sireesha | Senior Technical Recruiter
2100 Geng Rd, #210
Palo Alto, CA 94303
—