Hi,
I hope you are doing well.
Job Title: Senior Azure Databricks Developer – Azure Databricks / Python / Spark Streaming
Location: Pleasanton, CA – 3 days onsite
Duration: 12+ Months Contract
Mandatory Areas
Must Have Skills (please indicate years of experience with each):
Skill 1 – Azure Databricks – Yrs of Exp:
Skill 2 – Python – Yrs of Exp:
Skill 3 – Spark Streaming – Yrs of Exp:
About the Role:
We are looking for a highly skilled Senior Azure Databricks (ADB) Developer to join our Data Engineering team. This role involves developing large-scale batch and streaming data pipelines on Azure Cloud. The ideal candidate will have strong expertise in Python, Databricks Notebooks, Apache Spark (including Structured Streaming), and real-time integration with Kafka. You will work with both relational databases like DB2 and NoSQL systems such as MongoDB, focusing on performance optimization and scalable architecture.
Key Responsibilities:
• Design and Develop: Create real-time and batch data pipelines using Azure Databricks, Apache Spark, and Structured Streaming (an illustrative sketch follows this list).
• Data Processing: Write efficient ETL scripts and automate workflows using Python.
• Data Integration: Integrate with various data sources and destinations, including DB2, MongoDB, and other enterprise-grade data systems.
• Performance Optimization: Tune Spark jobs for optimal performance and cost-effective compute usage on Azure.
• Collaboration: Work with platform and architecture teams to ensure secure, scalable, and maintainable cloud data infrastructure.
• CI/CD Support: Implement CI/CD for Databricks pipelines and notebooks using tools like GitHub and Azure DevOps.
• Stakeholder Communication: Interface with product owners, data scientists, and business analysts to translate data requirements into production-ready pipelines.
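For a flavor of the streaming work above, here is a minimal, illustrative PySpark sketch of the Kafka-to-Delta pattern (not a requirement of the role; the broker address, topic name, schema, and paths are all placeholders):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("orders-stream").getOrCreate()

    # Hypothetical message layout for a placeholder "orders" topic.
    schema = (StructType()
              .add("order_id", StringType())
              .add("amount", DoubleType()))

    # Read: subscribe to Kafka (the connector ships with Databricks runtimes;
    # the broker address below is a placeholder).
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "orders")
           .load())

    # Transform: Kafka values arrive as bytes; parse the JSON payload.
    parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), schema).alias("o"))
              .select("o.*"))

    # Write: append to a Delta table, with checkpointing for fault tolerance.
    query = (parsed.writeStream
             .format("delta")
             .option("checkpointLocation", "/mnt/checkpoints/orders")  # placeholder path
             .start("/mnt/delta/orders"))  # placeholder path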
Required Skills:
• 10+ years of experience in data engineering
• Python Proficiency:
    • Data Manipulation: Using libraries like Pandas and NumPy for data manipulation and analysis.
    • Data Processing: Writing efficient ETL scripts.
    • Automation: Automating repetitive tasks and workflows.
    • Debugging: Strong debugging skills to troubleshoot and optimize code.
• Database Management:
    • SQL: Advanced SQL skills for querying and managing relational databases.
    • NoSQL: Experience with NoSQL databases like MongoDB or Cassandra.
    • DB2: Understanding of DB2 architecture, SQL queries, and database management.
    • MongoDB: Knowledge of MongoDB schema design, indexing, and query optimization.
    • Data Warehousing: Knowledge of data warehousing solutions like Google BigQuery or Snowflake.
• Big Data Technologies:
    • Kafka: Knowledge of data streaming platforms like Apache Kafka.
• Version Control:
    • Git: Using version control systems for collaborative development.
• Data Modeling:
    • Schema Design: Designing efficient and scalable database schemas.
    • Data Governance: Ensuring data quality, security, and compliance.
• Programming Skills:
    • Proficiency in languages such as Java, Python, or JavaScript to write scripts for data extraction and transformation.
    • Experience with BSON (Binary JSON) for data conversion (see the ETL sketch after this list).
• Cloud Services:
    • Experience with cloud platforms like AWS or Azure for deploying and managing databases.
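As an illustrative companion to the Python and database skills above, a minimal ETL sketch moving rows from DB2 into MongoDB (connection strings, table, and collection names are placeholders, and the ibm_db_sa SQLAlchemy dialect is an assumption about the driver in use):

    import pandas as pd
    from sqlalchemy import create_engine
    from pymongo import MongoClient

    # Extract: read from a relational source (DB2 here; any SQLAlchemy-supported
    # engine works the same way). The connection string is a placeholder.
    engine = create_engine("ibm_db_sa://user:pass@db2-host:50000/SAMPLE")
    orders = pd.read_sql("SELECT order_id, amount, updated_at FROM ORDERS", engine)

    # Transform: a simple cleanup step with Pandas.
    orders["amount"] = orders["amount"].astype(float)
    orders = orders.dropna(subset=["order_id"])

    # Load: insert documents into MongoDB; PyMongo encodes the dicts as BSON.
    client = MongoClient("mongodb://localhost:27017")  # placeholder URI
    records = orders.to_dict(orient="records")
    if records:
        client["analytics"]["orders"].insert_many(records)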
Preferred Skills:
• Experience with Java or Scala for Spark Streaming.
• Familiarity with Azure services like Data Lake, Data Factory, Synapse, and Event Hubs (a connection sketch follows this list).
• Background in building data platforms in regulated or large-scale enterprise environments.
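For reference on the Event Hubs item above: Event Hubs exposes a Kafka-compatible endpoint, so the earlier Structured Streaming reader pattern applies with only configuration changes. A hedged sketch, where the namespace, hub name, and connection string are placeholders:

    # Assumes an active SparkSession named `spark`, as in the earlier sketch.
    eh_stream = (spark.readStream
                 .format("kafka")
                 .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
                 .option("subscribe", "<event-hub-name>")
                 .option("kafka.security.protocol", "SASL_SSL")
                 .option("kafka.sasl.mechanism", "PLAIN")
                 .option("kafka.sasl.jaas.config",
                         'org.apache.kafka.common.security.plain.PlainLoginModule required '
                         'username="$ConnectionString" password="<connection-string>";')
                 .load())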