Data Engineer with Java + Azure
Hybrid @ Dayton, OH (local candidates only; in-person presence required)
Long-Term Contract
Must-Haves
• Java + Spark in production
• Azure Databricks + ADF
• ETL/ELT pipeline ownership
• SQL + NoSQL
• Data modeling experience
• Multithreading + performance tuning
• Git, Maven, JUnit
• Cloud data engineering experience
Nice-to-Haves
• Delta Lake
• Unity Catalog
• Hadoop ecosystem
• Data visualization exposure
• Experience writing technical design docs
Responsibilities:
• Design and implement models to transform complex, connected datasets into actionable business intelligence using graph analytics principles and technologies
• Core Java development, including wrapper classes to support object-oriented features
• Write Java classes, build JARs, perform unit testing, and optimize code for performance
• Create entity models based on raw data sources
• Build data pipelines for ingestion and incremental updates
• Build automated test suites
• Validate data quality and investigate data discrepancies
• Review complex requirements and translate them into software designs and solutions
• Write optimized code to clean, transform, and analyze large datasets (a brief Spark sketch follows this list)
• Share knowledge through pair programming and group code reviews
• Adhere to code standardization and team best practices
• Collaborate with team members, technical peers and stakeholders
• Test data solutions, monitor jobs, and fix issues
• Document technical specs and processes
• Operate in various development environments (Agile, Kanban, Waterfall)
• All other duties as assigned.
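Purely for illustration: a minimal sketch, in the Spark Java API this role calls for, of the kind of batch ingest-and-transform pipeline described above. The OrdersPipeline class, file paths, and column names are hypothetical placeholders, not the client's actual systems.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.sum;

    // Hypothetical pipeline: ingest raw order files, clean them, derive a
    // business metric, and publish a curated dataset.
    public class OrdersPipeline {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("orders-pipeline")
                    .getOrCreate();

            // Ingest: read raw CSVs (path and schema inference are illustrative).
            Dataset<Row> raw = spark.read()
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .csv("/mnt/raw/orders/");

            // Transform: drop malformed rows, compute per-customer lifetime value.
            Dataset<Row> curated = raw
                    .filter(col("order_id").isNotNull())
                    .withColumn("order_total", col("quantity").multiply(col("unit_price")))
                    .groupBy(col("customer_id"))
                    .agg(sum("order_total").alias("lifetime_value"));

            // Load: write Parquet here; on Azure Databricks this would typically
            // target Delta Lake instead.
            curated.write().mode("overwrite").parquet("/mnt/curated/customer_ltv");

            spark.stop();
        }
    }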
Skills:
• Meticulous attention to detail and strong problem-solving abilities
• Ability to apply logic and critical thinking to interpret data patterns and connections between different data sources
• Able to work through ambiguity, gaining clarity via independent analysis and thorough investigation
• Ability to work with complex data models and data modeling principles
• Understanding of relational database principles
• High proficiency in core Java: OOP, Collections, multithreading, data structures, and exception handling
• Proficiency with the JUnit testing framework, Maven build tool, GitHub version control, and IntelliJ IDE (an illustrative JUnit test follows this list)
• Proficient in Big Data Frameworks such as Apache Spark (preferred) or Hadoop
• Knowledgeable in Azure Databricks, Delta Lake, Spark Core, Azure Data Factory (ADF) and Unity Catalog
• Knowledgeable in data analysis, pattern identification, and data visualization
• Ability to ingest and transform data using PySpark in Azure Databricks
• Understanding of key Data Warehousing and ETL/ELT Processes such as data pipelines and database management
• Ability to design, build, and maintain robust Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) pipelines
• Proficient in SQL and NoSQL databases; MySQL and Azure Cosmos DB preferred (a plain-JDBC sketch follows this list)
• Knowledgeable in object-oriented programming (OOP) concepts, design patterns, and general software architecture
• Knowledgeable in Cloud Platforms, such as Microsoft Azure (preferred) or AWS
• Experience in performance optimization, including multithreading, concurrency, and memory management (a concurrency sketch follows this list)
• Ability to interpret complex data, visualize results, and explain insights to technical and non-technical stakeholders
• Ability to collaborate with internal and external technology resources
• Ability to write and review detailed technical specifications
• Knowledge of software development methodologies (e.g., Scrum, Kanban, Waterfall)
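For illustration of the JUnit/Maven item above: a minimal JUnit 5 sketch. The OrderTotalsTest class and the orderTotal helper are hypothetical stand-ins for real transformation logic; in a Maven project, tests like these would typically run via the Surefire plugin.

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.junit.jupiter.api.Assertions.assertThrows;

    import org.junit.jupiter.api.Test;

    class OrderTotalsTest {

        // Hypothetical unit under test: pure computation kept separate from
        // Spark so it is cheap to exercise in isolation.
        static double orderTotal(int quantity, double unitPrice) {
            if (quantity < 0 || unitPrice < 0) {
                throw new IllegalArgumentException("negative input");
            }
            return quantity * unitPrice;
        }

        @Test
        void computesTotalForValidInput() {
            assertEquals(25.0, orderTotal(5, 5.0), 1e-9);
        }

        @Test
        void rejectsNegativeQuantity() {
            assertThrows(IllegalArgumentException.class, () -> orderTotal(-1, 5.0));
        }
    }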
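Similarly, a minimal plain-JDBC sketch for the SQL item. The MySQL URL, credentials, and customer_ltv table are placeholders, and the MySQL Connector/J driver is assumed to be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class CustomerLookup {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details; requires MySQL Connector/J at runtime.
            String url = "jdbc:mysql://localhost:3306/sales";
            try (Connection conn = DriverManager.getConnection(url, "user", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT customer_id, lifetime_value FROM customer_ltv "
                                 + "WHERE lifetime_value > ?")) {
                ps.setDouble(1, 1000.0);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.printf("%s -> %.2f%n",
                                rs.getString("customer_id"),
                                rs.getDouble("lifetime_value"));
                    }
                }
            }
        }
    }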
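And for the multithreading/concurrency item: a minimal sketch that fans independent partition work across a fixed thread pool. The partition names and the load helper are illustrative, not a real ingest step.

    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class PartitionLoader {
        public static void main(String[] args) throws Exception {
            List<String> partitions = List.of("2024-01", "2024-02", "2024-03");
            // Size the pool to the machine; a common starting point, not a rule.
            ExecutorService pool = Executors.newFixedThreadPool(
                    Runtime.getRuntime().availableProcessors());
            try {
                List<Callable<Long>> tasks = partitions.stream()
                        .map(p -> (Callable<Long>) () -> load(p))
                        .toList();
                long total = 0;
                for (Future<Long> f : pool.invokeAll(tasks)) {
                    total += f.get(); // rethrows any worker exception
                }
                System.out.println("rows loaded: " + total);
            } finally {
                pool.shutdown();
            }
        }

        // Hypothetical per-partition work; stands in for an ingest/validate step.
        static long load(String partition) {
            return partition.length() * 1000L; // placeholder row count
        }
    }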
Qualifications:
10+ years of software development experience
B.S. in Engineering/Computer Science or equivalent required
Munesh
770-838-3829
CYBER SPHERE LLC