
C2C Job opening – Data Engineer with Cloud Data Integration & Transformation – Remote

Job Title: Data Engineer with Cloud Data Integration & Transformation

Location: Remote (with some travel to NC; the client will pay for travel)

Duration: 12 Months


Note:

  1. Candidates must use their own laptop (minimum 16 GB RAM).
  2. Some travel to NC will be required (the client will pay for travel).

 

About the Role:

We are seeking a hands-on Data Engineer to develop and maintain scalable data pipelines and transformation routines within a modern Azure + Databricks environment. This role is focused on executing ingestion, cleansing, standardization, matching, merging, and enrichment of complex legacy datasets into a governed data Lakehouse architecture.

The ideal candidate brings deep experience with Spark (PySpark), Delta Lake, Azure Data Factory, and data wrangling techniques — and is comfortable working in a structured, code-managed, team-based delivery environment.


Key Responsibilities:

 

Data Cleansing & Transformation:

  • Apply cleansing logic for deduplication, parsing, standardization, and enrichment based on business rule definitions.
  • Use the Spark-Cobol library to parse EBCDIC/COBOL-formatted VSAM files into structured DataFrames (see the first sketch after this list).
  • Maintain 'bronze → silver → gold' structured layers and ensure quality during data transformations (see the second sketch after this list).
  • Support classification and mapping logic in collaboration with analysts and architects.
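
For illustration, a minimal PySpark sketch of the VSAM parsing step, assuming the ABSA Cobrix (spark-cobol) connector is attached to the cluster; the copybook path, record format, and file locations are placeholders:

    # In a Databricks notebook, `spark` is the active SparkSession.
    # Hedged sketch: parse an EBCDIC/COBOL-formatted VSAM extract with the
    # spark-cobol (Cobrix) connector. Paths and options are illustrative.
    raw_df = (
        spark.read.format("cobol")
        .option("copybook", "/mnt/landing/copybooks/customer.cpy")  # COBOL layout definition
        .option("encoding", "ebcdic")                               # source files are EBCDIC
        .option("record_format", "V")                               # variable-length records; adjust per extract
        .load("/mnt/landing/vsam/customer/")
    )

    raw_df.printSchema()  # columns follow the copybook field names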

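And a minimal sketch of the cleansing and bronze → silver promotion in Delta Lake; table names, columns, and rules are hypothetical stand-ins for the business rule definitions:

    from pyspark.sql import functions as F

    # Hedged sketch: promote a bronze table to silver with simple cleansing rules.
    # Table and column names are hypothetical.
    bronze_df = spark.read.table("bronze.customer_raw")

    silver_df = (
        bronze_df
        .withColumn("customer_name", F.initcap(F.trim("customer_name")))                  # standardize casing/whitespace
        .withColumn("postal_code", F.regexp_replace("postal_code", r"[^0-9A-Za-z]", ""))  # strip stray characters
        .dropDuplicates(["customer_id"])                                                   # deduplicate on the business key
        .filter(F.col("customer_id").isNotNull())                                          # drop unusable records
    )

    (
        silver_df.write.format("delta")
        .mode("overwrite")
        .saveAsTable("silver.customer_clean")  # gold aggregates are built on top of this layer
    )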
 

Observability, Testing & Validation:


Collaboration & Profiling:

  • Collaborate with architects to ensure alignment with data platform standards and governance models.
  • Work with analysts and SMEs to profile data, refine cleansing logic, and conduct variance analysis using Databricks Notebooks and Databricks SQL Warehouse (see the profiling sketch after this list).
  • Support metric publication and lineage registration using Microsoft Purview and Unity Catalog and contribute to profiling datasets for Power BI consumption.
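
For illustration, a small PySpark sketch of the kind of column profiling that feeds this variance analysis; the table name is hypothetical:

    from pyspark.sql import functions as F

    # Hedged sketch: per-column profile (null rate, distinct count) for a silver table.
    df = spark.read.table("silver.customer_clean")
    total = max(df.count(), 1)

    profile = [
        (
            c,
            df.filter(F.col(c).isNull()).count() / total,  # null rate
            df.select(c).distinct().count(),               # distinct values
        )
        for c in df.columns
    ]

    profile_df = spark.createDataFrame(profile, ["column", "null_rate", "distinct_count"])
    profile_df.show(truncate=False)  # review in the notebook or publish via Databricks SQL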

 

 Required Skills & Experience:

  • 5+ years of experience in data engineering or ETL development roles.

 

Proficiency in:
  • Databricks, PySpark, SQL
  • Delta Lake and Azure Data Lake Storage Gen2
  • Azure Data Factory for orchestration and event-driven workflows

 

Experience with:
  • Cleansing, deduplication, parsing, and merging of high-volume datasets
  • Parsing EBCDIC/COBOL-formatted VSAM files using the Spark-Cobol library
  • Connecting to Db2 databases using JDBC drivers for ingestion (see the sketch after this list)
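
For illustration, a minimal sketch of a Db2-to-bronze ingestion over JDBC, assuming the IBM Db2 JCC driver is installed on the cluster and credentials sit in an Azure Key Vault-backed secret scope; host, database, table, and secret names are placeholders:

    # Hedged sketch: ingest a Db2 table over JDBC into the bronze layer.
    # Host, database, table, user, and secret scope/key names are placeholders.
    db2_password = dbutils.secrets.get(scope="ingestion-kv", key="db2-password")  # Key Vault-backed secret scope

    db2_df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:db2://db2-host.example.com:50000/SAMPLEDB")
        .option("driver", "com.ibm.db2.jcc.DB2Driver")   # IBM Db2 JDBC (JCC) driver class
        .option("dbtable", "SCHEMA1.CUSTOMER")
        .option("user", "etl_user")
        .option("password", db2_password)
        .option("fetchsize", "10000")                    # rows fetched per round trip
        .load()
    )

    db2_df.write.format("delta").mode("append").saveAsTable("bronze.customer_db2")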

 

Familiarity with:
  • Git, Azure DevOps Repos & Pipelines
  • Great Expectations or dbt for validation
  • Azure Monitor + Log Analytics for job tracking and alerting
  • Azure Key Vault for secrets and credentials
  • Microsoft Purview and Unity Catalog for metadata and lineage registration

 

Regards,

Lakkumanan S

DataCaliper LLC

+1(919) 666-6447

lakkumanan.s@datacaliper.com


