Job Title: PySpark Consultant
Location: Irving, TX (onsite 3 days per week)
Experience Required: Minimum 10 Years
Responsibilities:
- Experience with big data processing and distributed computing systems like Spark.
- Implement ETL pipelines and data transformation processes.
- Ensure data quality and integrity in all data processing workflows.
- Troubleshoot and resolve issues related to PySpark applications and workflows.
- Understand sources, dependencies, and data flow in converted PySpark code.
- Strong programming skills in Python and SQL.
- Experience with big data technologies like Hadoop, Hive, and Kafka.
- Understanding of data warehousing concepts and relational (SQL) databases.
- Demonstrate and document code lineage.
- Integrate PySpark code with frameworks such as the Ingestion Framework and DataLens.
- Ensure compliance with data security, privacy regulations, and organizational standards.
- Knowledge of CI/CD pipelines and DevOps practices.
- Strong problem-solving and analytical skills.
- Excellent communication and leadership abilities.
Qualifications:
- 4+ years of experience in big data development with the Hadoop, Hive, and Spark frameworks.
- Experience with SAS is a plus.
- Strong Python, PySpark development, and SQL skills.
- Certification in big data or cloud technologies is preferred.
- Excellent communication, collaboration, and problem-solving skills.
