PySpark Data Engineer

Hyderabad, India


DATAECONOMY

Enabling Businesses to Monetize Data at Data Speeds with cutting-edge Technology Services and Solutions: Big Data Management, Cloud Enablement, Data Science, and more.



Job Title: PySpark Data Engineer
Experience: 5–8 Years
Location: Hyderabad
Employment Type: Full-Time

Job Summary:

We are looking for a skilled and experienced PySpark Data Engineer to join our growing data engineering team. The ideal candidate will have 5–8 years of experience designing and implementing data pipelines using PySpark, AWS Glue, and Apache Airflow, with strong proficiency in SQL. You will be responsible for building scalable data processing solutions, optimizing data workflows, and collaborating with cross-functional teams to deliver high-quality data assets.

Key Responsibilities:

· Design, develop, and maintain large-scale ETL pipelines using PySpark and AWS Glue (a minimal sketch follows this list).
· Orchestrate and schedule data workflows using Apache Airflow.
· Optimize data processing jobs for performance and cost-efficiency.
· Work with large datasets from various sources, ensuring data quality and consistency.
· Collaborate with Data Scientists, Analysts, and other Engineers to understand data requirements and deliver solutions.
· Write efficient, reusable, and well-documented code following best practices.
· Monitor data pipeline health and performance; resolve data-related issues proactively.
· Participate in code reviews, architecture discussions, and performance tuning.
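
To ground the first two responsibilities, here is a minimal sketch of the kind of PySpark ETL job this role involves; the bucket, paths, and column names are hypothetical and not taken from the posting.

```python
# Minimal PySpark ETL sketch: extract raw CSV, clean it, load partitioned Parquet.
# All S3 paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV files (placeholder bucket/prefix).
orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/raw/orders/")
)

# Transform: de-duplicate, apply basic data-quality filters, derive a partition column.
cleaned = (
    orders
    .dropDuplicates(["order_id"])
    .filter(F.col("amount").isNotNull() & (F.col("amount") > 0))
    .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write columnar Parquet, partitioned by date for cheaper downstream scans.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/orders/")
)

spark.stop()
```

On AWS Glue, the same transformation logic would typically sit inside a Glue job script, with the SparkSession obtained from a GlueContext rather than built directly as above.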



Requirements:

· 5–8 years of experience in data engineering roles.
· Strong expertise in PySpark for distributed data processing.
· Hands-on experience with AWS Glue and other AWS data services (S3, Athena, Lambda, etc.).
· Experience with Apache Airflow for workflow orchestration (see the DAG sketch after this list).
· Strong proficiency in SQL for data extraction, transformation, and analysis.
· Familiarity with data modeling concepts and data lake/data warehouse architectures.
· Experience with version control systems (e.g., Git) and CI/CD processes.
· Ability to write clean, scalable, and production-grade code.
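
As a companion to the Airflow requirement above, here is a minimal sketch of a DAG that schedules the (hypothetical) Glue job from the earlier example; it assumes Airflow 2.x with the apache-airflow-providers-amazon package installed and an AWS connection configured.

```python
# Minimal Airflow DAG sketch: run a Glue job once per day.
# The dag_id and job_name are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # one run per day
    catchup=False,       # do not backfill past runs
) as dag:
    run_orders_etl = GlueJobOperator(
        task_id="run_orders_etl",
        job_name="orders_etl",        # name of an existing Glue job (hypothetical)
        wait_for_completion=True,     # block until the Glue run finishes or fails
    )
```

In practice, such a DAG would chain further tasks (data-quality checks, Athena table refreshes, notifications) downstream of the Glue run.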



Benefits:

Company standard benefits.


Category: Engineering Jobs

Tags: Airflow Architecture Athena AWS AWS Glue CI/CD Data pipelines Data quality Data warehouse Engineering ETL Git Lambda Pipelines PySpark SQL

Perks/benefits: Health care

Region: Asia/Pacific
Country: India
