Data Engineer
Pune, Maharashtra, India
Weekday
At Weekday, we help companies hire engineers who are vouched for by other software engineers. We enable engineers to earn passive income by leveraging and monetizing the untapped knowledge they hold about the best people they have worked with. This role is for one of Weekday's clients.
Min Experience: 2 years
Location: Pune
JobType: full-time
We are seeking a skilled and motivated Data Engineer with 2–4 years of hands-on experience in building and maintaining automated, production-grade data pipelines. The ideal candidate will be passionate about working on modern data platforms, enabling scalable and efficient data movement and transformation workflows. Exposure to MLOps practices and the pharmaceutical domain will be an added advantage.
You will collaborate closely with data scientists, ML engineers, business analysts, and DevOps teams to develop robust data solutions that are integral to our data-driven initiatives. This role requires a solid understanding of data engineering tools and best practices, cloud technologies, orchestration frameworks, and Agile processes.
Requirements
Key Responsibilities:
- Design, build, and deploy scalable data pipelines using PySpark and related big data technologies.
- Develop and manage end-to-end CI/CD pipelines to ensure reliable deployment and testing of data workflows.
- Implement orchestration of data workflows using Argo Workflows and Kedro pipelines.
- Integrate with cloud-native data platforms, preferably on AWS (e.g., S3, Glue, EMR, Lambda, Step Functions).
- Ensure robust monitoring, alerting, and logging for production pipelines using best-in-class practices.
- Collaborate with MLOps teams to support ML model deployment, versioning, and lifecycle management.
- Participate in Agile/Scrum ceremonies, effectively using tools like JIRA and Confluence for sprint tracking and documentation.
- Optimize performance and reliability of existing data pipelines, ensuring high data quality and availability.
- Contribute to the design of data architecture and ensure best practices for data governance, security, and compliance.
Skills and Qualifications:
- 2–4 years of relevant work experience as a Data Engineer or in a similar role.
- Proficiency in PySpark for big data processing and transformation.
- Experience with CI/CD tools and practices for automating deployments in data engineering workflows.
- Hands-on experience with AWS services, especially data and compute services such as S3, Glue, EMR, Lambda, and CloudWatch.
- Familiarity with orchestration frameworks like Argo Workflows and pipeline development using Kedro.
- Understanding of MLOps concepts and how data engineering integrates into the machine learning lifecycle.
- Exposure to Agile methodologies and tools like JIRA and Confluence for project collaboration and documentation.
- Excellent problem-solving skills, attention to detail, and ability to work independently and within a team.
Preferred Qualifications:
- Prior experience in the pharmaceutical or life sciences domain.
- Knowledge of data governance, data security, and compliance requirements in regulated industries.
- Familiarity with Docker or containerization for reproducible data environments is a plus.