Data Engineer
India
Important Information
Location: PAN India
Experience: 6+ Years
Job Mode: Full-time
Work Mode: Work from home
Job Summary
We are seeking a highly skilled Data Engineer with a strong focus on Apache Airflow to play a critical role in a large-scale data modernization project. The ideal candidate will design, implement, and manage robust data pipelines that use Apache Airflow to orchestrate complex data workflows. The work centers on migrating legacy SQL Server Agent jobs to Airflow and automating critical data processes, including those currently handled by client FTEs and those related to real estate Deeds data.
Responsibilities and Duties
Apache Airflow Development and Implementation:
- Design, develop, and maintain scalable and efficient data pipelines using Apache Airflow to orchestrate a wide range of data processes, including data ingestion, transformation, validation, and loading.
- Specifically focus on migrating existing SQL Server Agent jobs related to Deeds and parcel data processes to Apache Airflow.
- Collaborate with the client’s internal data engineers, who will be assigned part-time to guide and support the Airflow implementation.
- Take ownership of automating critical data processes currently managed by one client FTE, ensuring a smooth transition and knowledge transfer.
Data Pipeline Orchestration and Integration:
- Seamlessly integrate Apache Airflow workflows with existing AWS Glue and dbt processes to create a unified and cohesive data pipeline orchestration system.
- Implement robust monitoring, logging, and alerting mechanisms for Airflow pipelines to ensure data quality, identify potential issues, and facilitate proactive problem resolution.
- Contribute to the development and maintenance of comprehensive documentation for all Airflow pipelines, ensuring clarity and ease of maintenance for the team.
Collaboration and Communication:
- Work closely with the Solution Architect, Data Architect, Data Migration Specialist, Cloud Engineer, and other team members to ensure seamless integration of Airflow pipelines within the broader data platform modernization project.
- Actively participate in technical discussions and decision-making processes, providing insights and expertise on Airflow best practices and implementation strategies.
- Communicate effectively with stakeholders, providing clear and concise updates on the progress of Airflow development and implementation, addressing any concerns, and ensuring alignment with project goals.
Qualifications
- 5+ years of hands-on experience developing and managing data pipelines using Apache Airflow in a production environment.
- Proven experience migrating legacy orchestration systems, such as SQL Server Agent jobs, to Apache Airflow.
- Strong proficiency in Python and SQL, with a deep understanding of data structures, algorithms, and best practices for writing efficient and maintainable code.
- Familiarity with AWS cloud services relevant to data processing, including S3, EMR, Glue, and Kinesis.
- Experience working with dbt for data transformation and modeling.
- Excellent problem-solving and debugging skills, with the ability to identify and resolve complex data pipeline issues effectively.
- Strong communication and collaboration skills, with the ability to work effectively in a team environment and interact with technical and non-technical stakeholders.
Preferred Qualifications
- Experience with cloud-native solutions on AWS, including Amazon Aurora and Amazon S3.
- Familiarity with data governance and security best practices.
- Experience with DevOps practices and CI/CD pipelines.
- Contributions to the Apache Airflow open-source community.
About Encora
Encora is the preferred digital engineering and modernization partner of some of the world’s leading enterprises and digital native companies. With over 9,000 experts in 47+ offices and innovation labs worldwide, Encora’s technology practices include Product Engineering & Development, Cloud Services, Quality Engineering, DevSecOps, Data & Analytics, Digital Experience, Cybersecurity, and AI & LLM Engineering.
At Encora, we hire professionals based solely on their skills and qualifications, and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.