Lead Data Engineer (ML Ops)
KL Sentral - Redstation
AirAsia
Download AirAsia MOVE today and get only the best deals on flights, hotels, rides and more! Completing your travel, all in one app.
Job Description
Why AirAsia Move
Are you ready to take off and be part of AirAsia Move? AirAsia Move is the latest offering from AirAsia Group, focused on leveraging cutting-edge technologies to build and scale data-driven products and services across the travel, e-commerce, and logistics ecosystems. We are transforming the way people experience travel, offering seamless access to travel-related services. As part of our expansion, we are looking for a talented Lead Data Engineer to join our growing team and help us unlock the full potential of our data capabilities.
Key Responsibilities:
1. Data Architecture & Pipeline Development:
Lead the design and implementation of scalable, reliable, and optimized data pipelines that enable seamless data ingestion, transformation, and delivery.
Architect, build, and manage end-to-end data workflows, ensuring efficient data processing for both real-time and batch systems.
Work with stakeholders across product, analytics, and engineering teams to ensure data solutions meet business needs.
Optimize and maintain data pipelines for performance, scalability, and cost-efficiency.
Lead the design, development, and optimization of MLOps pipelines and infrastructure to streamline the deployment, monitoring, and maintenance of machine learning models at scale.
Drive the adoption of best practices for continuous integration, continuous deployment (CI/CD), and automated testing in machine learning workflows.
Define and enforce standards for model versioning, governance, and lifecycle management.
2. Model Deployment & Monitoring:
Design and implement automated workflows for deploying machine learning models in production environments, ensuring models are delivered on time and meet required performance metrics.
Implement model monitoring systems to ensure the ongoing health and performance of models in production, including model drift detection, data quality monitoring, and performance alerting.
Work with data scientists to ensure models are deployable, reproducible, and maintainable in production environments.
Oversee the operationalization of machine learning models, ensuring scalability, efficiency, and performance of both batch and real-time systems.
3. Cloud Infrastructure & Data Integration:
Leverage cloud platforms (e.g., Google Cloud) to build and scale data solutions.
Work with various data integration technologies (e.g., APIs, ETL tools, Kafka, Pub/Sub) to ensure seamless data flow between systems.
Implement data versioning, lineage, and governance to maintain data integrity, security, and compliance.
4. Data Quality & Monitoring:
Ensure that data quality, accuracy, and consistency are maintained across all data pipelines and systems.
Implement and monitor data quality checks, logging, and alerting to ensure early detection of issues in the data pipelines.
Continuously evaluate and improve data architectures, ensuring data availability and minimizing downtime.
5. Collaboration, Leadership & Team Management:
Collaborate with cross-functional teams, including data scientists, engineers, product managers, tech, and DevOps, to understand business needs and deliver impactful AI/ML solutions.
Lead a team of data and MLOps engineers, providing mentorship, guidance, and performance feedback to foster a collaborative and innovative team culture.
Promote best practices in data engineering, code quality, testing, and deployment.
Work with senior leadership and other departments to define and prioritize data-related initiatives and align with business objectives.
Lead the development of technical roadmaps and strategies for long-term data infrastructure goals.
Document processes, workflows, and best practices for knowledge sharing across teams.
6. Innovation & Continuous Improvement:
Stay up to date with the latest trends, tools, and technologies in the data engineering space.
Propose innovative solutions to improve the speed, quality, and scalability of data systems.
Drive a culture of continuous improvement by proactively identifying areas for optimization and automation.
Required Qualifications:
Education & Experience:
Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field, or equivalent practical experience.
6+ years of hands-on experience in data engineering, including at least 3 years in a leadership or management role.
Proven experience in building large-scale, cloud-based data infrastructure and data pipelines for high-volume, high-velocity data.
Strong experience with ETL tools, data orchestration frameworks (e.g., Apache Airflow), and batch/streaming data processing (e.g., Apache Kafka, Spark).
Expertise in working with cloud platforms such as Google Cloud Platform (GCP).
Technical Skills:
Proficiency in Python, Java, or Scala; experience with SQL and NoSQL databases (e.g., BigQuery, MongoDB, PostgreSQL).
Deep understanding of data warehousing concepts and experience in designing and managing data models.
Proficiency in GCP services such as Vertex AI, Kubeflow, TensorFlow Extended (TFX), Google Kubernetes Engine (GKE), BigQuery ML, Cloud Functions & Cloud Run, BigQuery, Cloud Storage, Composer, Cloud Pub/Sub & Dataflow, AI Building Blocks, and Model Monitoring & Explainability.
Experience with data governance, data quality, and data privacy best practices.
Familiarity with containerization technologies (Docker, Kubernetes) and infrastructure-as-code (Terraform, CloudFormation).
Soft Skills:
Strong leadership, communication, and collaboration skills, with experience managing cross-functional teams.
Excellent problem-solving and analytical abilities, with a focus on scalability and efficiency.
Ability to translate complex technical concepts into business value for non-technical stakeholders.
Passionate about mentoring and developing teams to reach their potential.
Perks/benefits: Career development