Machine Learning Platform Engineer
Dubai, Dubai, United Arab Emirates
Delivery Hero
Delivery Hero - Always delivering an amazing experience.
Company Description
Since launching in Kuwait in 2004, talabat, the leading on-demand food and Q-commerce app for everyday deliveries, has been offering convenience and reliability to its customers. talabat’s local roots run deep, offering a real understanding of the needs of the communities we serve in eight countries across the region.
We harness innovative technology and knowledge to simplify everyday life for our customers, optimize operations for our restaurants and local shops, and provide our riders with reliable earning opportunities daily.
Here at talabat, we are building a high-performance culture through an engaged workforce and growing talent density. We're all about keeping it real and making a difference. Our 6,000+ strong talabaty are on an awesome mission to spread positive vibes. We are proud to be a multiple Great Place to Work award winner.
Job Description
Summary
As the leading delivery company in the region, we have a great responsibility and opportunity to impact the lives of millions of customers, restaurant partners, and riders. To realize our potential, we need to scale and evolve our machine learning capabilities across the company. This requires robust, efficient, and scalable ML platforms that empower teams to build, deploy, and operate intelligent systems with speed and reliability.
As an ML Platform Engineer, your mission is to build and maintain the infrastructure and tooling that accelerates the development, deployment, and monitoring of machine learning models in production. You’ll work closely with data scientists, ML engineers, and product teams to design seamless ML workflows, from experimentation to serving, and ensure a high standard of operational excellence in our ML systems.
Responsibilities
Design, build, and maintain scalable, reusable, and reliable machine learning platforms and tooling to support the full ML lifecycle: data ingestion, training, evaluation, deployment, and monitoring.
Collaborate with ML practitioners to understand their workflows and abstract them into flexible platform components and services.
Automate and streamline ML model training pipelines, model versioning, artifact management, and deployment workflows using modern MLOps practices.
Integrate with infrastructure components (e.g. feature stores, model registries, experiment tracking, orchestration engines) and cloud-native services to build robust systems.
Ensure reliability, observability, and scalability of production ML workloads; implement monitoring, alerting, and performance evaluation for deployed models.
Support reproducibility and governance in ML development by building infrastructure for metadata tracking, lineage, and auditability.
Drive engineering best practices within the ML platform including CI/CD, testing, documentation, and performance optimization.
Partner with data engineering, product, and infra teams to align ML platform initiatives with broader company goals and architecture.
Contribute to internal documentation, onboarding, and tooling adoption for data scientists and ML engineers.
Champion a platform mindset and improve developer productivity by reducing friction in the ML workflow.
Requirements
Technical Experience
Strong software engineering background with experience in building distributed systems or platforms, ideally focused on ML/AI use cases.
Proficiency in Python and experience with ML frameworks (e.g., TensorFlow, PyTorch, scikit-learn), with a focus on operationalizing models rather than developing novel ones.
Hands-on experience with ML infrastructure tooling: model training and serving platforms (e.g., Vertex AI, SageMaker, Kubeflow, MLflow, Ray, BentoML), orchestration frameworks (e.g., Airflow, Flyte, Dagster), and containerization (Docker, Kubernetes).
Familiarity with CI/CD pipelines, version control, and infrastructure-as-code (e.g., Terraform, Helm).
Experience working with cloud platforms, preferably GCP (BigQuery, Vertex AI, GKE, etc.).
Experience in building and managing feature stores, model registries, or experiment tracking systems is a plus.
Strong understanding of model lifecycle management, including monitoring for drift, decay, and data integrity.
Solid SQL skills and familiarity with data warehouse modeling; experience with streaming or batch data pipelines is a plus.
Understanding of the statistical and analytical needs of ML teams and the ability to translate them into scalable infrastructure.
Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field. A postgraduate degree is a plus but not required.
3+ years of experience in ML platform, ML infrastructure, or related roles.
Proven track record of building systems that enable ML practitioners to ship models faster and with higher reliability.
A system thinker with a product mindset and a passion for enabling others.
An excellent collaborator with strong communication skills.
High ownership, pragmatism, and a bias for action.
Tags: Airflow Architecture BentoML BigQuery CI/CD Computer Science Dagster Data pipelines Data warehouse Distributed Systems Docker Engineering GCP Helm Kubeflow Kubernetes Machine Learning MLFlow ML infrastructure ML models MLOps Model training Pipelines Python PyTorch SageMaker Scikit-learn SQL Statistics Streaming TensorFlow Terraform Testing Vertex AI
Perks/benefits: Career development, flex hours