ML Ops Engineer - Lead

Hyderabad, TS, India

Blend360

Blend360 co-creates value with leading companies through the integration of data, advanced analytics, technology & people. Get in touch with us today.

Company Description

Blend is a premier AI services provider, committed to co-creating meaningful impact for its clients through the power of data science, AI, technology, and people. With a mission to fuel bold visions, Blend tackles significant challenges by seamlessly aligning human expertise with artificial intelligence. The company is dedicated to unlocking value and fostering innovation for its clients by harnessing world-class people and data-driven strategy. We believe that the power of people and AI can have a meaningful impact on your world, creating more fulfilling work and projects for our people and clients. For more information, visit www.blend360.com

Job Description

We are seeking a highly skilled Lead MLOps Engineer with 6+ years of hands-on experience in DevOps/MLOps, including 3+ years building and managing machine learning pipelines, specifically in on-premises environments. The ideal candidate will have successfully delivered at least two end-to-end MLOps projects on on-premises or private cloud infrastructure, with expertise in automation, infrastructure management, and MLOps best practices.

Key Responsibilities

  • Design, build, and maintain scalable ML pipelines in on-premises environments, ensuring high availability and reliability.

  • Implement and manage Infrastructure as Code (IaC) using tools like Ansible, Terraform (for private cloud), or Puppet for development, testing, and production setups.

  • Extend and enhance existing ML workflows to support evolving data science needs with minimal but impactful changes.

  • Act as the infrastructure and MLOps SME, collaborating with data scientists to guide and support model deployment and operationalization.

  • Document system architecture, infrastructure usage, and design via tools like Confluence, GitHub Wikis, and architectural diagrams.

  • Research and implement optimizations for ML workflows, compute resource utilization, and storage management.

  • Lead cross-functional initiatives related to ML product deployment, re-platforming, and modernization efforts in the on-prem environment.

Qualifications

  • 6+ years of hands-on DevOps / MLOps experience in on-premises or private cloud environments.

  • Proven track record of delivering at least two end-to-end MLOps projects on on-premises or private cloud infrastructure.

  • Expertise in containerization technologies (e.g., Docker, Podman) and container orchestration (e.g., Kubernetes, OpenShift, Rancher).

  • Strong skills in Infrastructure as Code using tools like Ansible, Puppet, or Terraform (private cloud).

  • Proficiency in scripting and automation with Python and Bash.

  • Solid understanding of MLOps principles, CI/CD, model versioning, and monitoring practices.

  • Experience managing version control systems (e.g., Git, GitHub, GitLab) and automating pipelines (e.g., GitHub Actions, Jenkins, GitLab CI).

  • Experience building and maintaining Docker images and registries in an on-prem setup.

  • Strong collaboration and communication skills, with experience working alongside data science and IT teams.

Tags: Ansible Architecture CI/CD Confluence DevOps Docker Git GitHub GitLab Jenkins Kubernetes Machine Learning MLOps Model deployment Pipelines Puppet Python Research Terraform Testing

Region: Asia/Pacific
Country: India
