Engineer - AI OPS Engineer
Chennai, TN, India
Sutherland
As a digital transformation company, Sutherland leverages AI, Analytics, Cloud, and Automation to transform operations, drive innovation, and engineer digital outcomes for global enterprises.
Company Description
About Sutherland
Artificial Intelligence. Automation. Cloud engineering. Advanced analytics. For business leaders, these are key factors of success. For us, they’re our core expertise.
We work with iconic brands worldwide. We bring them a unique value proposition through market-leading technology and business process excellence.
We’ve created over 200 unique inventions under several patents across AI and other critical technologies. Leveraging our advanced products and platforms, we drive digital transformation, optimize critical business operations, reinvent experiences, and pioneer new solutions, all provided through a seamless “as a service” model.
For each company, we provide new keys for their businesses, the people they work with, and the customers they serve. We tailor proven and rapid formulas to fit their unique DNA. We bring together human expertise and artificial intelligence to develop digital chemistry. This unlocks new possibilities, transformative outcomes and enduring relationships.
Sutherland
Unlocking digital performance. Delivering measurable results.
Job Description
We are looking for a proactive and detail-oriented AI OPS Engineer to support the deployment, monitoring, and maintenance of AI/ML models in production. Reporting to the AI Developer, this role will focus on MLOps practices including model versioning, CI/CD, observability, and performance optimization in cloud and hybrid environments.
Key Responsibilities:
- Build and manage CI/CD pipelines for ML models using platforms like MLflow, Kubeflow, or SageMaker.
- Monitor model performance and health using observability tools and dashboards.
- Ensure automated retraining, version control, rollback strategies, and audit logging for production models.
- Support deployment of LLMs, RAG pipelines, and agentic AI systems in scalable, containerized environments.
- Collaborate with AI Developers and Architects to ensure reliable and secure integration of models into enterprise systems.
- Troubleshoot runtime issues, latency, and accuracy drift in model predictions and APIs.
- Contribute to infrastructure automation using Terraform, Docker, Kubernetes, or similar technologies.
Qualifications
Required Qualifications:
- 3–5 years of experience in DevOps, MLOps, or platform engineering roles with exposure to AI/ML workflows.
- Hands-on experience with deployment tools like Jenkins, Argo, GitHub Actions, or Azure DevOps.
- Strong scripting skills (Python, Bash) and familiarity with cloud environments (AWS, Azure, GCP).
- Understanding of containerization, service orchestration, and monitoring tools (Prometheus, Grafana, ELK).
- Bachelor’s degree in computer science, IT, or a related field.
Preferred Skills:
- Experience supporting GenAI or LLM applications in production.
- Familiarity with vector databases, model registries, and feature stores.
- Exposure to security and compliance standards in model lifecycle management.