Senior AI Engineer

Bangalore - Carina, India

Applications have closed

Red Hat

Red Hat is the world’s leading provider of enterprise open source solutions, including high-performing Linux, cloud, container, and Kubernetes technologies.

Posted 4 months ago

About the Job:

The Data Development Insights & Strategy (DDIS) team is seeking a Senior AI Engineer to design, scale, and maintain our AI model lifecycle framework within Red Hat's OpenShift AI and RHEL AI infrastructures. As a Senior AI Engineer, you will contribute to managing and optimizing large-scale AI models, collaborating with cross-functional teams to ensure high availability, continuous monitoring, and efficient integration of new model updates, while driving innovation through emerging AI technologies.

In this role, you will apply your expertise in AI, MLOps/LLMOps, cloud computing, and distributed systems to improve model performance, scalability, and operational efficiency. You'll work closely with the Products & Global Engineering (P&GE) and IT AI Infra teams to ensure seamless model deployment and maintenance in a secure, high-performance environment. This is an exciting opportunity to drive AI model advancements and contribute to the operational success of mission-critical applications.

What you will do:

Develop and maintain the lifecycle framework for AI models within Red Hat’s OpenShift and RHEL AI infrastructure, ensuring security, scalability and efficiency throughout the process.
Design, implement, and optimize CI/CD pipelines and automation for deploying AI models at scale using tools like Git, Jenkins, and Terraform, ensuring zero disruption during updates and integration.
Continuously monitor and improve model performance using tools such as OpenLLMetry, Splunk, and Catchpoint, while responding to performance degradation and model-related issues.
Work closely with cross-functional teams, including Products & Global Engineering (P&GE) and IT AI Infra teams, to seamlessly integrate new models or model updates into production systems with minimal downtime and disruption.
Enable a structured process for handling feature requests (RFEs), prioritization, and resolution, ensuring transparent communication and timely resolution of model issues.
Assist in fine-tuning and enhancing large-scale models, including foundation models like Mistral and Llama, while ensuring computational resources are optimally allocated (GPU management, cost management strategies).
Drive performance improvements, model updates, and releases on a quarterly basis, ensuring that all RFEs are processed and resolved within 30 days.
Collaborate with stakeholders to align AI model updates with evolving business needs, data changes, and emerging technologies.
Contribute to mentoring junior engineers, fostering a collaborative and innovative environment.

What you will bring:

A bachelor's or master’s degree in Computer Science, Data Science, Machine Learning, or a related technical field is required.
Hands-on experience that demonstrates your ability and interest in AI engineering and MLOps will be considered in lieu of formal degree requirements.
Experience programming in Python, with a strong understanding of machine learning frameworks and tools.
Experience working with cloud platforms such as AWS, GCP, or Azure, and familiarity with deploying and maintaining AI models at scale in these environments.
As a Senior AI Engineer, you will be most successful if you have experience working with large-scale distributed systems and infrastructure, especially in production environments where AI models and LLMs are deployed and maintained. You should be comfortable troubleshooting, optimizing, and automating workflows related to AI model deployment, monitoring, and lifecycle management. We value a strong ability to debug and optimize model performance and automate manual tasks wherever possible.
Additionally, you should be well-versed in managing AI model infrastructure using container platforms like Kubernetes and OpenShift, and have hands-on experience with performance monitoring tools (e.g., OpenLLMetry, Splunk, Catchpoint). We also expect you to have a solid understanding of GPU-based computing and resource optimization, with a background in high-performance computing (e.g., CUDA, vLLM, MIG, TGI, TEI).
Experience working in Agile development environments.
Ability to work collaboratively within cross-functional teams to solve complex problems and drive AI model updates, which will be key to your success in this role.

Desired skills:

5+ years of experience in AI or MLOps, with a focus on deploying, maintaining, and optimizing large-scale AI models in production.
Expertise in deploying and managing models in cloud environments (AWS, GCP, Azure) and containerized platforms like OpenShift or Kubernetes.
Familiarity with large-scale distributed systems and experience managing their performance and scalability.
Experience with performance monitoring and analysis tools such as OpenLLMetry, Prometheus, or Splunk.
Deep understanding of GPU-based deployment strategies and computational cost management.
Strong experience in managing model lifecycle processes, from training to deployment, monitoring, and updates.
Ability to mentor junior engineers and promote knowledge sharing across teams.
Excellent communication skills, both verbal and written, with the ability to engage with technical and non-technical stakeholders.
A passion for innovation and continuous learning in the rapidly evolving field of AI and machine learning.

This is an exciting opportunity for a Senior AI Engineer to contribute to the growing AI ecosystem at Red Hat, ensuring robust, scalable, and secure infrastructure for AI models. If you're looking for a challenging and rewarding role that blends technical excellence with business impact, we encourage you to apply.

About Red Hat

Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.

Diversity, Equity & Inclusion at Red Hat
Red Hat’s culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from diverse backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions of diversity that compose our global village.

Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.