MLOps Engineer

Düsseldorf, North Rhine-Westphalia, Germany - Remote

Full Time, EUR 69K - 129K (estimated *)

Cognigy

Generative and conversational AI-powered customer service agents for your business.

Posted 3 weeks ago

About Cognigy

Cognigy is transforming the customer service industry with the most advanced AI Agent platform for enterprise contact centers. Its award-winning solution, Cognigy.AI, empowers enterprises to deliver instant, hyper-personalized, multilingual service on any channel. By integrating Generative and Conversational AI to create Agentic AI, Cognigy delivers AI Agents that redefine customer experiences, drive satisfaction, and support contact center employees in real time.

Our skilled #CognigyCrew are the people behind our leading technology, and we are now looking for more talented people to join our global team.

Why you’ll love working at Cognigy - Our promise to you

We empower our people to be successful as part of a diverse, passionate and respectful team who are proud to be enabling customer and employee service that is loved by everyone.

We do this by challenging each other to succeed and by enabling everyone to do their best work. Encouraging and supporting growth is at the heart of our success, founded on a culture of mutual respect and trust – always! It’s no wonder that the values that inspire and drive our #CognigyCrew are our 4Ts: Team, Trust, Transparency, Technology.

Your new role – MLOps Engineer

Location: On-site in Düsseldorf or remote in Germany

We are looking for a skilled and ambitious MLOps Engineer to join our Engineering team and take ownership of building and operating scalable, secure infrastructure for Large Language Models (LLMs). You will support our Machine Learning, Product, and SRE teams in deploying and maintaining production-grade AI workloads on Kubernetes using cutting-edge technologies like KubeRay.

You’ll help ensure optimal performance, reliability, observability, and cost-efficiency of Cognigy’s AI infrastructure, automating processes and championing modern MLOps best practices.

Your responsibilities will include:

Build & Operate LLM Infrastructure – Design and maintain scalable LLM-serving systems using Kubernetes and KubeRay.
Automate & Optimize – Automate deployments, rollbacks, and scaling of LLMs while optimizing resource usage and performance.
Enhance Observability – Ensure robust monitoring, logging, and alerting for LLM operations (Prometheus, Grafana, etc.).
Support AI Teams – Empower ML and product engineers with self-service pipelines and scalable infrastructure.
Prioritize Security – Enforce secure deployments, compliance practices, and robust incident response strategies.
Improve Documentation – Create and maintain technical documentation to streamline knowledge sharing and onboarding.
Drive Innovation – Evaluate, adopt, and integrate the latest MLOps and LLM-serving technologies.
Reduce SRE Toil – Eliminate repetitive tasks and improve operational efficiency across the platform.

Growth Potential

At Cognigy, we are committed to your professional growth. This role offers significant opportunities for career development, including access to ongoing training and involvement in high-impact projects that allow you to showcase and advance your unique skills and experience.

Requirements

About you

Hands-on experience running production ML or LLM workloads in Kubernetes
Familiarity with distributed ML frameworks such as KubeRay, Ray Serve, or similar
Deep understanding of Kubernetes internals, especially GPU scheduling, autoscaling, and multi-tenant environments
Proficiency with CI/CD systems for ML models and with versioned deployment strategies
Strong experience with cloud platforms (AWS, GCP, or Azure), networking, and security best practices
Skilled in monitoring and observability for ML workloads (e.g., Prometheus, Grafana)
Passion for automation, performance tuning, and cost optimization for LLM workloads
Clear communicator and proactive team player who thrives in fast-paced, cross-functional environments
MLOps or DevOps certifications (nice to have)

Benefits

Life at Cognigy - What we offer you

We are an ambitious and international tech company with a great culture, and we make sure that everyone feels welcome. Our excellent benefits make us a fantastic place to work. These include:

Attractive and performance-oriented salary
Company Pension Scheme
25 days paid leave, plus 5 floating days, plus public holidays
Unique opportunity to help build and shape the company, with little hierarchy
Flexible working options
Colleague recognition, reward and celebration events
Global Employee Assistance Program
ClassPass membership, giving you access to a variety of fitness and wellness experiences
Ongoing learning and development opportunities, including Udemy
One paid ‘Giving Back Day’ each year, so you can volunteer for a charity or community activity of your choice
Subscription to the Calm app for you plus five friends/family members, giving you access to guided meditation, sleep stories, music, masterclasses, and much more

Equal Opportunity Employer Statement - Cognigy does not discriminate on the basis of race, sex, color, religion, age, national origin, marital status, disability, veteran status, genetic information, sexual orientation, gender identity, or any other reason prohibited by law in the provision of employment opportunities and benefits.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Categories: Engineering Jobs, Machine Learning Jobs, MLOps Jobs

Tags: AWS, Azure, CI/CD, Conversational AI, DevOps, Engineering, GCP, GPU, Grafana, Kubernetes, LLMs, Machine Learning, ML infrastructure, ML models, MLOps, Pipelines, Security