ML/LLM Operations Engineer
Work at Home, United States
Full Time Mid-level / Intermediate USD 130K - 150K
- Remote-first
Evolent
Evolent Health's family of brands is coming together under a single name, simply "Evolent", to improve outcomes for people with the most complex and costly health conditions.
Your Future Evolves Here
Evolent partners with health plans and providers to achieve better outcomes for people with the most complex and costly health conditions. Working across specialties and primary care, we seek to connect the pieces of a fragmented health care system and ensure people get the same level of care and compassion we would want for our loved ones.
Evolent employees enjoy work/life balance, the flexibility to suit their work to their lives, and the autonomy they need to get things done. We believe that people do their best work when they're supported to live their best lives, and when they feel welcome to bring their whole selves to work. That's one reason why diversity and inclusion are core to our business.
Join Evolent for the mission. Stay for the culture.
What You’ll Be Doing:
We are seeking a skilled ML/LLM Operations Engineer to join our Data Science team at Evolent Health to ensure our AI systems deliver consistent, reliable, and compliant results in healthcare settings. This role is perfect for someone who thrives at the intersection of machine learning, operations, and healthcare compliance.
The role combines a passion for operational excellence in AI systems with a meticulous approach to monitoring, evaluation, and regulatory compliance in healthcare applications.
Collaboration Opportunities: This position will play a critical role partnering with our Data Science and Engineering teams while also interacting with cross-functional organizations including DevOps, Compliance, Quality Assurance, and Product Management to ensure our AI systems operate reliably and meet all healthcare industry requirements.
What You Will Be Doing:
- Build and maintain comprehensive monitoring systems for deployed AI/ML models to track performance, detect drift, and alert the team to anomalies
- Develop standardized evaluation frameworks to consistently measure AI feature performance across relevant healthcare metrics
- Oversee regulatory compliance processes, including documentation for bias assessments, model cards, and audit trails required in healthcare
- Support the transition from successful POCs to production-ready services with testing, validation, and monitoring infrastructure
- Configure and maintain Docker container environments for AI microservices
- Create and maintain documentation, runbooks, and operational procedures for all deployed AI systems
- Coordinate with DevOps on infrastructure requirements and optimization
- Implement A/B testing infrastructure for comparing model versions in production settings
- Prepare regular reports on system health, performance, and compliance status
Qualifications Required and Preferred:
- Bachelor's or master's degree in computer science, data science, or related field
- 3+ years of experience in MLOps, DevOps, or similar operational roles supporting AI/ML systems
- Strong proficiency in Python and experience with ML/AI frameworks (PyTorch, TensorFlow, Hugging Face, etc.)
- Experience with monitoring tools and practices for AI systems, including performance metrics, drift detection, and alerting
- Knowledge of containerization technologies like Docker and Kubernetes for deploying and scaling AI services
- Familiarity with healthcare compliance requirements for AI systems (preferred)
- Experience with cloud environments (AWS, Azure) for AI deployments
- Knowledge of CI/CD pipelines and automation for AI model deployment
- Experience with logging and monitoring tools (Prometheus, Grafana, ELK stack, etc.)
- Understanding of LLM evaluation metrics and techniques
- Experience with API development and maintenance (FastAPI, Flask, etc.)
- Excellent documentation skills and attention to detail
- Strong communication skills for cross-functional collaboration
Technical Requirements:
We require that all employees have the following technical capability at their home: High speed internet over 10 Mbps and, specifically for all call center employees, the ability to plug in directly to the home internet router. These at-home technical requirements are subject to change with any scheduled re-opening of our office locations.
Evolent is an equal opportunity employer and considers all qualified applicants equally without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status. If you need reasonable accommodation to access the information provided on this website, please contact recruiting@evolent.com for further assistance.
The expected base salary/wage range for this position is $130,000 - $150,000. This position is also eligible for a bonus component that would be dependent on pre-defined performance factors. As part of our total compensation package, Evolent is proud to offer comprehensive benefits (including health insurance benefits) to qualifying employees. All compensation determinations are based on the skills and experience required for the position and commensurate with experience of selected individuals, which may vary above and below the stated amounts.
Tags: A/B testing, API Development, APIs, AWS, Azure, CI/CD, Computer Science, DevOps, Docker, ELK, Engineering, FastAPI, Flask, Grafana, Kubernetes, LLMs, Machine Learning, Microservices, ML models, MLOps, Model deployment, Pipelines, Python, PyTorch, TensorFlow, Testing
Perks/benefits: Career development Health care Insurance