Lead Machine Learning & AI Evaluation Engineer

Boston MA, United States

Full Time, Senior-level / Expert, USD 147K - 273K (est.) *

athenahealth

Join 150K providers on the largest connected network in healthcare. See how we’re making connections that improve patient outcomes & clinician experiences.

Posted 10 hours ago

Join us as we work to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all.

RESEARCHER – AI EVALUATION (VERIFICATION & VALIDATION)

We are looking for a Lead Machine Learning Engineer focusing on AI Evaluation to join the Research team in the Core AI Subdivision. Evaluating LLMs, and applications that integrate LLMs and agents, presents unique challenges compared with traditional software or machine learning models because of their inherently non-deterministic nature and the complexity of assessing the quality of their multimodal outputs. Effective verification and validation of these systems is paramount for ensuring accuracy, reliability, safety, and user trust. You will work with the team to establish scalable methodologies, designs, and tooling to accomplish this.

About you:

You love to own important work and find it difficult to turn down a good challenge. You are excited about the latest developments in AI & ML and keep abreast of the latest models, methods, and technologies. You have experience building, tuning, evaluating, and deploying ML models at scale. You have strong communication skills and can work with colleagues from a variety of technical and non-technical backgrounds. You enjoy both learning and teaching, and you are excited to help share knowledge with a multi-thousand-person company. You love collaboration and working closely with a team of other experts and with technical and non-technical stakeholders. Finally, you have a strong interest in improving the delivery of healthcare.

About the team:

The Core AI Subdivision is bringing Artificial Intelligence to bear against the hardest problems in healthcare. We are working with product and engineering leaders across the company to build AI into our Best In KLAS suite of products. We work together with athenahealth engineers to deploy state-of-the-art machine learning models and agents.

Job Responsibilities:

As a member of the Research team focusing on AI verification & validation, you provide subject matter expertise, practical technical guidance, and tooling for evaluating LLMs and applications employing LLMs & agents in their workflow. Your domain includes:

Leveraging standardized benchmarks for initial assessment
Calculating and interpreting quantitative metrics such as accuracy, precision, recall, F1, perplexity, BLEU, ROUGE, text similarity, and exact match
Human evaluation
Conventional testing such as unit, functional, and scale/load testing
Model explainability & output consistency
Testing to understand bias, toxicity, and fairness
Prompt variation/robustness testing
Factual accuracy/coherence/relevance/fluency/hallucination testing
Security testing
Monitoring in production (especially important given the non-deterministic nature of LLMs)
Overall observability (accuracy, performance metrics, traces/explainability, cost, usage…)
Techniques and approaches for improving key aspects of overall model performance such as accuracy and latency (e.g., advanced prompt engineering, RAG, domain-specific fine-tuning, reasoning, and self-checking)
Incorporating end user feedback loops
Establishing best practices for evaluation of applications integrating LLMs
Automating as much as practical to make AI evaluation reliable, scalable, and repeatable, including integration into CI/CD pipelines

As a member of the Research team, you will:

Identify opportunities to make AI evaluation deterministic, performant, and cost-effective.
Understand and follow conventions and best practices for modeling, coding, architecture, and statistics, and hold other team members accountable for doing so.
Apply rigorous testing of statistics, models, and code.
Contribute to the development of internal tools and Core AI team standards.

Typical Qualifications:

Excellent verbal communication and writing skills.
Bachelor's degree in a relevant field such as math, computer science, data science, or economics.
At least 6 years of professional experience developing and evaluating machine learning models.
At least 2 years of enterprise experience training, evaluating, and deploying models.
Proficient in Python.
Experience using machine learning models and libraries.
Familiarity with NLP, computer vision, and ambient computing techniques.
Experience with commercial and open-source AI evaluation tooling, frameworks, and best practices.
Experience using the AWS ecosystem is a bonus, including Kubernetes, Kubeflow, or EKS experience.

About athenahealth

Our vision: In an industry that becomes more complex by the day, we stand for simplicity. We offer IT solutions and expert services that eliminate the daily hurdles preventing healthcare providers from focusing entirely on their patients — powered by our vision to create a thriving ecosystem that delivers accessible, high-quality, and sustainable healthcare for all.

Our company culture: Our talented employees, or athenistas as we call ourselves, spark the innovation and passion needed to accomplish our vision. We are a diverse group of dreamers and do-ers with unique knowledge, expertise, backgrounds, and perspectives. We unite as mission-driven problem-solvers with a deep desire to achieve our vision and make our time here count. Our award-winning culture is built around shared values of inclusiveness, accountability, and support.

Our DEI commitment: Our vision of accessible, high-quality, and sustainable healthcare for all requires addressing the inequities that stand in the way. That's one reason we prioritize diversity, equity, and inclusion in every aspect of our business, from attracting and sustaining a diverse workforce to maintaining an inclusive environment for athenistas, our partners, customers and the communities where we work and serve.

What we can do for you:

Along with health and financial benefits, athenistas enjoy perks specific to each location, including commuter support, employee assistance programs, tuition assistance, employee resource groups, and collaborative workspaces. Some offices even welcome dogs.

We also encourage a better work-life balance for athenistas with our flexibility. While we know in-office collaboration is critical to our vision, we recognize that not all work needs to be done within an office environment, full-time. With consistent communication and digital collaboration tools, athenahealth enables employees to find a balance that feels fulfilling and productive for each individual situation.

In addition to our traditional benefits and perks, we sponsor events throughout the year, including book clubs, external speakers, and hackathons. We provide athenistas with a company culture based on learning, the support of an engaged team, and an inclusive environment where all employees are valued.

Learn more about our culture and benefits here: athenahealth.com/careers

https://www.athenahealth.com/careers/equal-opportunity

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰
