Clinical Data Scientist
New York
Phare Health
We build AI tools to make claims more accurate and transparent, so that you can focus on patients not price tags.
About Us
Our mission is to make healthcare reimbursement transparent and fair, so providers can spend more time caring for patients and less time haggling over costs. We specifically focus on the most complex AI challenges that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by General Catalyst, we’re scaling quickly - join us!
The Role
You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:
Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - at the pace of days not months.
Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.
Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.
Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT, and flag data gaps before modelling.
Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.
Who we're looking for
3+ years applying NLP or data science to clinical (or similarly complex) text.
Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).
Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.
Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).
Track record operating production-grade ML systems with monitoring and uptime targets.
Bonus points
Peer-reviewed publications or open-source contributions in clinical NLP.
Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.
Experience in customer-facing roles communicating data science requirements and gathering specs from end users.
Benefits
Top-of-market compensation (salary + equity)
Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote considered)
Mission-driven, collaborative team
Twice-yearly offsites to align, build, and celebrate.
Hiring Process
Initial application.
Intro call: Discuss your background, career goals, and our mission.
2 x Technical interviews: A programming or system design exercise focused on real-world data challenges.
Referees: We ask for 2 referees who can speak to your professional/technical work
Culture interview: Ways of working, and a chance to ask questions
Offer