Clinical Data Scientist

New York

Full Time, Mid-level / Intermediate, USD 52K - 123K (est.)

Phare Health

We build AI tools to make claims more accurate and transparent, so that you can focus on patients, not price tags.

Posted 3 weeks ago

About Us

Our mission is to make healthcare reimbursement transparent and fair (/phare), so providers can spend more time caring for patients and less time haggling over costs. We focus on the most complex AI challenges, those that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by General Catalyst, we’re scaling quickly - join us!

The Role

You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:

Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - at the pace of days, not months.
Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.
Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.
Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT codes, and flag data gaps before modelling.
Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.

Who we're looking for

3+ years applying NLP or data science to clinical (or similarly complex) text.
Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).
Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.
Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).
Track record operating production-grade ML systems with monitoring and uptime targets.

Bonus points

Peer-reviewed publications or open-source contributions in clinical NLP.
Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.
Experience in customer-facing roles communicating data science requirements and gathering specs from end users.

Benefits

Top-of-market compensation (salary + equity)
Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote considered)
Mission-driven, collaborative team
Twice-yearly offsites to align, build, and celebrate

Hiring Process

Initial application.
Intro call: Discuss your background, career goals, and our mission.
2 x technical interviews: Programming or system design exercises focused on real-world data challenges.
Referees: We ask for 2 referees who can speak to your professional/technical work.
Culture interview: Ways of working, and a chance to ask questions.
Offer.