Clinical Data Scientist

New York

Phare Health

We build AI tools to make claims more accurate and transparent, so that you can focus on patients, not price tags.

About Us

Our mission is to make healthcare reimbursement transparent and fair, so providers can spend more time caring for patients and less time haggling over costs. We specifically focus on the most complex AI challenges that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by General Catalyst, we’re scaling quickly - join us!


The Role

You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:

  • Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - at the pace of days, not months.

  • Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.

  • Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.

  • Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT, and flag data gaps before modelling.

  • Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.


Who we're looking for

  • 3+ years applying NLP or data science to clinical (or similarly complex) text.

  • Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).

  • Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.

  • Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).

  • Track record operating production-grade ML systems with monitoring and uptime targets.

Bonus points

  • Peer-reviewed publications or open-source contributions in clinical NLP.

  • Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.

  • Experience in customer-facing roles communicating data science requirements and gathering specs from end users.

Benefits

  • Top-of-market compensation (salary + equity)

  • Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote candidates considered)

  • Mission-driven, collaborative team

  • Twice-yearly offsites to align, build, and celebrate.


Hiring Process

  1. Initial application.

  2. Intro call: Discuss your background, career goals, and our mission.

  3. 2 x Technical interviews: A programming or system design exercise focused on real-world data challenges.

  4. Referees: We ask for 2 referees who can speak to your professional/technical work.

  5. Culture interview: Ways of working, and a chance to ask questions.

  6. Offer
