Student Assistant Technical Specialist for Evaluating Fine-Tuned LLMs

Main Campus (Gainesville, FL)

University of Florida

A top five public land-grant research university, the University of Florida creates a collaborative environment and accelerates future solutions.


Classification Title:

Student Assistant Technical Specialist for Evaluating Fine-Tuned LLMs

Classification Minimum Requirements:
  • Currently enrolled in a graduate or undergraduate program in Computer Science, Data Science, Biomedical Engineering, or a related field.
  • Proficiency in Python and experience with machine learning libraries such as PyTorch.
  • Strong understanding of NLP and transformer-based language models (e.g., BERT, GPT, LLaMA).
  • Basic understanding of evaluation metrics for NLP models and experimental design.
  • Ability to work independently and collaboratively in a fast-paced, research-driven environment.
Job Description:

The Intelligent Critical Care Center (IC3) is a multi-disciplinary center focused on developing and providing sustainable support and leadership for transformative medical AI research, education, and clinical applications to advance patients' health in critical and acute care medicine. The Center addresses an unprecedented opportunity for world-leading ambient, immersive, and artificial intelligence (AI2) research and innovation to transform the diagnosis, monitoring, and treatment of critically and acutely ill patients, using the multimodal clinical and research data and resources from UF Health (UFH), one of Florida’s largest health care systems.

With a growing team of 37 faculty, scientists, researchers, and students, IC3 aims to revolutionize critical and acute care medicine. We are idealists, problem solvers, and explorers of digital health and AI. We’re looking for team members who are driven and enthusiastic about joining our mission to use AI and digital technologies to advance health care so that critically and acutely ill patients can receive the best possible treatment when they need it most.

We are looking for motivated and qualified students to join our team and contribute to evaluating fine-tuned large language models (LLMs) using benchmark datasets. This role involves designing and executing experiments to assess model performance, robustness, and clinical relevance, leveraging both open-source and in-house healthcare datasets.
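
For illustration only, the sketch below shows one common way to evaluate a fine-tuned causal LLM on multiple-choice benchmark items: score each answer option by its log-likelihood under the model and select the highest-scoring option. The checkpoint path and item schema are hypothetical placeholders, not IC3 resources.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "path/to/fine-tuned-model"   # hypothetical checkpoint location

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH).to(device).eval()

@torch.no_grad()
def option_logprob(question: str, option: str) -> float:
    """Sum of log-probabilities of the answer tokens, conditioned on the question."""
    # Assumes the question tokenization is a prefix of the full-sequence tokenization,
    # which holds for typical BPE tokenizers when the answer starts with a space.
    prompt_len = tokenizer(question, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(question + " " + option, return_tensors="pt").input_ids.to(device)
    logits = model(ids).logits[:, :-1, :]                       # logits predicting each next token
    logprobs = torch.log_softmax(logits, dim=-1)
    target_lp = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return target_lp[:, prompt_len - 1:].sum().item()           # keep only the answer-token positions

def predict_choice(item: dict) -> str:
    """item = {"question": str, "options": {"A": str, "B": str, ...}} -- assumed schema."""
    scores = {label: option_logprob(item["question"], text) for label, text in item["options"].items()}
    return max(scores, key=scores.get)
```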

Responsibilities:

  • Design and conduct experiments to evaluate fine-tuned large language models using clinical and general-domain benchmark datasets.
  • Analyze LLM performance across key metrics such as accuracy, hallucination rate, faithfulness, and response relevance in healthcare scenarios.
  • Compare multiple fine-tuning strategies and datasets to identify optimal configurations for clinical use cases.
  • Document experimental results and collaborate with research mentors to interpret outcomes and suggest model improvements.
  • Support the development of evaluation frameworks and tooling for reproducible and scalable testing of LLMs in a healthcare context (a minimal illustrative harness is sketched after this list).
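
As a minimal, illustrative example of the kind of reproducible evaluation tooling described above (not IC3's actual framework), the harness below runs an arbitrary prediction function over benchmark items, computes accuracy, and writes a per-item JSON report for error analysis. The item schema and `predict_fn` interface are assumptions.

```python
import json
from typing import Callable, Dict, List

def evaluate(items: List[Dict], predict_fn: Callable[[Dict], str], report_path: str) -> float:
    """Run predict_fn over benchmark items, compute accuracy, and save a per-item report."""
    results = []
    for item in items:
        pred = predict_fn(item)                       # e.g., the option-scoring function sketched earlier
        results.append({
            "id": item.get("id"),
            "prediction": pred,
            "gold": item["answer"],
            "correct": pred == item["answer"],
        })
    accuracy = sum(r["correct"] for r in results) / max(len(results), 1)
    with open(report_path, "w") as f:                 # keep per-item outputs for error analysis
        json.dump({"accuracy": accuracy, "n_items": len(results), "results": results}, f, indent=2)
    return accuracy
```

Metrics such as hallucination rate, faithfulness, and response relevance typically require reference-based or judge-based scoring layered on top of a loop like this.
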
Expected Salary:

$20/hr

Required Qualifications:
  • Currently enrolled in a graduate or undergraduate program in Computer Science, Data Science, Biomedical Engineering, or a related field.
  • Proficiency in Python and experience with machine learning libraries such as PyTorch.
  • Strong understanding of NLP and transformer-based language models (e.g., BERT, GPT, LLaMA).
  • Basic understanding of evaluation metrics for NLP models and experimental design.
  • Ability to work independently and collaboratively in a fast-paced, research-driven environment.
Preferred:
  • Experience evaluating LLMs using benchmark datasets such as MedQA, MMLU, PubMedQA, or BioASQ.
  • Knowledge of prompt engineering or fine-tuning methods (e.g., LoRA, PEFT, full fine-tuning); see the illustrative sketch after this list.
  • Background in healthcare data, biomedical informatics, or clinical NLP applications.
  • Experience with experiment tracking tools (e.g., Weights & Biases, MLflow).
  • Strong written and verbal communication skills for presenting experimental results.
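
For candidates unfamiliar with the preferred tooling, the sketch below illustrates a typical pattern: wrap a base model with a LoRA adapter via the PEFT library and track the run in Weights & Biases. The base checkpoint, hyperparameters, and project name are illustrative assumptions, not a prescribed IC3 configuration.

```python
import wandb
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("base-model-checkpoint")   # hypothetical base model

lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections commonly adapted in LLaMA-style models
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)      # only the low-rank adapter weights are trainable
model.print_trainable_parameters()

run = wandb.init(project="llm-benchmark-eval", config=lora_cfg.to_dict())   # hypothetical project name
# ... fine-tune, then evaluate on benchmark splits ...
wandb.log({"accuracy": 0.0})                # placeholder; real values come from the evaluation harness
run.finish()
```
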
Special Instructions to Applicants:

Application must be submitted by 11:55 p.m. (ET) of the posting end date.

Health Assessment Required: No

 
