Research Scientist, LLM Evaluation & Post-Training
USD 150K-160K | Senior-level | Full Time
Tasks
- Analyze model behavior and failure patterns
- Build evaluation and post-training pipelines with ML teams
- Create benchmark datasets and evaluation reports
- Define and execute LLM evaluation research agenda
- Design experiments for post-training outcomes
- Develop evaluation frameworks and benchmarks
- Implement checks for scoring reliability and measurement validity
- Recommend improvements and redesigns for evaluations
- Partner with customers to review evaluation methodologies
- Publish research findings and contribute to open-source
- Run human and automated evaluation studies
Perks/Benefits
- N/A
Skills/Tech-stack
Benchmarking | DPO | Data Processing | Error Analysis | Experimental Design | GRPO | Hugging Face | Hugging Face Transformers | Human evaluation | JAX | LLM Evaluation | Long Context Evaluation | Machine Learning | Metric Design | Model Alignment | Multimodal evaluation | Natural Language Processing | PPO | PyTorch | Python | RAG | RLAIF | RLHF | Reinforcement Learning | Robustness Testing | SFT | Significance Testing | Statistical Analysis | Stress Testing | TensorFlow | Uncertainty Quantification | Vector Databases