AI/ML Research Scientist, LLM Post-Training & Evaluation
Tasks
- Analyze model behavior and failure patterns
- Collaborate with data scientists and engineers
- Conduct human evaluation and rubric design
- Create benchmark datasets and test suites
- Define LLM evaluation research agenda
- Design LLM post-training experiments
- Design scoring methods and evaluation protocols
- Develop LLM evaluation frameworks
- Engage with customer technical stakeholders
- Evaluate human vs automated evaluation methods
- Perform robustness and stress testing
- Produce technical reports and documentation
- Publish research and open-source contributions
- Study long context and multimodal evaluation
- Translate research methods into evaluation pipelines
Perks/Benefits
- N/A
Skills/Tech-stack
Alignment | Benchmarking | DPO | Data Processing | Error Analysis | Fine Tuning | GRPO | Hugging Face | Human Evaluation | Inter-rater Reliability | JAX | LLM Evaluation | Language Models | Large Language Models | Machine Learning | Metric Design | PPO | PyTorch | Python | RLAIF | RLHF | Reproducible Research | Robustness Testing | Rubric Design | SFT | Safety Evaluation | Significance Testing | Statistical Analysis | Stress Testing | TensorFlow | Uncertainty Quantification | Visualization
Education