Find jobs in AI/ML, Data Science and Big Data
17 results for Reinforcement Learning from AI Feedback (Skill/Tech stack)
-
AI Feedback | Deep learning | Direct Preference Optimization | Fine Tuning | Human Feedback
Mid-level Full Time
Shanghai
3d ago
-
AI Feedback | Data Pipelines | Evaluation | Experiment design | Grading
Mid-level Full Time
San Francisco
3d ago
-
Agent Post-Training, API & Power Users USD 295K-445K
AI Feedback | Agent systems | Computer use | Cost Optimization | Data Generation
Senior-level Full Time
San Francisco
3d ago
-
Agent Post-Training Research USD 295K-445K
AI Feedback | Agent systems | Calibrated Reasoning | Data Pipelines | Deep learning
Mid-level Full Time
San Francisco
5d ago
-
Staff Software Engineer, AI/ML USD 216K-271K
AI Feedback | Agentic AI | Data Pipelines | Direct Preference Optimization | Experimentation platforms
Conference reimbursement | Education reimbursement | Employee assistance program | Employee stock purchase program | Equity compensation
Senior-level Full Time
Seattle
6d ago
-
Senior Machine Learning Engineer, Computer Vision/VLM USD 204K-259K
AI Feedback | Computer Vision | Data Processing | Data Processing Pipelines | Deep learning
Senior-level Full Time
Mountain View, CA, USA; San Francisco, …
6d ago
-
Senior Solutions Architect, Generative AI Research USD 184K-287K
AI Agents | AI Feedback | Agent evaluation | Artificial Intelligence | Batching
Senior-level Full Time
US, FL, Remote, United States R
6d ago
-
Director, Reinforcement Learning & Agentic Post-Training EUR 151K-200K
AI Feedback | API Integration | Distributed Training | Environment Design | Evaluation
Executive-level Full Time
Paris, France
6d ago
-
Senior Software Engineer - Model Training & AI Evals INR 3500K-5000K
AI Feedback | Ablation Studies | Benchmarking | CI/CD | Data Generation
Senior-level Full Time
Remote (India) R
12d ago
-
AI Feedback | Agent Orchestration | Agent systems | Agentic AI | Autonomous Reasoning
Senior-level Full Time
Seoul, South Korea
13d ago
-
Head of Physical AI Programs USD 300K
AI Data | AI Feedback | AI data operations | Autonomous Systems | Benchmarking
Executive-level Full Time
East Palo Alto, CA, United States
15d ago
-
Research Scientist, LLM Evaluation & Post-Training USD 150K-300K
AI Feedback | Alignment | Benchmarking | Context evaluation | Deep learning
Mid-level Full Time
Remote Work (USA), United States R
18d ago
-
Researcher: Agent Post-Training, API & Power-Users USD 295K-445K
AI Feedback | Calibrated Reasoning | Data Generation | Deep learning | Error Recovery
Senior-level Full Time
San Francisco
24d ago
-
Senior-level Full Time
New York, New York, United States
25d ago
-
AI Feedback | Agentic Systems | Direct Preference Optimization | Distributed Training | Evaluation
Senior-level Full Time
AMER - United States - California … R
1mo ago
-
Applied Scientist II, Alexa International Team USD 142K-193K
A/B | A/B Testing | AI Feedback | B testing | Deep learning
Entry-level Full Time Internship
Bellevue, Washington, USA
1mo ago
-
Staff AI Scientist USD 190K-300K
AI Feedback | AIBERT Family | Adversarial Machine Learning | Agentic Systems | BERT
401k plan | Fastrak reimbursement | Free annual Caltrain pass | Free lunch | Health, dental and vision coverage
Senior-level Full Time
Palo Alto
1mo ago