Find jobs in AI/ML, Data Science and Big Data
147 results for Reinforcement Learning from Human Feedback (Skill/Tech stack)
-
AI Feedback | Deep learning | Direct Preference Optimization | Fine Tuning | Human Feedback · Mid-level · Full Time · Shanghai · 3d ago
-
LLM Post-Training/Agentic Algorithm Engineer · CNY 180K-360K · Distributed Training | Function Calling | GRPO | Human Feedback | JSON · Entry-level · Full Time · Shanghai, Beijing · 3d ago
-
AI Feedback | Data Pipelines | Evaluation | Experiment design | Grading · Mid-level · Full Time · San Francisco · 3d ago
-
Agent Post-Training, API & Power Users · USD 295K-445K · AI Feedback | Agent systems | Computer use | Cost Optimization | Data Generation · Senior-level · Full Time · San Francisco · 3d ago
-
Research Scientist, Machine Learning · EUR 60K-76K · A/B | A/B Testing | B testing | Data pipeline | Deep learning · Mid-level · Full Time · Paris, France · 4d ago
-
Agent Post-Training Research · USD 295K-445K · AI Feedback | Agent systems | Calibrated Reasoning | Data Pipelines | Deep learning · Mid-level · Full Time · San Francisco · 5d ago
-
Junior Foundation AI Engineer · EUR 30K · AWS | Accelerate | Azure | CUDA | Cloud Computing · Corporate welfare | Health insurance | Meal vouchers | Smart working | Training · Entry-level · Full Time · Milano (Bassi), Italy · 5d ago
-
LLM Engineer · USD 100K-150K · Adapters | DeepSpeed ZeRO | Direct Preference Optimization | Efficient Attention | FSDP · Mid-level · Full Time · United States - Remote R · 5d ago
-
Data Pipelines | Evaluation | Fine Tuning | Human Feedback | LLM Fine-tuning · Senior-level · Full Time · Paris, France · 5d ago
-
Artificial Intelligence | Autoregressive modeling | Computer Vision | Conditional Computation | Deep learning · Flexible start dates | Holiday pay | On site work in Amsterdam | Relocation assistance | Sick pay · Entry-level · Internship · Amsterdam, North Holland, Netherlands · 5d ago
-
Staff Software Engineer, AI/ML · USD 216K-271K · AI Feedback | Agentic AI | Data Pipelines | Direct Preference Optimization | Experimentation platforms · Conference reimbursement | Education reimbursement | Employee assistance program | Employee stock purchase program | Equity compensation · Senior-level · Full Time · Seattle · 6d ago
-
Senior Machine Learning Engineer, Computer Vision/VLM · USD 204K-259K · AI Feedback | Computer Vision | Data Processing | Data Processing Pipelines | Deep learning · Senior-level · Full Time · Mountain View, CA, USA; San Francisco, … · 6d ago
-
Senior-level · Full Time · CN-OCG International Center, Cheng Du, China · 6d ago
-
Senior Solutions Architect, Generative AI Research · USD 184K-287K · AI Agents | AI Feedback | Agent evaluation | Artificial Intelligence | Batching · Senior-level · Full Time · US, FL, Remote, United States R · 6d ago
-
Senior Applied Scientist · USD 142K-270K · Data Pipelines | Diffusion Models | Direct Preference Optimization | Evaluation metrics | Fine Tuning · Senior-level · Full Time · Seattle, United States R · 6d ago
-
Data Science Experts · USD 140K-200K · Data Annotation | Data Visualization | Experimental Design | Human Feedback | Human-in-the-loop · Full-time hours | Long term projects | Weekday availability · Mid-level · Full Time · United States - Remote R · 6d ago
-
Director, Reinforcement Learning & Agentic Post-Training · EUR 151K-200K · AI Feedback | API Integration | Distributed Training | Environment Design | Evaluation · Executive-level · Full Time · Paris, France · 6d ago
-
Machine Learning Engineer, Community Support Engineering · USD 170K-180K · Agent Orchestration | Agentic AI | Artificial Intelligence | Autonomous Reasoning | Fine Tuning · Senior-level · Full Time · San Francisco, CA · 6d ago
-
Benchmarking | Deep learning | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Flexible working arrangements | Global collaboration | Publication support | Remote work | Research innovation opportunities · Senior-level · Full Time · Estonia R · 6d ago
-
Benchmarking | Deep learning | Distributed Training | Efficient Fine Tuning | Fine Tuning · Flexible working arrangements | Fully remote | High autonomy | Professional growth | Publication support · Senior-level · Full Time · Hungary R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Fine Tuning | GPU infrastructure · Flexible working arrangements | Fully remote work | High autonomy | Open Source Collaboration Opportunities | Professional growth opportunities · Senior-level · Full Time · Finland R · 6d ago
-
Benchmarking | Computer Vision | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Autonomy | Fully remote | Professional growth | Publication support | Work-life balance · Senior-level · Full Time · Czechia R · 6d ago
-
Benchmarking | Deep learning | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Autonomy and ownership | Flexible working arrangements | Fully remote work | Professional growth opportunities | Publication support · Senior-level · Full Time · Norway R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Autonomy | Flexible working arrangements | Fully remote | Global collaboration | Publication support · Senior-level · Full Time · Luxembourg R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Fine Tuning | GPU infrastructure · Flexible working arrangements | High autonomy | Professional growth opportunities | Remote work | Research publication support · Senior-level · Full Time · Bulgaria R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Flexible working arrangements | Fully remote work | Professional growth opportunities | Publication support · Senior-level · Full Time · Denmark R · 6d ago
-
Benchmarking | Dataset curation | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Autonomy and ownership | Flexible working arrangements | Fully remote | Professional growth opportunities | Publication support · Senior-level · Full Time · Greece R · 6d ago
-
Benchmarking | Computer Vision | Data Pipelines | Deep learning | Distributed Training · Autonomy | Flexible working arrangements | Fully remote | Open source collaboration | Professional growth opportunities · Senior-level · Full Time · Chile R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Flexible working arrangements | High autonomy | Professional growth opportunities | Publication support | Remote work · Senior-level · Full Time · Poland R · 6d ago
-
Benchmarking | Computer Vision | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Autonomy | Flexible working arrangements | Publication support | Remote work | Work-life balance · Senior-level · Full Time · Austria R · 6d ago
-
Benchmarking | Dataset curation | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Autonomy | Flexible working arrangements | Fully remote work | Professional growth opportunities | Publication support · Senior-level · Full Time · Sweden R · 6d ago
-
Benchmarking | Deep learning | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Flexible working arrangements | High autonomy | Open source publication support | Professional growth opportunities | Remote work · Senior-level · Full Time · Israel R · 6d ago
-
Benchmarking | Computer Vision | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Flexible working arrangements | Professional growth | Publication support | Remote work · Senior-level · Full Time · Belgium R · 6d ago
-
Benchmarking | Dataset curation | Deep learning | Distributed Training | Efficient Fine Tuning · Fully remote | High autonomy | Professional growth | Publication support | Work-life balance · Senior-level · Full Time · United Arab Emirates R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Flexible working arrangements | Fully remote work | Professional growth opportunities | Work-life balance · Senior-level · Full Time · Turkey R · 6d ago
-
Benchmarking | Dataset curation | Deep learning | Distributed Training | Efficient Fine Tuning · Flexible working arrangements | Fully remote work | Professional growth opportunities | Publication support · Senior-level · Full Time · Australia R · 6d ago
-
Benchmarking | Data Curation | Dataset Filtering | Distributed Training | Efficient Fine Tuning · Autonomy | Flexible working arrangements | Professional growth | Publication support | Remote work · Senior-level · Full Time · South Africa R · 6d ago
-
Benchmarking | Computer Vision | Dataset curation | Deep learning | Distributed Training · Autonomy | Flexible working arrangements | Fully remote | Global collaboration opportunities | Professional growth · Senior-level · Full Time · Mexico R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Flexible working hours | Professional growth opportunities | Publication support | Remote work | Work-life balance · Senior-level · Full Time · Romania R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Autonomy and ownership | Flexible working arrangements | Fully remote work | Professional growth opportunities | Publication support · Senior-level · Full Time · Italy R · 6d ago
-
Benchmarking | Computer Vision | Dataset curation | Deep learning | Distributed Training · Flexible working arrangements | High autonomy | Professional growth | Publication support | Remote work · Senior-level · Full Time · Portugal R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Flexible working arrangements | Fully remote | High autonomy | Professional growth opportunities | Publication support · Senior-level · Full Time · Netherlands R · 6d ago
-
Benchmarking | Computer Vision | Data Curation | Deep learning | Distributed Training · Flexible working hours | Professional growth | Publication support | Remote work | Work-life balance · Senior-level · Full Time · Ireland R · 6d ago
-
Benchmarking | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks | Fine Tuning · Fully remote | Global collaboration | High autonomy | Professional growth | Publication support · Senior-level · Full Time · Switzerland R · 6d ago
-
Benchmarking | Dataset curation | Distributed Training | Efficient Fine Tuning | Fine Tuning · Autonomy | Flexible working arrangements | Professional growth | Publication support | Remote work · Senior-level · Full Time · France R · 6d ago
-
Benchmarking | Computer Vision | Dataset curation | Deep learning | Distributed Training · Fully remote | High autonomy | Professional growth opportunities | Research publication support | Work-life balance · Senior-level · Full Time · Germany R · 6d ago
-
Benchmarking | Dataset curation | Distributed Training | Efficient Fine Tuning | Fine Tuning · Autonomy | Flexible working arrangements | High-impact projects | Professional growth opportunities | Publication support · Senior-level · Full Time · Spain R · 6d ago
-
Benchmarking | Dataset curation | Deep learning | Distributed Training | Efficient Fine Tuning · Autonomy and ownership | Flexible working arrangements | Professional growth | Publication support | Remote work · Senior-level · Full Time · Brazil R · 6d ago
-
Benchmarking | Dataset curation | Distributed Training | Efficient Fine Tuning | Evaluation Frameworks · Flexible working arrangements | Fully remote | High autonomy | Professional growth opportunities | Publication support · Senior-level · Full Time · Canada R · 6d ago
-
Benchmarking | Computer Vision | Dataset curation | Distributed Training | Efficient Fine Tuning · Autonomy | Flexible working arrangements | Fully remote | Global collaboration opportunities | Professional growth · Senior-level · Full Time · India R · 6d ago