Find jobs in AI/ML, Data Science and Big Data
46 results for Reward Modeling (Skill/Tech stack)
-
Intern Engineer – RL Post-Training for LLMs · CAD 58K-104K · Data Generation | Deep learning | DeepSpeed | Distributed Training | GRPO · Internship · Entry-level Internship · Vancouver, British Columbia, Canada · 2d ago
-
Sr. Physical AI Research Scientist · CAD 140K-180K · AI alignment | Artificial Intelligence | Computer Vision | Constitutional AI | Continual Learning · Hybrid work schedule · Senior-level Full Time · Toronto, ON, CA · 5d ago
-
Research Engineer - LLM Training & Alignment Systems · CAD 127K-225K · Automation | Benchmarking | C# | C++ | Data Curation · Mid-level Contract Full Time · Kingston, Ontario, Canada · 7d ago
-
Mid-level Internship · Shanghai · 8d ago
-
LLM Post-Training / Agentic Algorithm Engineer · CNY 180K-360K · Agentic RL | DAPO | Distributed Training | Evaluation | Function Calling · Entry-level Full Time · Shanghai, Beijing · 8d ago
-
Machine Learning Researcher - RL and Agentic Systems · USD 190K-287K · Agentic Systems | Benchmarking | Data Validation | Dataset Quality Evaluation | Dataset quality · Mid-level Full Time · Remote · 13d ago
-
Data Curation | Deep learning | DeepSpeed | Direct Preference Optimization | Evaluation · Senior-level Full Time · Singapore, Singapore · 16d ago
-
Staff Machine Learning Engineer, AV Core · USD 336K-370K · 3D Scene | 3D Scene Understanding | Action models | Behavior Modeling | C++ · Hybrid work | Work from home · Senior-level Full Time · Sunnyvale · 19d ago
-
Agent simulation | Behavioral Modeling | DPO | Data Curation | Data Generation · Entry-level Full Time Internship · US, CA, Santa Clara, United States · 20d ago
-
Adversarial ML | Benchmarking | Data Mining | Environment Design | Function Calling · Mid-level Full Time · Mountain View, CA, USA; New York, … · 20d ago
-
Data Curation | Data Generation | Deep learning | Distributed Training | Fine Tuning · Internship benefits · Entry-level Full Time Internship · US, CA, Santa Clara, United States · 21d ago
-
Audio Processing | Autoregression | Autoregressive models | Computer Vision | Deep learning · Remote work · Senior-level Full Time · Remote job · 22d ago
-
Applied Scientist, Wayve Labs · USD 147K-213K · Autoregressive models | Depth Estimation | Diffusion Models | Foundation Models | Language · Daily yoga | Enhanced parental leave | Flexible working hours | Hybrid working | Large Social Budgets · Mid-level Full Time · Sunnyvale · 23d ago
-
Agent Orchestration | Data Pipelines | Debugging | Evaluation | Language Models · Direct founder collaboration | High technical ownership | Hybrid option | Meaningful architectural influence | Mission-driven healthcare impact · Senior-level Full Time · Remote; Boston, MA; Onsite · 24d ago
-
Applied AI Engineer · USD 175K-275K · Embeddings | Generative AI | LanceDB | Langchain | Language Models · Development opportunities | Hybrid work culture | Mentorship | Professional growth · Senior-level Full Time · San Francisco · 24d ago
-
Applied Scientist, Wayve Labs · CAD 100K-132K · Autoregressive models | Computer Vision | Data sets | Depth Estimation | Diffusion Models · Daily yoga | Enhanced parental leave | Flexible working hours | Large Social Budgets | Onsite bar · Mid-level Full Time · Vancouver · 26d ago
-
Applied Scientist, Wayve Labs · GBP 80K-96K · Autoregressive models | Depth Estimation | Diffusion Models | Foundation Models | Human Feedback · Daily yoga | Enhanced parental leave | Flexible working hours | Onsite bar | Onsite chef · Mid-level Full Time · London · 26d ago
-
AI Scientist · GBP 46K-46K · Azure | Azure OpenAI | Azure OpenAI Services | Databricks | Dataset Preparation · Mid-level Full Time · London, United Kingdom · 27d ago
-
Principal Machine Learning Engineer, Short-form · USD 233K-350K · Cloud platform | Data Modeling | Feedback Loop Mitigation | Feedback loop | GCP Pipelines · 401k plan | Dental insurance | Disability insurance | Life insurance | Medical insurance · Senior-level Full Time · New York, NY, US, 10036 · 27d ago
-
A/B | A/B Testing | B testing | Data Pipelines | Fine Tuning · 401k retirement plan | Health insurance | Meal allowance | Paid flexible holidays | Paid parental leave · Senior-level Full Time · New York, NY · 29d ago
-
Software Engineer - Machine Learning · USD 190K-220K · Adversarial Data | Adversarial Data Generation | Adversarial Training | Content Moderation | DPO · Mid-level Contract · Mountain View, CA · 30d ago
-
Data Processing | Deep learning | Distributed Training | Generative Models | Human Feedback · Family leave | Free food and snacks | Health care plan | Life insurance | Long-term disability · Senior-level Full Time · Fremont · 30d ago
-
Deep learning | GPU Computing | Language Models | Language Processing | Large Language Models · Entry-level Full Time Internship · US, CA, Santa Clara, United States · 1mo ago
-
Alignment | Benchmark design | Constitutional AI | Continued Pretraining | Data Curation · Senior-level Full Time · Dublin, CA (HQ) · 1mo ago
-
Alignment | Benchmark design | DPO | Data Curation | Data Deduplication · Senior-level Full Time · India/Bengaluru · 1mo ago
-
Constitutional AI | Continued Pretraining | DPO | Data Curation | Deduplication · Senior-level Full Time · Brazil/Remote · 1mo ago
-
Senior Applied AI Researcher (India) · INR 2500K-4500K · Artificial Intelligence | DPO | Data parallelism | DataLoader | DeepSpeed · Senior-level Full Time · India/Bengaluru · 1mo ago
-
Senior Applied AI Researcher (Brazil) · BRL 271K-370K · CI/CD | DPO | Data parallelism | Deep learning | DeepSpeed · Senior-level Full Time · Brazil/Remote · 1mo ago
-
Senior Applied AI Researcher (Dublin, CA) · USD 190K-300K · Automated testing | Continuous Evaluation | Data parallelism | Deep learning | DeepSpeed · Senior-level Full Time · Dublin, CA (HQ) · 1mo ago
-
Applied AI Researcher (India) · INR 2000K-3465K · AWS | Automated testing | Azure | CI/CD | Cloud Computing · Mid-level Full Time · India/Bengaluru · 1mo ago
-
Applied AI Researcher (Dublin, CA) · USD 239K-331K · CI/CD | Computer Vision | Data Preprocessing | Deep learning | Direct Preference Optimization · Mid-level Full Time · Dublin, CA (HQ) · 1mo ago
-
Bayesian optimization | Causal Inference | Causal Models | Combinatorial Optimization | Computer Vision · Entry-level Full Time · Tel Aviv-Jaffa, Tel Aviv District, IL · 1mo ago
-
Machine Learning Engineer - Personalization · USD 170K-212K · A/B | A/B Testing | AWS | Agile methodology | Apache Beam · 401k retirement plan | Health insurance | Meal allowance | Paid flexible holidays | Paid parental leave · Senior-level Full Time · New York, NY · 1mo ago
-
Forward Deployed Engineer Lead | LLM Post-training · USD 165K-258K · Data Generation | Data Pipelines | Dataset versioning | Distributed Training | Evaluation methodology · Dental insurance | Disability insurance | Health insurance | Life insurance | Paid time off · Senior-level Full Time · New York · 1mo ago
-
Data Analysis | Dataset Processing | Direct Preference Optimization | Evaluation Pipelines | Fine Tuning · Entry-level Internship · San Jose, California, United States · 1mo ago
-
Senior Director, AI Model LifeCycle · USD 301K-355K · Checkpointing | Dataset versioning | Experiment tracking | Failure recovery | Fine Tuning · 401k match | Cell phone stipend | Commuter benefits | Dental insurance | HSA contributions · Senior-level Full Time · San Francisco, CA - US · 1mo ago
-
Senior Software Engineer, RL Post-Training Frameworks · USD 184K-356K · Actor Based Programming | C# | C++ | Consistency models | DPO · Comprehensive benefits | Equity · Senior-level Full Time · US, CA, Santa Clara, United States · 1mo ago
-
Entry-level Internship · Shanghai · 1mo ago
-
Senior Machine Learning Engineer (Small Language Models) · USD 154K-189K · AWS | Adapter-Tuning | Axolotl | Cloud Computing | Data labeling · Flexible remote days | Flexible work schedule · Senior-level Full Time · Canada - Remote · 1mo ago
-
LLM Applied Data Scientist (RAG/ NLP) · TWD 480K-612K · A Star | API Integration | C++ | Deep learning | Embeddings · Career growth | Continuous learning | Work from home · Mid-level Full Time · Taiwan, Taipei · 1mo ago
-
Principal PMT-ES - AI/ML Training, Annapurna Labs · USD 181K-281K · AI/ML | Customer Requirements | DPO | Deep learning | Developer experience · Career growth resources | Flexible organization | Knowledge sharing | Mentorship | Work-life balance · Senior-level Full Time · Cupertino, California, USA · 1mo ago
-
Member of technical staff - Research - Model - London · GBP 230K-340K · Data Pipelines | Deep learning | Distributed Training | Evaluation | Git · Career development | Continuous learning | Hybrid work | Professional growth · Senior-level Full Time · London · 1mo ago
-
Applied AI Researcher, Post-Training · USD 150K-250K · Agentic collaboration | Continual Learning | Continual pretraining | DPO | Data Analysis · 401k | Commuter benefits | In-office lunch | Medical, dental & vision coverage · Mid-level Full Time · San Francisco · 1mo ago
-
Applied AI Researcher, System Self-Improvement · USD 150K-250K · Agentic collaboration | Data Analysis | Ensembling | Evaluation | Graph-of-Thoughts · 401k | Commuter benefits | Equity | In-office lunch | Medical, dental & vision coverage · Mid-level Full Time · San Francisco · 1mo ago
-
Deep learning | Language Processing | Large-scale | Large-scale experimentation | Machine Learning · Company-sponsored medical plan | Paid Holidays | Paid sick leave · Entry-level Full Time Internship · US-Washington-Bellevue, United States · 1mo ago
-
Deep learning | Language Processing | Natural Language | Natural Language Processing | PyTorch · Medical plan enrollment | Paid Holidays | Paid sick leave · Entry-level Full Time Internship · US-Washington-Bellevue, United States · 1mo ago