Find jobs in AI/ML, Data Science and Big Data
63 results for Reward Modeling (Skill/Tech stack)
-
Adversarial ML | Benchmarking | Data Mining | Environment Design | Function Calling
Mid-level | Full Time | Mountain View, CA, USA; New York, … | 7h ago
-
Audio Processing | Autoregression | Autoregressive models | Computer Vision | Deep learning
Remote work
Senior-level | Full Time | Remote job R | 2d ago
-
Senior Machine Learning Engineer, Agentic
USD 163K-245K
Artificial Intelligence | Direct Preference Optimization | Evaluation | Fine Tuning | Human-in-the-loop
401k matching | Catered meals | Employee events | Employer-paid disability insurance | Employer-paid life insurance
Senior-level | Full Time | Bellevue, WA; Menlo Park, CA | 2d ago
-
LLM Post-Training / Agentic Algorithm Engineer
CNY 180K-360K
Agentic RL | DAPO | Distributed Training | Function Calling | GRPO
Entry-level | Full Time | Shanghai, Beijing | 3d ago
-
Applied Scientist, Wayve Labs
USD 147K-213K
Autoregressive models | Depth Estimation | Diffusion Models | Foundation Models | Language
Daily yoga | Enhanced parental leave | Flexible working hours | Hybrid working | Large Social Budgets
Mid-level | Full Time | Sunnyvale | 3d ago
-
Agent Orchestration | Data Pipelines | Debugging | Evaluation | Language Models
Direct founder collaboration | High technical ownership | Hybrid option | Meaningful architectural influence | Mission-driven healthcare impact
Senior-level | Full Time | Remote; Boston, MA; Onsite R | 3d ago
-
Applied AI Engineer
USD 175K-275K
Embeddings | Generative AI | LanceDB | Langchain | Language Models
Development opportunities | Hybrid work culture | Mentorship | Professional growth
Senior-level | Full Time | San Francisco | 4d ago
-
Applied Scientist, Wayve Labs
CAD 100K-132K
Autoregressive models | Computer Vision | Data sets | Depth Estimation | Diffusion Models
Daily yoga | Enhanced parental leave | Flexible working hours | Large Social Budgets | Onsite bar
Mid-level | Full Time | Vancouver | 5d ago
-
Applied Scientist, Wayve Labs
GBP 80K-96K
Autoregressive models | Depth Estimation | Diffusion Models | Foundation Models | Human Feedback
Daily yoga | Enhanced parental leave | Flexible working hours | Onsite bar | Onsite chef
Mid-level | Full Time | London | 5d ago
-
AI Scientist
GBP 46K-46K
Azure | Azure OpenAI | Azure OpenAI Services | Databricks | Dataset Preparation
Mid-level | Full Time | London, United Kingdom | 6d ago
-
Principal Machine Learning Engineer, Short-form
USD 233K-350K
Cloud platform | Data Modeling | Feedback Loop Mitigation | Feedback loop | GCP Pipelines
401k plan | Dental insurance | Disability insurance | Life insurance | Medical insurance
Senior-level | Full Time | New York, NY, US, 10036 | 6d ago
-
Head of World Models (Universal Robots, India)
INR 3000K-6000K
AI orchestration | Actor-critic | Agent Frameworks | Autogen | DPO
Executive-level | Full Time | Bangalore, IN | 8d ago
-
Head of Simulation (Universal Robots, India)
INR 3000K-6000K
AI orchestration | Actor-Critic methods | Actor-critic | Agent Frameworks | Autogen
Executive-level | Full Time | Bangalore, IN | 8d ago
-
A/B | A/B Testing | B testing | Data Pipelines | Fine Tuning
401k retirement plan | Health insurance | Meal allowance | Paid flexible holidays | Paid parental leave
Senior-level | Full Time | New York, NY | 9d ago
-
Software Engineer - Machine Learning
USD 190K-220K
Adversarial Data | Adversarial Data Generation | Adversarial Training | Content Moderation | DPO
Mid-level | Contract | Mountain View, CA | 9d ago
-
Data Processing | Deep learning | Distributed Training | Generative Models | Human Feedback
Family leave | Free food and snacks | Health care plan | Life insurance | Long-term disability
Senior-level | Full Time | Fremont | 9d ago
-
LLM Algorithm Engineer (Open-Domain Dialogue)
CNY 180K-300K
DPO | Deep learning | DeepSpeed | Distributed Training | Function Calling
Internship
Mid-level | Internship | Shanghai | 12d ago
-
Senior Applied Scientist
USD 142K-270K
Data Pipelines | Diffusion Models | Direct Preference Optimization | Fine Tuning | Generative AI
Senior-level | Full Time | San Jose, United States R | 13d ago
-
Deep learning | GPU Computing | Language Models | Language Processing | Large Language Models
Entry-level | Full Time Internship | US, CA, Santa Clara, United States | 15d ago
-
Alignment | Benchmark design | Constitutional AI | Continued Pretraining | Data Curation
Senior-level | Full Time | Dublin, CA (HQ) | 15d ago
-
Alignment | Benchmark design | DPO | Data Curation | Data Deduplication
Senior-level | Full Time | India/Bengaluru | 15d ago
-
Constitutional AI | Continued Pretraining | DPO | Data Curation | Deduplication
Senior-level | Full Time | Brazil/Remote R | 15d ago
-
Senior Applied AI Researcher (India)
INR 2500K-4500K
Artificial Intelligence | DPO | Data parallelism | DataLoader | DeepSpeed
Senior-level | Full Time | India/Bengaluru | 15d ago
-
Senior Applied AI Researcher (Brazil)
BRL 271K-370K
CI/CD | DPO | Data parallelism | Deep learning | DeepSpeed
Senior-level | Full Time | Brazil/Remote R | 15d ago
-
Senior Applied AI Researcher (Dublin, CA)
USD 190K-300K
Automated testing | Continuous Evaluation | Data parallelism | Deep learning | DeepSpeed
Senior-level | Full Time | Dublin, CA (HQ) | 15d ago
-
Applied AI Researcher (India)
INR 2000K-3465K
AWS | Automated testing | Azure | CI/CD | Cloud Computing
Mid-level | Full Time | India/Bengaluru | 16d ago
-
Applied AI Researcher (Dublin, CA)
USD 239K-331K
CI/CD | Computer Vision | Data Preprocessing | Deep learning | Direct Preference Optimization
Mid-level | Full Time | Dublin, CA (HQ) | 16d ago
-
Bayesian optimization | Causal Inference | Causal Models | Combinatorial Optimization | Computer Vision
Entry-level | Full Time | Tel Aviv-Jaffa, Tel Aviv District, IL | 18d ago
-
Machine Learning Engineer - Personalization
USD 170K-212K
A/B | A/B Testing | AWS | Agile methodology | Apache Beam
401k retirement plan | Health insurance | Meal allowance | Paid flexible holidays | Paid parental leave
Senior-level | Full Time | New York, NY | 24d ago
-
Forward Deployed Engineer Lead | LLM Post-training
USD 165K-258K
Data Generation | Data Pipelines | Dataset versioning | Distributed Training | Evaluation methodology
Dental insurance | Disability insurance | Health insurance | Life insurance | Paid time off
Senior-level | Full Time | New York | 24d ago
-
Data Analysis | Dataset Processing | Direct Preference Optimization | Evaluation Pipelines | Fine Tuning
Entry-level | Internship | San Jose, California, United States | 26d ago
-
Senior Director, AI Model LifeCycle
USD 301K-355K
Checkpointing | Dataset versioning | Experiment tracking | Failure recovery | Fine Tuning
401k match | Cell phone stipend | Commuter benefits | Dental insurance | HSA contributions
Senior-level | Full Time | San Francisco, CA - US | 26d ago
-
Researcher, Agentic Post-Training
USD 295K-445K
Agent systems | Data Pipelines | Diagnostics | Evals | Function Calling
Senior-level | Full Time | San Francisco | 28d ago
-
Senior Software Engineer, RL Post-Training Frameworks
USD 184K-356K
Actor Based Programming | C# | C++ | Consistency models | DPO
Comprehensive benefits | Equity
Senior-level | Full Time | US, CA, Santa Clara, United States | 28d ago
-
Entry-level | Internship | Shanghai | 30d ago
-
AI Research Scientist - Agentic Systems
USD 220K-295K
APIs | Data Augmentation | Data Generation | Fine Tuning | Language Models
401k | Medical, dental, and vision insurance | Mental health and wellness support | Unlimited PTO | Work-life balance
Mid-level | Full Time | New York, NY | 1mo ago
-
Senior Machine Learning Engineer (Small Language Models)
USD 154K-189K
AWS | Adapter-Tuning | Axolotl | Cloud Computing | Data labeling
Flexible remote days | Flexible work schedule
Senior-level | Full Time | Canada - Remote R | 1mo ago
-
LLM Applied Data Scientist (RAG/ NLP)
TWD 480K-612K
A Star | API Integration | C++ | Deep learning | Embeddings
Career growth | Continuous learning | Work from home
Mid-level | Full Time | Taiwan, Taipei | 1mo ago
-
Principal PMT-ES - AI/ML Training, Annapurna Labs
USD 181K-281K
AI/ML | Customer Requirements | DPO | Deep learning | Developer experience
Career growth resources | Flexible organization | Knowledge sharing | Mentorship | Work-life balance
Senior-level | Full Time | Cupertino, California, USA | 1mo ago
-
Member of technical staff - Research - Model - London
GBP 230K-340K
Data Pipelines | Deep learning | Distributed Training | Evaluation | Git
Career development | Continuous learning | Hybrid work | Professional growth
Senior-level | Full Time | London | 1mo ago
-
Applied AI Researcher, Post-Training
USD 150K-250K
Agentic collaboration | Continual Learning | Continual pretraining | DPO | Data Analysis
401k | Commuter benefits | In-office lunch | Medical, dental & vision coverage
Mid-level | Full Time | San Francisco | 1mo ago
-
Applied AI Researcher, System Self-Improvement
USD 150K-250K
Agentic collaboration | Data Analysis | Ensembling | Evaluation | Graph-of-Thoughts
401k | Commuter benefits | Equity | In-office lunch | Medical, dental & vision coverage
Mid-level | Full Time | San Francisco | 1mo ago
-
Deep learning | Language Processing | Large-scale | Large-scale experimentation | Machine Learning
Company-sponsored medical plan | Paid Holidays | Paid sick leave
Entry-level | Full Time Internship | US-Washington-Bellevue, United States | 1mo ago
-
Deep learning | Language Processing | Natural Language | Natural Language Processing | PyTorch
Medical plan enrollment | Paid Holidays | Paid sick leave
Entry-level | Full Time Internship | US-Washington-Bellevue, United States | 1mo ago
-
Helix AI Engineer, Reinforcement Learning
USD 150K-350K
Credit Assignment | Distributed Training | Experiment Management | Exploration | Model-based reinforcement learning
In-office collaboration
Senior-level | Full Time | San Jose, CA | 1mo ago
-
Helix AI Engineer, Pretraining
USD 175K-400K
Computer Vision | Data Mixture Optimization | Deep learning | Distributed Training | Language Processing
Senior-level | Full Time | San Jose, CA | 1mo ago
-
Mid-level | Full Time | Medellín, Antioquia, Colombia | 1mo ago
-
Staff Software Engineer, Generative AI, Core ML
USD 207K-300K
AI Feedback | Computer Vision | Data Processing | Deep learning | Digital Twin
Senior-level | Full Time | Mountain View, CA, USA | 1mo ago
-
Machine Learning Engineer I
USD 151K-189K
AWS | Azure | Classification | Cloud Computing | Code review
401k match | Equity | Flexible PTO | Learning stipend | Medical/Dental/Vision insurance
Mid-level | Full Time | San Francisco, CA | 1mo ago
-
AI Research Scientist - Safety Alignment Team
USD 213K-293K
Adversarial prompts | Automation | Computer Vision | DPO | Dataset curation
Senior-level | Full Time | Menlo Park, CA | 1mo ago