Find jobs in AI/ML, Data Science and Big Data

15 results for Offline Reinforcement Learning (Skill/Tech stack)

Real-Robot Reinforcement Learning Algorithm Intern CNY 25K-37K

Actor-critic | Data Analysis | GRPO | Multimodal Learning | Offline Reinforcement Learning

Entry-level Internship

Shanghai

3d ago
Machine Learning Engineer – RL USD 155K-180K

Actor-critic | Adversarial Testing | Conservative Policies | Constraint Enforcement | DPO

Senior-level Full Time

United States - Remote

4d ago
Senior AI Engineer (Closed-loop Simulation & RL) KRW 30000K-30000K

3DGS | Agent simulation | Autonomous Driving | CARLA | Counterfactual Simulation

Senior-level Full Time

Pangyo (Software Dream Center), South Korea

7d ago
[MS/PhD Intern] AI Engineer (intern-to-full-time conversion track) KRW 25284K-25754K

Ablation Studies | Auto-labeling | Autonomous Driving | Benchmarking | Critical Systems

Entry-level Internship

Pangyo (Software Dream Center), South Korea

7d ago
Reinforcement Learning Algorithm Engineer CNY 180K-300K

A/B Testing | Data pipeline | Deep learning

Entry-level Full Time

Shanghai

12d ago
Whole-Body Motion Control and Manipulation Researcher CNY 180K-360K

Contact modeling | Diffusion Models | Dynamic stability | Imitation Learning | Isaac Sim

Academic publishing support | High level research environment | Research platform support | Team collaboration with hardware and control experts

Mid-level Full Time

Beijing, Shanghai

14d ago
Sr. Data Scientist II (Remote Eligible) USD 155K-185K

Batch Processing | Causal Inference | Contextual Bandits | Data Pipelines | Databricks

401k match | Employer subsidized medical dental and vision | Flexible time off | Life insurance | Long-term disability

Mid-level Full Time

Remote, USA

16d ago
Applied Reinforcement Learning Engineer USD 150K-300K

A2C | A3C | Actor-critic | Agent systems | BCQ

Collaborate with industry leaders | Equal opportunity employer | Hybrid remote work | Research publications support

Mid-level Full Time

Remote Work (USA), United States

19d ago
Software Engineer Sys 5 USD 141K-307K

AI monitoring | Access Control | Agent Orchestration | Anthropic | Audit Logging

Senior-level Full Time

US-CA-Fremont (1003)

21d ago
Director, Reinforcement Learning & Agentic Post-Training EUR 151K-200K

AI Feedback | API Integration | Distributed Training | Environment Design | Evaluation

Executive-level Full Time

Paris, France

1mo ago
Research Scientist Intern (TikTok Recommendation-LLMs, RL, GenAI) - 2026 Start (PhD) USD 136K-221K

Bandit Algorithms | Data Analysis | Deep learning | Generative AI | Language Models

Career growth opportunities | Hands-on project experience | Research mentorship

Entry-level Internship

San Jose, California, United States

1mo ago
Sr Principal Data Scientist INR 3000K-4000K

Agentic AI | Causal Inference | Causal ML | Contextual Bandits | Databricks

Senior-level Full Time

Bangalore, India

1mo ago
Sr. Physical AI Research Scientist CAD 140K-180K

AI alignment | Artificial Intelligence | Computer Vision | Constitutional AI | Continual Learning

Hybrid work schedule

Senior-level Full Time

Toronto, ON, CA

1mo ago
Embodied Intelligence - Reinforcement Learning (Dexterous Manipulation) Intern CNY 25K-37K

Actor-critic | Diffusion Models | Distributed Training | Embodied intelligence | Flow matching

Entry-level Full Time Internship

Shenzhen

1mo ago
Machine Learning Engineer - Reinforcement Learning USD 150K-250K

Data Processing | Deep learning | Distributed Training | Evaluation metrics | Generative Models

Dental insurance | Family leave | Free food and snacks | Health insurance | Life insurance

Senior-level Full Time

Fremont, California, United States

1mo ago