Find jobs in AI/ML, Data Science and Big Data
17 results for PPO (Skill/Tech stack)
-
Mid-level Full Time · Beijing · R · 1d ago
-
Adversarial Robustness | Agent learning | Audio Processing | Computer Vision | Content Moderation · Career growth | Research mentorship · None Full Time · San Jose, California, United States · 1d ago
-
AIGC Detection | Adversarial Learning | Agentic Systems | Cross-modal alignment | GRPO · None Full Time · Seattle, Washington, United States · 1d ago
-
Adversarial Networks | Adversarial Training | Cross-modal alignment | GRPO | Generative Adversarial Networks · Entry-level Internship · San Jose, California, United States · 1d ago
-
Applied Reinforcement Learning Engineer 2 · USD 150K-300K · ActorCritic | BCQ | BehavioralCloning | CQL | DQN · Mid-level Full Time · Redmond, Washington, United States · 3d ago
-
Senior Software Engineer, RL Post-Training Frameworks · USD 184K-356K · Actor Based Programming | C# | C++ | Consistency models | DPO · Comprehensive benefits | Equity · Senior-level Full Time · US, CA, Santa Clara, United States · 8d ago
-
Tech Lead, Robotic AI Model · USD 150K-180K · Action Chunking | Action Tokenization | Behavior Cloning | DPO | DeepSpeed · Senior-level Full Time · El Segundo, California, United States · 9d ago
-
Entry-level Internship · Shanghai · 9d ago
-
Mid-level Internship · Shanghai · 9d ago
-
Robotics ML Expert, AI · USD 60K-60K · Agent systems | Control Theory | Dm_control | Domain Randomization | Drake · Async collaboration | Fully remote | Independent contractor 1099 · Senior-level Full Time · Miami · R · 9d ago
-
Research Scientist, LLM Evaluation & Post-Training · USD 150K-160K · Benchmarking | Context evaluation | DPO | Data Processing | Error Analysis · Senior-level Full Time · Remote Work (USA), United States · R · 10d ago
-
Senior Solutions Architect, Retail · USD 184K-356K · API Integration | Agent systems | Agents SDK | Benchmarking | C++ · Equity | Health benefits | Paid time off · Senior-level Full Time · US, CA, Remote, United States · R · 10d ago
-
C++ | Deep learning | GPU clusters | GRPO | High Performance · Equity | Healthcare benefits | Paid time off | Retirement benefits · Senior-level Full Time · US, CA, Santa Clara, United States · 16d ago
-
AI/ML Research Scientist, LLM Post-Training & Evaluation · USD 150K-160K · Alignment | Benchmarking | DPO | Data Processing | Error Analysis · Mid-level Full Time · Redmond, Washington, United States · 17d ago
-
Engineer - ML & RL · CAD 93K-116K · Agent systems | Contextual bandit | Deep learning | DeepSpeed | Distributed Training · Mid-level Contract Full Time · Edmonton, Alberta, Canada · 21d ago
-
AI Research Scientist - Safety Alignment Team · USD 213K-293K · Adversarial prompts | Automation | Computer Vision | DPO | Dataset curation · Senior-level Full Time · Menlo Park, CA · 24d ago
-
Senior Staff Software Engineer, Model LifeCycle · USD 237K-288K · API Design | CUDA | Checkpointing | DPO | DeepSpeed · 401k matching | Cell phone stipend | Commuter benefits | Dental insurance | HSA employer contributions · Senior-level Full Time · Tel Aviv - IL · 1mo ago