Find jobs in AI/ML, Data Science and Big Data
17 results for Proximal Policy Optimization (Skill/Tech stack)
-
Embodied Intelligence - Reinforcement Learning (Dexterous Manipulation) Intern CNY 25K-37K
Actor-critic | Diffusion Models | Distributed Training | Embodied intelligence | Flow matching
Entry-level | Full Time Internship | Shenzhen | 7d ago
-
Mid-level | Internship | Shanghai | 7d ago
-
Senior Staff AI Engineer USD 180K-240K
A3C | Actor-critic | Adaptive computation | Benchmarks | C plus plus
Senior-level | Full Time | Los Altos, California | 7d ago
-
Mid-level | Full Time | Beijing | R | 10d ago
-
AI Feedback | Agentic Systems | Direct Preference Optimization | Distributed Training | Evaluation
Senior-level | Full Time | AMER - United States - California … | R | 15d ago
-
Applied Machine Learning Engineer USD 110K-165K
A3C | Apache Kafka | C plus plus | C# | CUDA
401k plan | Education assistance | Flexible work schedules | Health care and wellness plans | Paid Holidays
Senior-level | Full Time | Colorado Springs, United States | 15d ago
-
Applied AI Engineer USD 99K-225K
AWS | AgentOps | Azure | ChromaDB | Continued Pretraining
401k retirement plan | Bike storage | Commuter benefits | Dependent care FSA | Desk setup stipend
Mid-level | Full Time | Washington DC | 23d ago
-
AI Scientist GBP 46K-46K
Azure | Azure OpenAI | Azure OpenAI Services | Databricks | Dataset Preparation
Mid-level | Full Time | London, United Kingdom | 26d ago
-
Senior Machine Learning Engineer, RL / Locomotion USD 220K-336K
Actor-critic | Domain Randomization | GPU Computing | Isaac Lab | Isaac-Gym
Health benefits | Recovery Benefits
Senior-level | Full Time | Costa Mesa, California, United States | 26d ago
-
Head of World Models (Universal Robots, India) INR 3000K-6000K
AI orchestration | Actor-critic | Agent Frameworks | Autogen | DPO
Executive-level | Full Time | Bangalore, IN | 28d ago
-
Agentic Systems | Deep learning | Diffusion Models | Fine Tuning | Generative AI
401k eligibility | Annual bonus | Dental insurance | Medical insurance | Paid time off
Senior-level | Full Time | Los Altos, CA | 1mo ago
-
Adversarial Networks | Computer Vision | Cross-modal alignment | GRPO | Generative Adversarial Networks
Entry-level | Internship | Seattle, Washington, United States | 1mo ago
-
Data Analysis | Dataset Processing | Direct Preference Optimization | Evaluation Pipelines | Fine Tuning
Entry-level | Internship | San Jose, California, United States | 1mo ago
-
Actor-critic | Air Traffic Management | Air traffic | Machine Learning | Optimization
Flexible working space | Informal corporate culture | Thesis assignment allowance
Entry-level | Internship | Amsterdam, Noord-Holland, Nederland | R | 1mo ago
-
Tech Lead, Robotic AI Model USD 150K-180K
Action Chunking | Action Tokenization | Behavior Cloning | DPO | DeepSpeed
Senior-level | Full Time | El Segundo, California, United States | 1mo ago
-
Senior AI Engineer Specialist INR 2500K-3500K
Agentic AI | Apache Spark | Direct Preference Optimization | Distributed Computing | Embedding architectures
Senior-level | Full Time | IND - Bengaluru - Esko-Graphics India … | 1mo ago
-
Robotics & Reinforcement Learning Engineer EUR 60K-84K
Actor-critic | Actuator modeling | Behavior Cloning | C++ | Control Systems
Annual leave | Early Friday finish | Flexible working hours | Free coffee and tea | Permanent full-time contract
Senior-level | Contract Full Time | Barcelona, CT, Spain | 1mo ago