Find jobs in AI/ML, Data Science and Big Data
12 results for Proximal Policy Optimization (Skill/Tech stack)
- Mid-level · Full Time · Beijing (Remote) · 4h ago
- Staff Software Engineer, Generative AI, Core ML · USD 207K-300K · Skills: AI Feedback | Computer Vision | Data Processing | Deep learning | Digital Twin · Senior-level · Full Time · Mountain View, CA, USA · 1d ago
- Machine Learning Engineer (Post-Training) · EUR 57K-84K · Skills: AWS | Data Pipelines | Data-parallel | DeepSpeed | Direct Preference Optimization · Senior-level · Full Time · Paris, France · 1d ago
- Decision Intelligence Engineer - Next Best Action · USD 129K-177K · Skills: A3C | Backtesting | Bellman Equation | Conservative Q Learning | Constraint Mapping · Benefits: 401k retirement savings plan | Medical, dental, and vision benefits | Occasional travel | Remote work | Time off · Senior-level · Full Time · Remote US, United States · 6d ago
- Senior-level · Full Time · Beijing, Shanghai · 7d ago
- Skills: DDP | Deep learning | Direct Preference Optimization | Distributed Training | Docker · Senior-level · Full Time · Pangyo (Software Dream Center), South Korea · 8d ago
- Large Model Application Algorithm Engineer/Expert · CNY 240K-480K · Skills: C++ | Computer Vision | Deep learning | Direct Preference Optimization | Human Computer Dialogue · Senior-level · Full Time · Shanghai, Beijing · 9d ago
- Tech Lead Manager- MLRE, ML Systems · USD 264K-331K · Skills: CUDA | Distributed Systems | Flash Attention | GRPO | Human Feedback · Benefits: Commuter stipend | Generous PTO | Health, dental and vision coverage | Learning and development stipend | Retirement benefits · Senior-level · Full Time · San Francisco, CA; New York, NY · 9d ago
- Agent RL Infra Engineer · USD 224K-356K · Skills: AI Feedback | Active Learning | Cluster management | Continuous Learning | Data Curation · Senior-level · Full Time · US, CA, Santa Clara, United States · 11d ago
- Applied Reinforcement Learning Engineer · USD 150K-160K · Skills: Actor-critic | Agent systems | BCQ | Behavioral cloning | CQL · Benefits: Equal opportunity employer | Hybrid remote work | Research publications opportunity · Mid-level · Full Time · Remote Work (USA), United States · 14d ago
- Skills: Automated testing | Cryptography | Direct Preference Optimization | Distributed Systems | Docker · Senior-level · Full Time · Remote · 22d ago
- Senior AI Research Scientist (6240) · USD 170K-270K · Skills: Adversarial Learning | Attention Networks | Dash | Data Preprocessing | Data Wrangling · Benefits: Hybrid work schedule | Professional development programs | Travel for training and team building · Senior-level · Full Time · San Jose, CA, US · 1mo ago