Find jobs in AI/ML, Data Science and Big Data
2 results for Sparse Reward (Skill/Tech stack)
- LLM Post-Training / Agentic Algorithm Engineer
  CNY 180K-360K
  Agentic RL | DAPO | Distributed Training | Evaluation | Function Calling
  Entry-level, Full Time
  Shanghai, Beijing
  9h ago
- AI Research Engineer - RL Manipulation
  CHF 123K-176K
  Credit Assignment | Domain Randomization | Exploration | Imitation Learning | Model-based reinforcement learning
  Bias for action | Collaborative team | Independent ownership
  Senior-level, Full Time
  Zürich, Zurich, Switzerland
  1mo ago