Find jobs in AI/ML, Data Science and Big Data
1 result for Generalized Reward Policy Optimization (Skill/Tech stack)
Senior Principal Machine Learning Engineer (Fulfilment)
SGD 182K-240K
Decision Processes | DeepSpeed | Direct Preference Optimization | Distributed Training | Dynamic Models
Birthday leave | Confidential Assistance Programme | FlexWork | Medical insurance | Parental leave
Executive-level | Full Time
Singapore, Singapore
21d ago