Find jobs in AI/ML, Data Science and Big Data
17 results
for Direct Preference Optimization
(Skill/Tech stack)
-
AWS | Agent Orchestration | Autogen | Autonomous Agents | Direct Preference OptimizationBicycle subsidy | Corporate discounts | Corporate pension plan | Digital meal vouchers | Educational budgetSenior-level Full TimeBerlin, Germany12h ago
-
Staff Software Engineer, Generative AI, Core ML USD 207K-300KAI Feedback | Computer Vision | Data Processing | Deep learning | Digital TwinSenior-level Full TimeMountain View, CA, USA1d ago
-
Mid-level Full TimeBengaluru, Karnataka, India1d ago
-
Machine Learning Engineer (Post-Training) EUR 57K-84KAWS | Data Pipelines | Data-parallel | DeepSpeed | Direct Preference OptimizationSenior-level Full TimeParis, France1d ago
-
Senior Applied Scientist USD 180K-230KDirect Preference Optimization | Distributed Training | Human Feedback | LLM-as-a-Judge | Language ModelsSenior-level Full TimePalo Alto2d ago
-
DDP | Deep learning | Direct Preference Optimization | Distributed Training | DockerSenior-level Full TimePangyo (Software Dream Center), South Korea8d ago
-
Senior Applied Scientist USD 142K-270KDiffusion Models | Direct Preference Optimization | Fine Tuning | Human Feedback | Inference accelerationSenior-level Full TimeSeattle, United States8d ago
-
大模型应用算法工程师/专家 CNY 240K-480KC++ | Computer Vision | Deep learning | Direct Preference Optimization | Human Computer DialogueSenior-level Full Time上海、北京9d ago
-
Senior Applied AI Manager USD 170K-234KAgent systems | Agentic Systems | Curriculum learning | Data Deduplication | Data mixingSenior-level Full TimeSan Mateo, CA9d ago
-
Agent RL Infra Engineer USD 224K-356KAI Feedback | Active Learning | Cluster management | Continuous Learning | Data CurationSenior-level Full TimeUS, CA, Santa Clara, United States11d ago
-
Applied Reinforcement Learning Engineer USD 150K-160KActor-critic | Agent systems | BCQ | Behavioral cloning | CQLEqual opportunity employer | Hybrid remote work | Research publications opportunityMid-level Full TimeRemote Work( USA), United States R14d ago
-
校招-Ai研究科学家-大语言模型/视觉语言模型算法与后训练(博士优先) CNY 500K-500KAdapters | Direct Preference Optimization | Fine Tuning | Flax | Function designNone Full Time上海20d ago
-
Agent Orchestration | Amazon Web Services | Auto Planning | Autogen | Direct Preference OptimizationBicycle subsidy | Corporate discounts | Corporate pension plan | Digital meal vouchers | Educational budgetSenior-level Full TimeBerlin, Germany21d ago
-
Benchmark design | Computer Vision | Deep learning | Direct Preference Optimization | Evaluation metricsCar to go subscriptions | Free parking | Learning opportunities | On site bakery | On-site restaurantsMid-level Full TimeJerusalem21d ago
-
Staff AI Engineer, Model Post-Training and Alignment USD 196K-268KBenchmarking | Deep learning | Direct Preference Optimization | Fine Tuning | Generalized Reward Policy OptimizationCompany events | Comprehensive healthcare | Education subsidy | Learning and development programs | Meal allowancesSenior-level Full TimeAPAC21d ago
-
Automated testing | Cryptography | Direct Preference Optimization | Distributed Systems | DockerSenior-level Full TimeRemote R22d ago
-
Senior AI Research Scientist (6240) USD 170K-270KAdversarial Learning | Attention Networks | Dash | Data Preprocessing | Data WranglingHybrid work schedule | Professional development programs | Travel for training and team buildingSenior-level Full TimeSan Jose, CA, US1mo ago