LLM Post-Training / Agentic Algorithm Engineer
Tasks
- Build LLM agents for automotive cockpit scenarios
- Build training and interaction environments for agents
- Design reward functions and reward modeling
- Develop multi-tool calling and function calling
- Implement tool systems and user simulators
- Improve LLM post-training and agentic RL
- Optimize problem attribution and training data pipelines
- Reproduce and track LLM post-training research
- Research, train, optimize, and evaluate vehicle dialogue systems
- Run offline evaluation and online analysis loops
Perks/Benefits
- N/A
Skills/Tech-stack
Agentic RL | DAPO | Distributed Training | Function Calling | GRPO | Java | Language Processing | Long-Term Task Learning | Machine Learning | Memory | Multi-Modal | Natural Language | Natural Language Processing | On-Policy | On-Policy Distillation | OpenRLHF | PPO | Planning | Policy Distillation | Preference Learning | Python | RLHF | RLVR | React | Reflection | Reinforcement Learning | Reward Modeling | Sparse Reward | Sparse Reward Modeling | TRL | Tool-Integrated Reasoning | Trajectory Optimization | TypeScript | VeRL