Large Model Algorithm Engineer (Open-Domain Dialogue)
Tasks
- Apply DPO
- Apply GRPO
- Apply PPO
- Build an end-to-end dialogue data pipeline
- Clean and deduplicate the raw corpus
- Deploy low-latency inference on edge or cloud
- Develop LLM algorithms for open-domain dialogue
- Evaluate dialogue quality with offline evaluation
- Implement multi-turn dialogue state tracking (DST)
- Improve intent recognition and personalization
- Improve multi-turn dialogue decision-making using agentic RL
- Optimize base model with SFT
- Optimize models with RLHF
- Perform prompt engineering
- Reduce hallucinations in tool use
- Run offline evaluation and online A/B testing
- Support model quantization, distillation, and inference acceleration
- Train a reward model for reinforcement learning
Perks/Benefits
- N/A
Skills/Tech-stack
A/B Testing | Agentic RL | DPO | Data Deduplication | Data Cleaning | DeepSpeed | Distributed Training | Fine-Tuning | Function Calling | GRPO | Inference Optimization | Knowledge Distillation | Language Models | Large Language Models | Model Quantization | OpenRLHF | Policy Optimization | Prompt Engineering | Proximal Policy Optimization | Python | RLAIF | RLHF | ReAct | Reinforcement Learning | Reward Modeling | Supervised Fine-Tuning | TRL | Thought Intermediate Result | vLLM | VeRL