aijobs.net

Director, Reinforcement Learning & Agentic Post-Training

Paris, France

EUR 151K-200K (estimate) | Executive-level | Full Time

Found 6d ago
Skills/Tech-stack

AI Feedback | API Integration | Distributed Training | Environment Design | Evaluation | Experiment tracking | Fine Tuning | GRPO | Human Feedback | Language Models | Large Language Models | Learning from Human Feedback | Megatron | Megatron-LM | NeMo | NVIDIA NeMo | NVIDIA Nemotron | Observability | Offline Reinforcement Learning | PPO | Policy Optimization | Preference optimization | PyTorch | Python | Ray | Reinforcement Learning | Reinforcement Learning from AI Feedback | Reinforcement Learning from Human Feedback | Rejection Sampling | Reproducibility | Reward Modeling | Reward shaping | Rollback Safety | Supervised Fine Tuning | Tool use | vLLM

Education

N/A

Roles

Director | Director Machine Learning | Director Reinforcement Learning

Regions

Europe

Countries

France

States

Île-de-France, FR

Cities

Paris, Île-de-France, FR
