Machine Learning Engineer - Reinforcement Learning
Tasks
- Build evaluation frameworks for model and agent performance
- Create data pipelines for preference, synthetic, and human feedback data
- Design LLM-powered agent environments for decision making
- Design and iterate reward functions for agent behaviors
- Detect reward exploitation and proxy-metric optimization
- Develop tooling for testing, debugging, and deployment of AI workflows
- Document experiments, results, and learnings
- Experiment with reinforcement learning, RLHF, RLAIF, reward shaping, and policy optimization
- Fine-tune and evaluate LLMs for domain reasoning
- Improve reward models, prompts, tools, and feedback loops
- Review LLM traces and rollouts for failure modes and reward hacking
Perks/Benefits
- N/A
Skills/Tech-stack
Data Pipelines | Evaluation | Fine-Tuning | Human Feedback | LLM Fine-Tuning | Learning from Human Feedback | Machine Learning | Policy Optimization | Prompt Engineering | PyTorch | Python | Reinforcement Learning | Reinforcement Learning from Human Feedback | Reward Modeling | Reward Shaping | Synthetic Data
Education
N/A