Machine Learning Engineer - Reinforcement Learning
Tasks
- Build evaluation frameworks for model and agent performance
- Create data pipelines for preference, synthetic, and human-feedback data
- Design LLM-powered agent environments for decision-making
- Design and iterate reward functions for agent behaviors
- Detect reward exploitation and proxy-metric optimization
- Develop tooling for testing, debugging, and deploying AI workflows
- Document experiments, results, and learnings
- Experiment with reinforcement learning, RLHF, RLAIF, reward shaping, and policy optimization
- Fine-tune and evaluate LLMs for domain reasoning
- Improve reward models, prompts, tools, and feedback loops
- Review LLM traces and rollouts for failure modes and reward hacking
Perks/Benefits
- N/A
Skills/Tech-stack
Data Pipelines | Evaluation | Fine Tuning | Human Feedback | LLM Fine-tuning | Learning from Human Feedback | Machine Learning | Policy Optimization | Prompt engineering | PyTorch | Python | Reinforcement Learning | Reinforcement Learning from Human Feedback | Reward Modeling | Reward shaping | Synthetic data
Education
N/A