Agent Post-Training Research
Tasks
- Build evals and environments to identify model failures
- Debug model failures and define hypotheses, experiments, and fixes
- Decide which integrations and fixes go into major model runs
- Design and run experiments to improve agent behavior
- Develop training signal with data mixtures, objectives, and synthetic data
- Improve training and launch reliability, observability, and reproducibility
- Own post-training improvements end to end
- Translate product signal into model improvements
Perks/Benefits
- N/A
Skills/Tech-stack
AI Feedback | Agent systems | Calibrated Reasoning | Data Pipelines | Deep learning | Experiment design | Factuality | Function Calling | Grader Systems | Human Feedback | Instruction following | Language Models | Large Language Models | Learning from Human Feedback | Machine Learning | Model Evaluation | Multi-Agent | Multi-Agent Systems | Post-training | RLAIF | RLHF | Reinforcement Learning | Reinforcement Learning from AI Feedback | Reinforcement Learning from Human Feedback | Software Engineering | Statistical Analysis | Synthetic data | Tool use