Applied Reinforcement Learning Engineer
Tasks
- Build RLHF and post-training pipelines
- Convert human-labeled traces into RL training data
- Design RL environments for enterprise workflows
- Design reward functions and verifiers
- Implement multi-step tool-calling agent workflows
- Train LLM-based agents with reinforcement learning
- Translate RL research into production systems
Perks/Benefits
Skills/Tech-stack
Actor-critic | Agent systems | BCQ | Behavioral cloning | CQL | DPO | Decision Processes | Deep Q Networks | Direct Preference Optimization | Double Deep Q Networks | Dreamer | Dueling Deep Q Networks | Eligibility Traces | GAIL | Gymnasium | Hierarchical reinforcement learning | Human Feedback | IQL | JAX | Learning from Human Feedback | Markov Decision Processes | Model-based reinforcement learning | MuZero | Multi-Agent | Multi-Agent Systems | Offline Reinforcement Learning | OpenAI Gym | Policy Gradient | Policy Optimization | Preference optimization | Proximal Policy Optimization | PyTorch | Python | Q-learning | Reinforcement Learning | Reinforcement Learning from Human Feedback | Reward Modeling | RLlib | Soft Actor-Critic | Stable Baselines | Temporal Difference Learning | TensorFlow | Trust Region Policy Optimization | World Models
Education