Applied Reinforcement Learning Engineer 2
USD 150K-300K Mid-level Full Time
Tasks
- Architect multi-step reasoning agents with tool calling and closed learning loops
- Build end-to-end pipelines from human-labeled traces to RL training data
- Design and build RL environments for enterprise workflows
- Design reward functions, verifiers, and validation frameworks
- Train LLM-based agents using PPO, GRPO, DPO, and RLHF
- Translate RL research into production systems
Perks/Benefits
- N/A
Skills/Tech-stack
ActorCritic | BCQ | BehavioralCloning | CQL | DQN | DeepReinforcementLearning | DirectPreferenceOptimization | DistributedTraining | DomainRandomization | DoubleDQN | Dreamer | DuelingDQN | GAIL | Gymnasium | HierarchicalReinforcementLearning | IQL | JAX | LargeLanguageModels | MarkovDecisionProcess | ModelBasedReinforcementLearning | MuZero | MultiAgentSystems | OfflineReinforcementLearning | OpenAI Gym | OptionsFramework | PPO | PolicyGradient | PreferenceLearning | PyTorch | Python | Q-learning | ReinforcementLearning | ReinforcementLearningFromHumanFeedback | RewardModeling | RLlib | SAC | SimToReal | Simulation | StableBaselines | TD Lambda | TRPO | TensorFlow | ToolUse | WorldModels
Education