Agent RL Infra Engineer
USD 224K-356K | Senior-level | Full Time
Tasks
- Build observability for training runs
- Build reinforcement learning cookbooks and blueprints
- Collaborate on safe deployment of training outputs
- Design reinforcement learning training loops
- Design verifiable reward environments
- Ensure security and governance compliance
- Evaluate and adapt reinforcement learning approaches
- Integrate NeMo microservices for end-to-end data flywheel workflows
- Integrate distributed training on GPU infrastructure
- Lead data curation and active learning strategies
- Operationalize training backends as production services
Perks/Benefits
- N/A
Skills/Tech-stack
AI Feedback | Active Learning | Cluster management | Continuous Learning | Data Curation | Data Flywheel | DeepSpeed | Direct Preference Optimization | Distributed Training | FSDP | GPU Cluster | GPU Cluster Management | Go | Group Relative Policy Optimization | Gym | Hugging Face | Hugging Face Accelerate | Job orchestration | ML Ops | Megatron | Microservices | NeMo | Observability | Pipeline Automation | Policy Optimization | Preference optimization | Proximal Policy Optimization | Python | Reinforcement Learning | Reinforcement Learning from AI Feedback | Reward Modeling | Rust | Safety constraints