Senior Software Engineer, RL Post-Training Frameworks
USD 184K-356K | Senior-level | Full Time
Tasks
- Architect RL post-training infrastructure
- Build distributed RL training, inference, and rollout loops
- Coordinate actor, critic, and reward models across heterogeneous hardware
- Design failure recovery approaches for distributed systems
- Implement fault tolerance, elastic scaling, and fast restarts
- Improve open-source RL frameworks
- Integrate CPU-driven rollout workloads for tool use, code execution, and agent environments
- Partner with framework owners and research teams
- Tune performance for GPUs, CPUs, and LPUs
Skills/Tech-stack
Actor-based programming | C# | C++ | Consistency models | DPO | DeepSpeed | Distributed systems | FSDP | FSDP2 | Failure recovery | GRPO | High-performance computing | High-performance inference | InfiniBand | Kubernetes | LLM post-training | MoE | Megatron-LM | Mixed precision | NCCL | NVLink | PPO | Pipeline parallelism | Post-training | PyTorch | Python | Quantization-aware training | RLHF | Ray | Reinforcement learning | Reinforcement learning for LLM post-training | Reward modeling | Service boundaries | Task-based programming | Tensor parallelism | TensorRT-LLM | vLLM