Staff Software Engineer - AI Research Infrastructure
New York City, New York; San Francisco, California
USD 199K-270K | Senior-level | Full Time
Tasks
- Build CI testing infrastructure for research code
- Build services for scheduling and orchestration
- Convert experimental workloads into robust, repeatable pipelines
- Create abstractions for job submission and management
- Design infrastructure for large-scale experiments
- Develop monitoring and observability for workloads
- Develop workflows that reduce iteration time
- Improve research developer productivity tooling
- Mentor engineers on compute infrastructure and AI systems
Perks/Benefits
- N/A
Skills/Tech-stack
Backend Services | CI | Cluster Management | Data Pipelines | Distributed Systems | Distributed Training | Fine Tuning | GPU Computing | High-Performance Computing | Job Scheduling | Kubernetes | Model Evaluation | Model Parallelism | Monitoring | Observability | Ray | Resource Management | Slurm | Testing