Staff Software Engineer, Model LifeCycle
Tasks
- Build and maintain APIs and abstractions for model training and deployment
- Build fine-tuning systems
- Create agent execution infrastructure
- Design a managed model lifecycle platform
- Design versioning, lineage, and reproducibility features
- Develop reinforcement learning training workflows
- Implement dataset, model, and experiment lifecycle management
- Implement distributed training pipelines
- Optimize training runtimes, scheduling, and storage
Perks/Benefits
- 401k match
- Cell phone stipend
- Commuter benefits
- Dental insurance
- Employer HSA contributions
- Health insurance
- Long-term disability
- Mental health support
- Paid life insurance
- Paid parental leave
- Paid time off
- Professional development
- Restricted stock units
- Short-term disability
- Tuition reimbursement
- Vision insurance
- Volunteer time off
Skills/Tech-stack
API Design | Checkpointing | Distributed Training | Failure recovery | Fine Tuning | GPU Optimization | Golang | Inference Optimization | Language Models | Large Language Models | LoRA | Managed databases | Multimodal AI | Object storage | Open Source | Policy Optimization | Preference optimization | Private Cloud | PyTorch | Python | Reinforcement Learning | Scheduling | Version control | Virtual Private Cloud
Education
Bachelor of Engineering | Bachelor of Science | Master of Science