Engineering Manager, Model Serving
San Francisco, CA
USD 250K-300K Mid-level Full Time
Found 1d ago
Tasks
- Build automation and tooling for operations
- Ensure availability and performance SLAs for inference and training
- Lead incident response and reliability improvements
- Mentor team members and support hiring
- Partner with teams to improve system reliability and efficiency
Skills/Tech-stack
Automation | Configuration Management | Deployment Architecture | Incident Management | Inference frameworks | Kubernetes | ML Serving | ML inference | ML inference frameworks | ML serving systems | Monitoring | Multi-tenant | Multi-tenant SaaS | Multi-tenant SaaS platforms | SaaS platforms | Serving systems | System Reliability
Education
N/A