MLOps Support Team Lead
USD 150K-208K (estimate) | Senior-level | Full Time
Tasks
- Create and maintain runbooks and playbooks
- Define MLOps support operating model
- Define on-call rotations and coverage
- Drive automation and self-healing improvements
- Drive corrective actions
- Establish SLAs and SLOs
- Implement monitoring for pipelines and data flows
- Implement observability for infrastructure and compute
- Improve instrumentation, logging, and alerting
- Improve time to detect and time to resolve
- Lead global MLOps support team
- Manage major incident escalation
- Manage support processes across partners and stakeholders
- Monitor model performance and drift
- Own production ML reliability
- Perform root cause analysis
- Reduce repeat incidents
- Run incident triage and resolution
- Standardize support intake, triage, and resolution
- Support onboarding into standardized support model
- Track operational metrics and service health
Perks/Benefits
- N/A
Skills/Tech-stack
AWS | Azure | Bash | Bias Monitoring | Cloud Platforms | Data Integrity | Databricks | DevOps | Docker | Google Cloud Platform | Grafana | Incident Management | Kubernetes | MLOps | MLflow | Machine Learning | Model Drift | Model Monitoring | Monitoring | New Relic | Observability | Power BI | Python | Reliability Engineering | Root Cause Analysis | SQL | Service Level Agreement | Service Level Objective | Site Reliability Engineering | Time to Detect | Time to Resolve
Education
N/A