Software Engineer III - AI/ML Platform Operations - Remote
USD 105K-140K | Senior-level | Full Time
Tasks
- Build dashboards, health metrics, alerts, logging, and runbooks
- Communicate technical concepts to technical and non-technical audiences
- Design automation, monitoring, and observability tooling
- Enhance CI/CD, deployment automation, and infrastructure as code
- Ensure platform reliability, scalability, performance, and security
- Establish operational standards, SLOs, and governance
- Influence platform strategy, architecture, and operational processes
- Lead AI platform operations
- Lead incident management and escalation
- Manage model release process and model lifecycle
- Perform root cause analysis and drive corrective and preventive actions
- Provide technical mentorship and operational guidance
- Support deployment, monitoring, maintenance, and lifecycle management
- Troubleshoot performance, availability, deployment, and integration issues
Perks/Benefits
- 401k matching
- Annual bonus eligibility
- Career growth and mentorship
- Inclusion and Belonging Culture
- Remote work and flexible workplace
Skills/Tech-stack
AWS | AWS Bedrock | Alerting | Amazon SageMaker | CI/CD | CloudWatch | Datadog | Docker | GitHub Actions | Grafana | Incident Management | Infrastructure as Code | Java | JavaScript | Jenkins | Kubernetes | Logging | MLOps | Maven | Model Governance | Model Lifecycle Management | Model Monitoring | Monitoring | Node.js | Observability | OpenTelemetry | Palantir Foundry | Prometheus | Python | Root Cause Analysis | Service-Level Objectives | Splunk | TypeScript