Staff Software Engineer, Inference Platform
Tasks
- Architect active-active systems with rapid failover
- Collaborate with ML, product, infrastructure, and cloud teams
- Design inference platform orchestration
- Design service boundaries and failure domains
- Develop Kubernetes operators and custom resource definitions
- Drive reliability with SLOs and resilience
- Implement observability (metrics, logging, tracing, alerting)
- Improve latency, throughput, capacity, and efficiency
- Lead production incidents and incident response
- Plan capacity and perform post-incident improvements
- Write and review production code on critical paths
Perks/Benefits
- N/A
Skills/Tech-stack
Active/Active | Alerting | C++ | CI/CD | Debugging | Distributed Systems | Failover | Go | Incident Response | Kubernetes | Logging | mTLS | Metrics | Networking | Observability | SLO | Security Certificates | TLS | Tracing
Education
N/A
Roles
Backend | Backend Engineer | Engineer | Software Engineer | Staff Software Engineer