ML Platform Engineer
Tasks
- Build autoscaling and capacity management
- Collaborate with ML and product teams for model releases
- Design model serving platforms
- Develop deployment workflows: canary releases, shadow testing, and rollback
- Document operational procedures and performance characteristics
- Drive end-to-end observability
- Implement caching, prompt deduplication, and response reuse
- Implement multi-tenant routing, rate limiting, and quality of service
- Implement security controls: request signing, content filtering, and abuse detection
- Integrate model serving with API gateways, identity systems, and observability platforms
- Operate high availability AI services
- Optimize inference performance
- Perform incident response
- Support productionization of AI serving research
- Tune GPU utilization, memory management, and KV cache
Perks/Benefits
- N/A
Skills/Tech-stack
API Gateway | Abuse detection | Autoscaling | C++ | Caching | Canary Deployment | Capacity Planning | Cloud Platforms | Content Filtering | Distributed Systems | GPU Architecture | Go | High Throughput | High-Throughput Systems | Identity Systems | KV cache | Kubernetes | LLM Inference | Low Latency | Low-Latency Systems | Memory Management | Metrics | Multi-tenant | Multi-tenant routing | Observability | Performance Engineering | Python | Quality of Service | Rate Limiting | Request Signing | Rust | Shadow testing | Structured Logging | Tenant Routing | TensorRT-LLM | Tracing | vLLM