Staff Machine Learning Engineer, ML Infrastructure - Online
Tasks
- Automate model packaging and deployment workflows
- Build and improve model serving systems
- Build autoscaling and self-healing capabilities
- Design online inference infrastructure for low latency and reliability
- Develop model deployment and validation infrastructure
- Implement dynamic batching and request scheduling
- Implement traffic splitting and rollback
- Improve observability for online ML systems
- Lead architectural improvements for scalability and cost efficiency
- Optimize inference performance
- Set up canary testing and A/B experimentation
Perks/Benefits
- Commute subsidy
- Disability insurance
- Employee stock ownership
- Generous vacation
- Health insurance
- Life insurance
- Mental health & wellbeing programs
- Retirement plan
- Training and development programs
Skills/Tech-stack
A/B Experimentation | Autoscaling | Caching | Canary testing | Deployment Automation | Distributed Systems | Dynamic batching | Error Rate Monitoring | Google Kubernetes Engine (GKE) | GPU Acceleration | GPU kernel optimization | Kubernetes | Latency optimization | Model Validation | Model compilation | Monitoring | NVIDIA Triton Inference Server | Observability | PyTorch | Python | Quantization | Ray | Ray Serve | Rollback | Runtime tuning | TensorFlow Serving | Throughput Optimization | TorchServe | Traffic splitting
Education
N/A