Senior AI Operations Engineer
Tasks
- Build AI platform monitoring system
- Build, deploy, and manage AI development environment
- Build, deploy, and manage AI inference environment
- Build, deploy, and manage AI training environment
- Collaborate with AI and business teams to gather requirements
- Create incident response plan
- Ensure 24/7 platform stability
- Implement resource management strategy
- Maintain and optimize AI platform architecture
- Manage Docker and Kubernetes deployments
- Manage deep learning framework environments
- Monitor platform status with Prometheus
- Optimize resource utilization
- Provide technical support for model training and deployment
- Respond quickly to production incidents
- Schedule and allocate AI compute resources
- Search and analyze logs with ELK
- Set resource priority for AI training tasks
- Troubleshoot system failures and performance bottlenecks
- Visualize metrics with Grafana
Perks/Benefits
- N/A
Skills/Tech-stack
Docker | ELK | Grafana | Incident Response | Kubernetes | Linux | MXNet | Performance Tuning | Prometheus | PyTorch | Resource Management | Resource Scheduling | Systems Troubleshooting | TensorFlow
Education
Roles
AI | AI DevOps Engineer | AI Operations Engineer | DevOps Engineer | Engineer | Operations Engineer