Model Deployment and Inference Optimization Engineer
Tasks
- Adapt inference engine to target hardware
- Automate model packaging, remote push, on-device loading, and validation
- Build a one-click model deployment toolchain
- Create inference performance benchmark suite
- Deploy VLA models to edge devices
- Implement hot model updates and version rollback
- Improve inference performance
- Monitor latency, accuracy, and resource usage after updates
- Optimize models with quantization, pruning, and distillation
- Research and apply efficient inference frameworks and optimization algorithms
Perks/Benefits
- N/A
Skills/Tech-stack
C++ | Edge inference | Inference Performance | Inference Performance Optimization | Model Distillation | Model Pruning | Model Quantization | ONNX Runtime | Performance optimization | Python | TensorRT
Education
Bachelor of Engineering | Bachelor of Science | Master of Science