VLA Training Infrastructure Algorithm Engineer - XiaomiRobotics
Tasks
- Build a high-throughput data pipeline
- Design the data format and sharding strategy
- Design the distributed training strategy
- Implement VLA model training framework
- Implement operator fusion for performance
- Optimize mixed precision training
- Support large-scale experiment tracking and visualization
Perks/Benefits
- N/A
Skills/Tech-stack
BF16 | C++ | CUDA | DeepSpeed | Distributed Training | FP8 | FSDP | FlashAttention | GPU Computing | Infiniband | Linux | Megatron | Mixed Precision | NCCL | PyTorch | Python | RDMA | Triton
Education
N/A