Deep Learning Compiler Engineer - CUDA
Tasks
- Design a DSL for a tile-aware GPU programming model
- Implement the core compiler for the tile-aware GPU programming model
- Integrate the compiler with AI/ML frameworks
- Integrate solutions into the DSL and compiler stack
- Investigate next-generation GPU architectures
- Optimize the compiler architecture for performance
- Perform performance analysis for AI/LLM workloads
Skills/Tech-stack
AI/ML | AI/ML Integration | C# | C++ | Compiler design | Computer Architecture | DSL | Distributed communication | GPU Architecture | Kernel programming | LLVM | ML integration | MLIR | Multi-GPU | Parallel Computing | Performance Analysis | TVM | Triton