Software Engineer, AI and DL Kernel Libraries
Tasks
- Analyze performance profiles and optimize workloads
- Build deep learning library abstractions
- Build just-in-time compilation and code generation
- Collaborate with compiler and GPU architecture teams
- Contribute to open-source inference ecosystems
- Design, implement, and optimize GPU kernels
- Develop production AI inference software
- Integrate and improve LLM inference runtimes
Perks/Benefits
- N/A
Skills/Tech-stack
API Design | Apache TVM | C# | C++ | CUDA | CUDA C++ | cuDNN | Code generation | Code optimization | GPU Performance | GPU performance modeling | JAX | Just-in-Time | Just-in-time compilation | Linear Algebra | MLIR | ONNX | Performance Modeling | Performance Profiling | PyTorch | Python | System Architecture | TensorFlow | TensorRT | Triton
Education
Bachelor of Engineering | Bachelor of Science | Master of Science