Engineering Manager, LLM Performance
Tasks
- Design, implement, and optimize LLM inference features
- Develop software for LLM deployment and developer experience
- Improve LLM inference performance on NVIDIA datacenter architectures
- Lead a team focused on LLM inference performance
- Plan projects, deliver milestones, and coordinate cross-functional teams
- Tune performance using inference benchmarks
Perks/Benefits
- N/A
Skills/Tech-stack
API Development | C++ | CUDA | GPU Architecture | LLM Inference | Performance Tuning | Python | SGLang | System Performance | System Performance Tuning | TensorRT-LLM | vLLM