aijobs.net

Engineering Manager, Model Inference

SF Office

USD 220K-270K | Mid-level | Full Time

Found 12h ago
Skills/Tech-stack

APIs | Attention Mechanism | Batching | Distributed Systems | Docker | Expert Parallelism | FlashAttention | GPU Performance | GPU Performance Analysis | Grouped Query Attention | Incident Response | Kernel Fusion | Kubernetes | Multi-Head Attention | Observability | Performance Analysis | Pipeline Parallelism | PyTorch | Quantization | Real-time Systems | Tensor Parallelism | TensorFlow | TensorRT | Transformer | vLLM

Education

Bachelor of Engineering | Master of Science | PhD

Roles

Engineering | Engineering Manager | Machine Learning Engineering Manager | Manager

Regions

North America

Countries

United States

States

California, US

Cities

San Francisco, California, US

