Senior Performance Analyst, Inference
Tasks
- Build and maintain competitive pricing models
- Build reproducible inference benchmarks
- Create deal-specific competitive analyses for enterprise prospects
- Design standardized benchmark suites for inference workloads
- Measure tokens per second, time to first token, latency under concurrency, and TCO
- Monitor industry announcements and pricing changes
- Partner with sales to recommend pricing for contracts
- Synthesize findings into competitive briefs for sales and product
- Track third-party benchmarking sources and validate measurements
Perks/Benefits
- N/A
Skills/Tech-stack
Attention Mechanism | CUDA | Flash Attention | GPU kernel optimization | KV cache | Kernel optimization | Quantization | SGLang | TensorRT | TensorRT-LLM | Transformer | Triton | vLLM
Education
N/A