aijobs.net

Senior Inference Engineer, AIConfigurator for Dynamo

Santa Clara, CA, United States

USD 184K-356K | Senior-level | Full Time

Found 4d ago
Skills/Tech-stack

Batching | Distributed Systems | Expert Parallelism | GPU Computing | High-Performance Computing | Inference Server | KV Cache | Kubernetes | LLM Inference | Latency Estimation | Machine Learning | Machine Learning Infrastructure | Memory Management | NCCL | NIXL | NVSHMEM | Pipeline Parallelism | Prefill-Decode Disaggregation | Python | Rust | SGLang | Tensor Parallelism | TensorRT-LLM | Triton Inference Server | vLLM

Education

Bachelor of Science | Master of Science | PhD

Roles

Engineer | Inference Engineer | Senior Inference Engineer

Regions

North America

Countries

United States

States

California, US

Cities

Santa Clara, California, US
