aijobs.net

Principal Engineer – Gen AI Platform Inferencing Engineering

142019-NC-300 South Brevard, Charlotte, United States

USD 305K | Senior-level | Full Time

Found 21h ago
Skills/Tech-stack

Autoscaling | CUDA | CUDA MIG | Concurrency Control | Continuous batching | Grafana | Helm | Inference Server | KServe | Knative | Kubernetes | Kueue | Kustomize | Load Testing | Machine Learning | Model Deployment | NCCL | NVIDIA Dynamo | Observability | OpenShift | OpenShift AI | Prefix caching | Prometheus | Python | Quantization | Run:ai | SGLang | Speculative decoding | TensorRT | TensorRT-LLM | Triton Inference | Triton Inference Server | vLLM | Volcano

Education

N/A

Roles

Engineer | Principal | Principal Engineer

Regions

North America

Countries

United States

States

North Carolina, US

Cities

Charlotte, North Carolina, US

