Forward Deployed Engineer, AI Inference (vLLM and Kubernetes)
Tasks
- Collect and communicate field feedback to engineering teams
- Debug issues involving model architecture, hardware accelerators, and Kubernetes networking
- Deploy and configure LLM-D and vLLM on Kubernetes
- Optimize inference systems for performance and latency
- Travel to customer sites as needed
- Write production-quality code in Python, Go, and YAML
Perks/Benefits
- Disability benefit
- Medical/Dental/Vision
- Paid time off & holidays
- Parental leave
- Retirement 401(k)
- Stock purchase plan
- Tuition reimbursement
Skills/Tech-stack
AI Inference | Bare Metal | Bare-metal clusters | Cloud deployment | Custom Resources | Distributed Systems | GPU tuning | Go | Helm | Ingress | K8s primitives | Kubernetes | Language Models | Large Language Models | Model caching | Networking | Performance Benchmarking | Python | Terraform