Research Engineer - LLM/VLM Inference Optimization (Seed Infra)
San Jose, California, United States
USD 244K-450K | Mid-level | Full Time
Tasks
- Apply low-precision computation
- Build inference performance optimization techniques
- Build streaming inference
- Conduct performance analysis
- Design high-performance LLM and VLM inference systems
- Develop CUDA kernels
- Develop compiler-level optimizations
- Develop inference engines and serving frameworks
- Develop model toolchains
- Implement parallel computing
- Implement speculative decoding
- Optimize graph fusion
- Optimize handling of high-concurrency requests
- Optimize large model inference
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Compiler optimization | Graph optimization | High concurrency | Low-precision computing | Parallel Computing | Performance Analysis | Precision computing | Speculative decoding | Streaming inference
Education
N/A