Research Engineer - LLM/VLM Inference Optimization (Seed Infra)
San Jose, California, United States
USD 244K-450K | Mid-level | Full Time
Tasks
- Build compiler-level optimized inference pipelines
- Collaborate with teams to improve model toolchains and ecosystem
- Design high-performance inference systems for large-scale LLMs and VLMs
- Develop CUDA kernels and low-precision inference computation
- Develop and optimize inference engines and serving frameworks
- Optimize inference throughput with streaming and speculative decoding
- Analyze performance and identify bottlenecks
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | CUDA kernels | Compiler optimization | Graph fusion | High-performance computing | Inference optimization | Low-precision computing | Parallel computing | Performance analysis | Speculative decoding | Streaming inference
Education
N/A