Research Engineer - LLM/VLM Inference Optimization (Seed Infra)
San Jose, California, United States
USD 244K-450K | Mid-level | Full Time
Tasks
- Apply low-precision computation
- Build inference performance optimization techniques
- Build streaming inference
- Conduct performance analysis
- Design high-performance LLM and VLM inference systems
- Develop CUDA kernels
- Develop compiler-level optimizations
- Develop inference engines and serving frameworks
- Develop model toolchains
- Implement parallel computing
- Implement speculative decoding
- Optimize graph fusion
- Optimize handling of high-concurrency requests
- Optimize large model inference
Perks/Benefits
- N/A
Skills/Tech-stack
CUDA | Compiler optimization | Graph optimization | High concurrency | Low-precision computing | Parallel Computing | Performance Analysis | Precision computing | Speculative decoding | Streaming inference
Education
N/A