Research Engineer - LLM/VLM Inference Optimization (Seed Infra)

San Jose, California, United States

USD 244K-450K Mid-level Full Time

@ B...

Found 1mo ago

Tasks

Build compiler-level optimized inference pipelines
Collaborate with other teams to improve model toolchains and the surrounding ecosystem
Design high-performance inference systems for large-scale LLMs and VLMs
Develop CUDA kernels and low-precision inference computation
Develop and optimize inference engines and serving frameworks
Optimize inference throughput with streaming and speculative decoding
Analyze performance and identify bottlenecks

Education

N/A

Related jobs

Perception Engineer III USD 155K-185K

ArduPilot | C++ | CUDA | Coordinate frames | DDS

401k company match | Annual Company Holidays | Life insurance | Medical, dental & vision coverage | Paid time off

Senior-level Full Time

United States

16h ago
AI Software Engineer USD 80K-210K

AWS Kinesis | Apache Airflow | Apache Kafka | C# | C++

401k | Equity grant | Full healthcare coverage | Unlimited PTO

Senior-level Full Time

El Segundo, CA

1d ago
ML Research Engineer - Hardware Codesign USD 185K-455K

C++ | CUDA | Floating point | Floating point numerics | Functional simulation

Hybrid work schedule | Relocation assistance

Senior-level Full Time

San Francisco

1d ago
Staff AI engineer USD 170K-250K

AI Evaluation | AWS | Agent Orchestration | Caching | Distributed Systems

Flexible working hours | Hybrid work culture | Unlimited time off

Senior-level Full Time

San Francisco

1d ago
Data Engineer Mid-Level USD 80K-145K

Analytics Platforms | Data Engineering | Data Visualization | Data integration | High Performance

Health insurance | Hybrid work | Paid time off

Mid-level Full Time

Winchester, United States

2d ago
Software Engineer, AI Specialist - Wearables AI (Technical Leadership) USD 147K-208K

C++ | CI/CD | Cloud Computing | Computer Vision | Deep learning

Senior-level Full Time

Burlingame, CA

2d ago
Senior Software Engineering USD 119K-261K

C# | C++ | CUDA | Deep learning

Senior-level Full Time

United States, California, Mountain View; United …

2d ago
On-Device ML Compiler Engineer, Model Compilation, Graphics, Games and Machine Learning USD 175K-312K

C++ | CPU | CUDA | GPU | MLIR

Senior-level Full Time

Cupertino

2d ago
Senior AI Software Engineer USD 92K-209K

API Design | Access Management | Agent systems | Agentic AI | C++

401k match | Commuter benefits | Dental insurance | Flexible spending accounts | Health insurance

Senior-level Full Time

Nashville, TN, United States

2d ago
HPC Scientific Software Engineer (IT@JH Research Computing) USD 85K-149K

Ansible | C# | C++ | CMake | CUDA

Mentorship | Remote work | Training workshops

Mid-level Full Time

Baltimore, MD, United States

3d ago
Algorithm & Analysis Engineer - EOSL - Open Rank (Junior-Mid Level) USD 76K-90K

ADA | AFSIM | C# | C++ | Data Fusion

Entry-level Full Time

Atlanta, GA

3d ago
Campus AI Research Engineer (Full-Time) USD 300K

C++ | CUDA | GPU Programming | HPC | High Throughput

CPT/OPT eligible | Visa sponsorship

Entry-level Full Time

Chicago; New York

3d ago
Campus AI Research Engineer (Intern) USD 300K

C# | C++ | CUDA | Data Mining | Deep learning

CPT OPT eligibility | International student support | Work visa sponsorship

Entry-level Internship

Chicago; New York

3d ago
ML Platform Engineer USD 100K-150K

API Gateway | Automated rollback | Autoscaling | C++ | Caching

Career growth | Direct W2 employment | H1B transfer support | Long-term employment | Remote work

Senior-level Full Time

Jersey City, NJ

3d ago
Division Head - Computational Fluid Dynamics USD 203K-336K

Aerodynamics | Cavitation | Computational Fluid Dynamics | Computing Architectures | Fluid Dynamics

Hybrid work option | On-site work option | Periodic travel

Executive-level Full Time

Penn State University Park, United States

3d ago
Associate, Quantitative Developer – Prime Services USD 150K-200K

.NET | Azure | C# | Concurrent programming | Data Modeling

Career development | Health and well-being benefits | Paid time off | Retirement savings program

Senior-level Full Time

1 Vanderbilt Avenue TDS, New York, …

3d ago
Division Head - Computational Fluid Dynamics (FACULTY) USD 109K-219K

Computational Fluid Dynamics | Computing systems | Fluid Dynamics | High Performance | High-Performance Computing

Hybrid work option | Medical, dental, and vision coverage | Periodic travel | Retirement plans | Tuition discount

Executive-level Full Time

Penn State University Park, United States

3d ago
Senior AI Engineer USD 148K-189K

Benchmarking | C# | C++ | CI/CD | Distillation

Senior-level Full Time

Waterford, IE

3d ago
Machine Learning Engineer - Model Inference USD 166K-230K

C++ | Cloud infrastructure | Containers | Continuous batching | Distributed Systems

Mid-level Full Time

Cupertino

3d ago
Senior AI Infrastructure Engineer - Model Training USD 190K-260K

BF16 | C++ | CUDA | Data parallelism | DeepSpeed

401k | Dental and vision plans | Dependent care FSA | Dog-friendly office | FSA

Senior-level Full Time

Mountain View, CA

3d ago
Staff Machine Learning Engineer USD 151K-309K

A/B Testing | Canary Deployment | Capacity Planning

Hybrid work

Senior-level Full Time

Remote, US

3d ago
Data Engineer Senior USD 115K-160K

Analytics Platforms | Data Engineering | Data Visualization | Data integration | Desktop applications

Hybrid work | Top Secret/SCI clearance required

Senior-level Full Time

Winchester, United States

4d ago
Data and Analytics Engineer USD 80K-128K

Algorithm Evaluation | Data Visualization | MATLAB | Machine Learning | Optics

Travel opportunities

Entry-level Full Time

Huntsville, AL, United States

4d ago
Backend Engineer - Inference Services USD 150K-220K

C# | C++ | Convolutional Neural Network | Git | High Performance

Mid-level Full Time

USA | Remote

4d ago
ML Platform Engineer USD 100K-150K

API Gateway | Abuse detection | Access Management | Automated rollback | Autoscaling

Senior-level Full Time

United States - Remote

4d ago
