Senior Deep Learning Software Engineer, LLM Performance
Santa Clara, CA, United States
USD 184K-356K | Senior-level | Full Time
Tasks
- Analyze LLM inference latency and throughput
- Collaborate with teams on performance modeling and kernel development
- Contribute to TensorRT and Triton code
- Develop and contribute to LLM inference benchmarking frameworks
- Implement GPU accelerated deep learning inference pipelines
- Implement LLM inference serving and deployment
- Optimize LLM inference performance
- Scale LLM performance across NVIDIA accelerators
- Tune LLM, VLM, and GenAI models
Skills/Tech-stack
C# | C++ | CUDA | JAX | LLM | OpenCL | PyTorch | Python | TensorFlow | TensorRT | Triton | Triton Inference Server | VLM