LLM Inference Frameworks and Optimization Engineer
San Francisco, Singapore, Amsterdam
USD 160K-230K Mid-level Full Time
Tasks
- Analyze inference performance bottlenecks
- Apply CUDA graph optimizations
- Design distributed inference engines
- Develop model execution plans
- Implement distributed inference strategies
- Implement speculative decoding
- Optimize GPU, TPU, and accelerator performance
- Optimize TensorRT and TRT-LLM graphs
- Optimize end-to-end model serving pipelines
- Optimize inference latency and throughput
- Perform software-hardware co-design
- Use torch.compile for model execution
Skills/Tech-stack
C++ | CUDA | CUDA graph | Cluster scheduling | Compiler | Efficient kernels | GPU Cluster | GPU Kernel | GPU Programming | GPU cluster scheduling | GPU kernel optimization | KV cache | Kernel optimization | Mixture of Experts | Model Quantization | Pipeline parallelism | PyTorch | Python | Speculative decoding | TRT-LLM | Tensor Parallelism | TensorRT | Torch compile | Transformer | Triton | Workload Scheduling
Education
N/A