Machine Learning Engineer, Inference & Serving (Speech LLM) - San Francisco
Tasks
- Apply model compression and quantization
- Build inference engines for large language models and speech models
- Collaborate with training and backend teams
- Deploy GPU-accelerated inference pipelines
- Implement continuous batching
- Integrate real-time audio streaming
- Manage KV cache and stateful connections
- Optimize latency and throughput
- Support distributed multi-GPU inference with autoscaling
Perks/Benefits
- 401k matching
- Annual offsites
- Dental coverage
- Employer-paid training
- Healthcare benefits
- Hybrid work
- New parent leave
- Paid holidays
- Unlimited PTO
- Vision coverage
Skills/Tech-stack
AWQ | Audio codecs | Audio streaming | Autoscaling | Chunked prefill | Continuous batching | Distributed Systems | FP8 | GPTQ | GPU Architecture | INT8 | Inference | Inference Server | KV cache | Kubernetes | Language Models | Large Language Models | Latency optimization | Lookahead Decoding | Machine Learning | NVIDIA CUDA | NVIDIA Triton Inference Server | Neural audio codecs | PagedAttention | Post-training Quantization | SGLang | Speculative decoding | Speech Processing | Tensor Parallelism | TensorRT | TensorRT-LLM | Throughput Optimization | Time To First Audio | Time To First Token | vLLM | WebRTC | WebSockets
Education
N/A
Roles
AI | AI Engineer | Engineer | Machine Learning Engineer