Software Engineer, Inference Platform
Tasks
- Build and operate KV cache and scheduling infrastructure
- Contribute to inference platform architecture and roadmap
- Drive improvements in throughput, TTFT, and cost per token
- Implement and validate disaggregated prefill and decode pipelines
- Own inference deployments end-to-end
- Participate in on-call rotation to maintain system reliability
- Partner with customers to optimize deployment configurations
- Profile and resolve bottlenecks across compute, memory, and communication
Skills/Tech-stack
CUDA | Distributed Systems | Expert parallelism | GPU Compute | GPU Optimization | GPU compute parallelism | GPU memory | GPU memory hierarchies | Go | Inference Engines | JAX | Kubernetes | LLM serving | LLM serving frameworks | Memory hierarchies | Model Deployment | PyTorch | Python | Quantization tooling | Serving frameworks | Speculative decoding | Tensor and expert parallelism | torch.compile | Triton