Product Manager - AI Inference & Model Serving
USD 160K-275K (estimate) | Mid-level | Full Time
Tasks
- Create PoC playbooks and sizing guides
- Define lifecycle for inference services
- Define performance outcome metrics and improvement plans
- Drive go-to-market pricing, packaging, and reference architectures
- Lead technical discovery with platform engineering teams
- Manage observability and reliability requirements
- Own product strategy and roadmap for AI inference and model serving
- Partner on system design trade-offs for runtime, GPU scheduling, and serving topology
- Translate findings into prioritized requirements and architecture direction
Skills/Tech-stack
AI Inference | Autoscaling | Cache Management | Cold Start | Cold Start Optimization | Continuous batching | Dedicated Endpoints | Disaggregated serving | DynamoDB | GPU scheduling | Inference Server | KV cache | KV-cache management | Model Serving | Multi-model serving | Multi-model | Network Optimization | Observability | Performance Engineering | Prefill/Decode | Prefill/Decode Optimization | Reliability Engineering | Routing | SGLang | Serverless | Storage Optimization | TensorRT-LLM | Triton Inference | Triton Inference Server | vLLM | Workload placement
Education
N/A