Product Manager - Inference

US, CA, Santa Clara, United States

Full Time Mid-level / Intermediate USD 144K - 258K

NVIDIA

NVIDIA is the inventor of the graphics processing unit (GPU), whose advances drive progress in AI and high-performance computing.

Posted 3 weeks ago

Inference is the fastest-growing and most competitive area in Generative AI today. It is where AI models impact our daily lives, and where every bit of accuracy and performance matters for quality, safety, and cost. Inference is also constantly evolving, with new acceleration algorithms, use cases, and deployment techniques. As a Product Manager for AI Platform Inference, you will be responsible for building the tools, SDKs, and libraries that enable developers' Inference deployments to thrive on NVIDIA GPUs.

As NVIDIA Product Managers, our goal is to enable developers to be successful on the NVIDIA Platform and to push the boundaries of what is possible in AI deployments! As Product Managers, we are the champions inside NVIDIA for developers looking to accelerate their deployments on GPUs. We work directly with developers inside and outside of the company to identify key improvements, create roadmaps, and stay current on the inference landscape. We also work with NVIDIA leaders to define a clear product strategy, and with marketing teams to build go-to-market plans. The Product Management organization at NVIDIA is a small, strong, and impactful group. We focus on enabling deep learning across all GPU use cases and providing great solutions for developers. We are seeking a rare blend of product skills, technical depth, and passion to make NVIDIA great for developers. Does that sound familiar? If so, we would love to hear from you!

What you'll be doing:

Create products to help developers build better Inference deployments
Develop product strategy, roadmaps, and go-to-market plans
Collaborate with internal and external developers to build product-based roadmaps for model optimization software
Work with leadership to align with and drive company strategy

What we need to see:

Experience with Inference deployment and optimization software (e.g., vLLM, SGLang, FlashInfer, TensorRT-LLM, Triton, Dynamo, TorchAO)
Demonstrable knowledge of GenAI or machine learning concepts, particularly around performance optimization, and software development and delivery
BS or MS degree in Computer Science, Computer Engineering, or a similar field (or equivalent experience)
5+ years of technical product management or similar experience at a technology company
Strong communication and interpersonal skills

Ways to stand out from the crowd:

Experience leading optimization products for Inference
Experience working on open source and GitHub-first developer products with deep customer interactions
Knowledge of GPU architecture, HW/SW co-design, and performance profiling

The base salary range is 144,000 USD - 258,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

