Product Manager, AI Platform Kernels and Communication Libraries

US, CA, Santa Clara, United States

NVIDIA

NVIDIA is the inventor of the GPU, whose advances drive forward artificial intelligence and high-performance computing.


NVIDIA's AI Software Platforms team seeks a technical product manager to accelerate next-generation inference deployments through innovative libraries, communication runtimes, and kernel optimization frameworks. This role bridges low-level GPU programming with ecosystem-wide developer enablement for products including CUTLASS, cuDNN, NCCL, NVSHMEM, and open-source contributions to Triton/FlashInfer.

As NVIDIA Product Managers, our goal is to enable developers to succeed on the NVIDIA platform and push the boundaries of what is possible with their AI deployments! For inference, we are the champions inside NVIDIA for AI developers looking to accelerate their deployments on GPUs. We work directly with developers inside and outside the company to identify key improvements, create roadmaps, and stay current on the inference landscape. We also work with NVIDIA leaders to define a clear product strategy, and with marketing teams to build go-to-market plans. The Product Management organization at NVIDIA is a small, strong, and impactful group. We focus on enabling deep learning across all GPU use cases and providing extraordinary solutions for developers. We are seeking a rare blend of product skills, technical depth, and drive to make NVIDIA great for developers. Does that sound familiar? If so, we would love to hear from you!

What you'll be doing:

  • Architect developer-focused products that simplify high-performance inference and training deployment across diverse GPU architectures.

  • Define the multi-year strategy for kernel and communication libraries by analyzing performance bottlenecks in emerging AI workloads.

  • Collaborate with CUDA kernel engineers to design intuitive, high-level abstractions for memory and distributed execution.

  • Partner with open-source communities like Triton and FlashInfer to shape and drive ecosystem-wide roadmaps.
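The bottleneck analysis mentioned above often starts with a roofline-style check: comparing a kernel's arithmetic intensity against the machine balance of the target GPU. A minimal sketch, assuming the standard roofline model (all peak numbers below are illustrative placeholders, not specs for any particular NVIDIA part):

```python
def classify_kernel(flops: float, bytes_moved: float,
                    peak_flops_per_s: float, peak_bytes_per_s: float) -> str:
    """Classify a workload as compute- or memory-bound via the roofline model.

    flops / bytes_moved is the kernel's arithmetic intensity (FLOPs per byte);
    peak_flops_per_s / peak_bytes_per_s is the machine balance, i.e. the
    intensity at the ridge point of the roofline.
    """
    intensity = flops / bytes_moved
    machine_balance = peak_flops_per_s / peak_bytes_per_s
    return "compute-bound" if intensity >= machine_balance else "memory-bound"

# Hypothetical GPU: 100 TFLOP/s peak compute, 2 TB/s memory bandwidth
# (machine balance = 50 FLOPs/byte).
# A GEMV-like kernel at ~2 FLOPs/byte lands well below the ridge point:
print(classify_kernel(flops=2.0, bytes_moved=1.0,
                      peak_flops_per_s=1e14, peak_bytes_per_s=2e12))
```

Workloads below the ridge point benefit from bandwidth-oriented optimizations (fusion, quantization); those above it from compute-oriented ones, which is the kind of trade-off that shapes kernel-library roadmaps.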

What we need to see:

  • 5+ years of technical PM experience shipping developer products for GPU acceleration, with expertise in HPC optimization stacks.

  • Expert-level understanding of CUDA execution models and multi-GPU protocols, with a proven track record of translating hardware capabilities into software roadmaps.

  • BS or MS or equivalent experience in Computer Engineering or demonstrated expertise in parallel computing architectures.

  • Strong technical interpersonal skills with experience communicating complex optimizations to developers and researchers.

Ways to stand out from the crowd:

  • PhD or equivalent experience in Computer Engineering or a related technical field.

  • Contributed to performance-critical open-source projects like Triton, FlashAttention, or TVM with measurable adoption impact.

  • Crafted GitHub-first developer tools with >1k stars or similar community engagement metrics.

  • Published research on GPU kernel optimization, collective communication algorithms, or ML model serving architectures.

  • Experience building cost-per-inference models incorporating hardware utilization, energy efficiency, and cluster scaling factors.
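A cost-per-inference model of the kind the last bullet describes can be sketched in a few lines. This is a hypothetical illustration; the parameter names and all numbers are assumptions for the example, not NVIDIA figures:

```python
def cost_per_inference(
    hourly_gpu_cost: float,             # $/hour for one GPU instance
    throughput_per_gpu: float,          # inferences/second at full utilization
    utilization: float,                 # fraction of peak actually achieved (0-1)
    energy_cost_per_hour: float = 0.0,  # $/hour for power, if billed separately
    scaling_efficiency: float = 1.0,    # cluster efficiency vs. linear scaling
) -> float:
    """Dollars per single inference under the stated assumptions."""
    effective_throughput = throughput_per_gpu * utilization * scaling_efficiency
    inferences_per_hour = effective_throughput * 3600
    return (hourly_gpu_cost + energy_cost_per_hour) / inferences_per_hour

# Example with made-up inputs: $2.50/hr instance, 500 inf/s peak,
# 60% utilization, $0.30/hr energy, 90% cluster scaling efficiency.
cost = cost_per_inference(2.50, 500, 0.60,
                          energy_cost_per_hour=0.30,
                          scaling_efficiency=0.90)
```

Even this toy version shows why the bullet pairs utilization and scaling factors with raw hardware cost: halving utilization doubles the per-inference cost just as surely as doubling the instance price.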

The base salary range is 144,000 USD - 258,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

