Senior DGX AI Cloud Performance Analysis Tools Engineer
US, CA, Santa Clara, United States
NVIDIA
NVIDIA is the inventor of the graphics processing unit, and its advances drive progress in artificial intelligence and high-performance computing. Joining NVIDIA's DGX Cloud AI Efficiency Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on optimizing efficiency and resiliency of AI workloads, as well as developing scalable AI and Data infrastructure tools and services. Our objective is to deliver a stable, scalable environment for AI researchers, providing them with the necessary resources and scale to foster innovation. We are seeking excellent Software Engineers to design and develop tools for AI application performance analysis. Your work will enable AI researchers to work efficiently with a wide variety of DGXC cloud AI systems as they seek out opportunities for performance optimization and continuously deliver high quality AI products. Join our technically diverse team of AI infrastructure experts to unlock unprecedented AI performance in every domain.
What you'll be doing:
Develop AI performance tools for large-scale AI systems, providing real-time insight into application performance and system bottlenecks
Conduct in-depth hardware-software performance studies
Define performance and efficiency evaluation methodologies
Automate performance data analysis and visualization to convert profiling data into actionable optimizations
Support deep learning software engineers and GPU architects in their performance analysis efforts
Work with various teams at NVIDIA to incorporate and influence the latest technologies for GPU performance analysis
What we need to see:
8+ years of experience in software infrastructure and tools
BS or higher degree in computer science or similar (or equivalent experience)
Adept programming skills in multiple languages including C++ and Python
Solid foundation in operating systems and computer architecture
Outstanding ability to understand users, prioritize among many contending requests, and build consensus
Passion for “it just works” automation, eliminating repetitive tasks, and enabling team members
Ways to stand out from the crowd:
Experience working with large-scale AI clusters
Experience with CUDA and GPU computing systems
Hands-on experience with deep learning frameworks (TensorFlow, PyTorch, JAX/XLA etc.)
Deep understanding of the software performance analysis and optimization process
NVIDIA leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for exceptional people like you to help us accelerate the next wave of artificial intelligence.
The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Tags: Architecture, Computer Science, CUDA, Data analysis, Deep Learning, GPU, JAX, ML infrastructure, Python, PyTorch, Research, TensorFlow
Perks/benefits: Career development Equity / stock options