Senior Systems Engineer - AV Infrastructure Cloud Platform
Santa Clara, CA, United States
NVIDIA
NVIDIA is the inventor of the graphics processing unit (GPU), and its advances drive progress in artificial intelligence and high-performance computing.
We are seeking a motivated Senior Systems Engineer to join our cloud platform team in building and scaling the cloud-native infrastructure that enables developers to run hundreds of services. You'll play a critical role in driving infrastructure innovation across our organization.
What you'll be doing:
Apply strong programming skills to develop cloud platform tooling and automation that enhance developer productivity and operational efficiency across our cloud infrastructure.
Lead the development of infrastructure automation frameworks and CI/CD pipelines, ensuring robust, scalable, and secure deployment of cloud-native applications.
Engage directly with engineering users to understand their needs and improve their experience by recommending robust, scalable cloud solutions.
Contribute to the design and architecture of the cloud infrastructure and networking components to meet the evolving needs of our internal developer platform.
Play a pivotal role in improving the reliability and performance of cloud infrastructure and services.
What we need to see:
BS/MS in Computer Science or Engineering (or equivalent experience) or BS/MS in a STEM-related field
8+ years of professional experience in a related field
4+ years of experience in Kubernetes-based platform tooling development
4+ years of experience in cloud infrastructure automation and management
Strong programming fundamentals with expertise in Go and Python
Ability to shift seamlessly between Linux system environments and Python programming
Deep AWS expertise across core services (VPC, IAM, EC2, S3, RDS, CloudFront, EKS) with proven experience in designing and managing scalable cloud infrastructure
Comprehensive understanding of Kubernetes and Cloud Native Architecture, with hands-on experience managing large-scale production clusters
Good understanding of SRE best practices, alerting, and observability
Advanced Kubernetes workload management expertise, including traffic management, deployment strategies, observability, and security implementation
Strong Infrastructure as Code (IaC) fundamentals with experience in developing infrastructure CI/CD pipelines, automation frameworks, and IaC libraries
Ways to stand out from the crowd:
You'll be a fun and motivated teammate who enjoys a challenge and celebrates success.
Working experience with Agentic AI tools for computing infrastructure management.
Motivated self-starter with an equal balance of strong problem-solving skills and customer-facing communication skills
Excellent written and verbal interpersonal skills.
Contributions to open-source projects in the cloud-native ecosystem, particularly in areas of Kubernetes tooling, infrastructure automation, or cloud-native applications
Previous experience building sophisticated tooling and SRE automation on large GPU/CPU clusters.
Our 20-year expertise in visual computing includes GPU invention for graphics in diverse fields.
Today, we stand at the beginning of the new AI computing era, ignited by a new computing model, GPU deep learning. SRE's culture of diversity, intellectual curiosity, problem-solving, and openness is important to our success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to build an environment that provides the support and mentorship needed to learn and grow.
The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Tags: Architecture, AWS, CI/CD, Computer Science, Deep Learning, EC2, Engineering, GPU, Kubernetes, Linux, Open Source, Pipelines, Python, Security, STEM
Perks/benefits: Career development, Equity / stock options