Infrastructure Engineer (Linux & Networking)

USA, United States

NexGen Cloud

Discover NexGen Cloud's innovative AI cloud solutions designed for seamless scalability, high performance and robust security. Lead AI innovation with NexGen Cloud.

NexGen Cloud is a rapidly growing IaaS company focused on providing innovative cloud solutions and infrastructure services. Our GPU cloud infrastructure solutions accelerate development in industries such as Artificial Intelligence & Machine Learning, VFX & Rendering, Data Science & IoT, and Computer-Aided Engineering & MDO.

We are dedicated to helping our clients navigate the complexities of the digital world and achieve success through cutting-edge, scalable, secure and affordable solutions.

At the company's heart stands a group of very talented, experienced, and motivated individuals who want to make a positive change and a lasting impact on the tech world.

Position Summary:

As an Infrastructure Engineer, you will help design, deploy, and operate the systems that power our global GPU cloud. You’ll bring deep expertise in Linux, networking, and automation to ensure our fleet is secure, scalable, and fast. This is a hands-on role ideal for engineers who love building and optimizing performance-critical infrastructure and who want to have a major impact at a rapidly scaling company. 

Key Responsibilities: 

Core Infrastructure

  • Provision and manage Linux systems (Ubuntu-based) supporting GPU servers and backend services.
  • Maintain system availability, conduct root cause analysis, and implement failover strategies.

Networking

  • Design and manage high-speed, low-latency network infrastructure across data center environments.
  • Configure firewalls, BGP, VLANs, VXLANs, and VPNs to support secure and scalable multi-tenant networking.
  • Resolve network-related incidents impacting workloads or customer environments.

Automation & Scaling

  • Build infrastructure-as-code with tools like Ansible for repeatable, scalable deployments.
  • Automate GPU driver installs, system bootstrapping, and fleet-wide patching.
  • Develop CI/CD workflows for infrastructure updates and configuration validation.

Cloud & Virtualization

  • Support containerized workloads via Kubernetes or custom orchestration systems.
  • Work with both bare-metal and virtualized GPU platforms using KVM or OpenStack-based environments.
  • Integrate with public cloud APIs or hybrid infrastructure as needed.

Monitoring & Security

  • Deploy and manage monitoring stacks (e.g., Prometheus, Grafana, ELK) to track system health and capacity.
  • Implement hardening practices, access controls, and audit trails for infrastructure components.
  • Support incident response and security investigations related to infrastructure.

Qualifications and Skills:

  • 3–5 years of experience in Linux systems administration or infrastructure engineering.
  • Strong networking knowledge: routing, switching, TCP/IP, DNS, DHCP, VLANs, BGP, VPN.
  • Proficiency with scripting languages (Bash, Python) and automation tools (Ansible, Terraform).
  • Hands-on experience with virtualization, containerization, and systems troubleshooting.
  • Familiarity with monitoring and logging systems in a production environment.
  • Strong focus on maintaining good documentation.

Good to have:

  • Prior experience at a GPU cloud provider, HPC environment, or similar high-performance setting.
  • Exposure to NVIDIA GPU technologies and tooling (e.g., NVIDIA GPU Operator, CUDA Toolkit, DCGM).
  • Experience with software-defined networking (SDN, OVS/OVN) and overlay networks (VXLAN, Calico).
  • Experience with networking products from Arista, Cisco, MikroTik, and NVIDIA/Mellanox.
  • Familiarity with OpenStack private cloud environments.
  • Familiarity with CMDB tools like NetBox.
  • Experience working with Internet registries (RIPE, ARIN).
  • Knowledge of server provisioning via PXE/iPXE and out-of-band management tools (IPMI, Redfish).

What We Offer:

  • Competitive salary
  • Opportunity to work with a diverse team of talented professionals who are passionate about technology and innovation.
  • A collaborative and supportive work environment that encourages professional growth and development.
  • Exposure to cutting-edge technologies and the opportunity to make a significant impact on the future of cloud computing.

 

We encourage applications from candidates of all backgrounds and experiences. Our commitment to diversity and inclusion drives our success as a company and reflects our dedication to fostering a diverse and innovative workforce.

Join the NexGen Cloud team, where innovation, collaboration, and growth are at the heart of everything we do. If you are a passionate, talented, and motivated individual looking to make a difference, apply now!

Tags: Ansible APIs CI/CD CUDA ELK Engineering GPU Grafana HPC Kubernetes Linux Machine Learning OpenStack Python Security Terraform

Perks/benefits: Career development Competitive pay Startup environment

Region: North America
Country: United States
