Senior Infrastructure Engineer (OpenStack)
London, United Kingdom
NexGen Cloud
NexGen Cloud is a rapidly growing IaaS company focused on providing innovative cloud solutions and infrastructure services. Our GPU cloud infrastructure solutions accelerate development in industries such as Artificial Intelligence & Machine Learning, VFX & Rendering, Data Science & IoT, and Computer Aided Engineering & MDO.
We are dedicated to helping our clients navigate the complexities of the digital world and achieve success through cutting-edge, scalable, secure and affordable solutions.
At the company's heart stands a group of very talented, experienced, and motivated individuals who want to make a positive change and a lasting impact on the tech world.
Position Summary:
We’re looking for a Senior Infrastructure Engineer with deep OpenStack and strong Kubernetes expertise to join our Infrastructure Engineering team. You’ll play a key role in shaping and scaling our GPUaaS offering, combining the flexibility of OpenStack with the automation and developer-centric capabilities of Kubernetes.
In this role, you'll design GPU-optimized Kubernetes clusters, build multi-tenant GPU infrastructure, and contribute to automation, observability, and CI/CD tooling across the platform.
Key Responsibilities:
- Cloud & Container Platform Design
Architect and deploy OpenStack and Kubernetes clusters designed for GPU scheduling, high performance, and multi-tenant workloads.
- Infrastructure Automation
Automate deployment pipelines for cloud infrastructure using Terraform, Ansible, Helm, and Kubernetes Operators.
- GPU Workload Enablement
Build and manage GPU-ready container runtimes, NVIDIA device plugins, and Kubernetes-native GPU provisioning frameworks.
- Cluster Operations & Observability
Ensure high availability and performance of OpenStack and Kubernetes clusters using tools such as Prometheus, Grafana, Loki, and Thanos.
- Security, Policy & Governance
Implement secure namespace isolation, RBAC, and network policies across OpenStack and Kubernetes layers.
- Collaboration & Mentorship
Work cross-functionally with DevOps, AI, Support, and Product teams to align infrastructure services with platform goals. Provide guidance on Kubernetes best practices.
Qualifications and Skills:
- 5+ years of experience with OpenStack in production environments.
- 3+ years of experience managing production-grade Kubernetes clusters, including bare-metal or private cloud environments.
- Strong hands-on expertise with:
  - Kubernetes operators, Helm, and custom resource definitions (CRDs)
  - GPU orchestration in Kubernetes using NVIDIA tools
  - Multi-cluster or federated Kubernetes
- Proficiency in Linux, Ceph, networking (Calico/Cilium), and infrastructure scripting (Python, Bash).
- Strong knowledge of cloud-native security, policy frameworks, and service meshes.
- Experience with CI/CD pipelines, GitOps, and infrastructure-as-code tooling (Terraform, Ansible, ArgoCD).
Good to have:
- Experience integrating Kubernetes with OpenStack.
- Prior contributions to Kubernetes SIGs or CNCF projects.
- Knowledge of GPU metering, billing, and quota enforcement.
- Familiarity with HPC environments, InfiniBand/RoCEv2 networking, or Slurm integration.
What We Offer:
- Competitive salary
- Opportunity to work with a diverse team of talented professionals who are passionate about technology and innovation.
- A collaborative and supportive work environment that encourages professional growth and development.
- Exposure to cutting-edge technologies and the opportunity to make a significant impact on the future of cloud computing.
We encourage applications from candidates of all backgrounds and experiences. Our commitment to diversity and inclusion drives our success as a company and reflects our dedication to fostering a diverse and innovative workforce.
Join the NexGen Cloud team, where innovation, collaboration, and growth are at the heart of everything we do. If you are a passionate, talented, and motivated individual looking to make a difference, apply now!
Tags: Ansible, CI/CD, DevOps, Engineering, GPU, Grafana, Helm, HPC, InfiniBand, Kubernetes, Linux, Machine Learning, OpenStack, Pipelines, Python, Security, Terraform
Perks/benefits: Career development, Competitive pay, Startup environment