Kubernetes Engineer

Dallas, United States

G-Research

We use machine learning, big data & the most advanced tech to predict movements in financial markets.

Do you want to tackle the biggest questions in finance with near infinite compute power at your fingertips? 

G-Research is a leading quantitative research and technology firm, with offices in London and Dallas.

We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded. 

This is a hybrid role based in our new Dallas infrastructure hub where we work on the latest technologies in a cutting-edge environment.

The role

We are seeking a highly skilled Senior Kubernetes Engineer to join our Platform Engineering function in Dallas.

In this role, you will design, implement, and optimise GPU-accelerated container platforms at scale, enabling high-performance workloads (AI/ML, HPC, LLM training) across hybrid or on-prem environments.

You will have deep expertise with both NVIDIA and Kubernetes ecosystems, including GPU scheduling, device plugins and custom operators.

Key responsibilities of the role include:

  • Architecting and operating Kubernetes clusters optimised for GPU workloads, leveraging NVIDIA GPU Operator, Network Operator and DCGM

  • Developing, deploying and maintaining custom Kubernetes operators and controllers to automate infrastructure services

  • Integrating NVIDIA device plugins, Multi-Instance GPU (MIG) and GPU sharing features into the scheduling layer

  • Optimising GPU utilisation and job placement through scheduler extensions, such as kube-scheduler plugins, Slurm and Volcano

  • Collaborating with HPC, ML and DevOps teams to ensure multi-tenant, high-throughput cluster performance

  • Driving observability and telemetry integrations using Prometheus, Grafana, DCGM Exporter and OpenTelemetry

  • Implementing secure multi-user and multi-namespace GPU isolation, with RBAC and policy enforcement, such as OPA or Gatekeeper

  • Maintaining CI/CD pipelines for Kubernetes infrastructure using GitOps, ArgoCD and FluxCD

  • Contributing to infrastructure-as-code, using Terraform, Helm, and Kustomize

  • Participating in performance tuning, incident response and production readiness reviews

Who are we looking for?

The ideal candidate will have the following skills and experience:

  • Extensive experience with Kubernetes in production-grade environments and with the NVIDIA stack on Kubernetes, including GPU Operator, device plugin, NVML, MIG and DCGM

  • Proficiency in Go or Python for operator development and Kubernetes controller logic

  • Deep understanding of Kubernetes internals, including CRDs, RBAC, custom controllers and scheduler extensions

  • Experience with GPU-intensive workloads, for example LLMs, training pipelines and scientific computing

  • Hands-on experience with Helm, Kustomize and GitOps workflows

  • Familiarity with CNI plugins, especially NVIDIA CNI and Multus

  • Experience with monitoring GPU metrics and cluster health using Prometheus and DCGM Exporter

The following is beneficial:

  • Knowledge of container runtimes, such as CRI-O and containerd, and the NVIDIA Container Toolkit

  • Contributions to open-source projects in the Kubernetes or NVIDIA ecosystem

  • Experience working with Cilium or other CNI plugins

Why should you apply?

  • Market-leading compensation plus annual discretionary bonus

  • Lunch provided in the office (via GrubHub)

  • Informal dress code and excellent work/life balance

  • Excellent paid time off allowance of 25 days

  • Sick days, military leave, and family and medical leave

  • Generous 401(k) plan

  • 16 weeks' fully paid parental leave

  • Medical and prescription, dental, and vision insurance

  • Life and Accidental Death & Dismemberment (AD&D) insurance

  • Employee Assistance and Wellness programs

  • Generous relocation allowance and support

  • Great selection of office snacks, and hot and cold drinks

  • On-site gym and car parking

This role is employed through our US affiliate.

G-Research is committed to cultivating and preserving an inclusive work environment. We are an ideas-driven business and we place great value on diversity of experience and opinions.

We want to ensure that applicants receive a recruitment experience that enables them to perform at their best. If you have a disability or special need that requires accommodation, please let us know in the relevant section.

Category: Engineering Jobs

Tags: CI/CD DevOps Engineering Finance GPU Grafana Helm HPC Kubernetes LLMs Machine Learning Open Source Pipelines Python Research Terraform

Perks/benefits: Fitness / gym Flex hours Flex vacation Health care Insurance Medical leave Parental leave Relocation support Salary bonus Wellness

Region: North America
Country: United States