Senior Platform Engineer – Core Infrastructure

Zürich, Switzerland

Kaiko

Kaiko’s data framework for cancer research provides hospitals and research institutes with data insights, AI support for medical doctors and the latest developments in machine-learning research.

About kaiko

Delivering high-quality cancer care is complex: specialists form a view of each patient's condition by reasoning across different data, such as CT scans, genomic context, treatment history and clinical notes.

Current AIs are powerful within domains but fall short when it comes to reasoning across data or domain areas. kaiko.w, our AI assistant for oncology, aims to equip every clinician with a full understanding of their patients, helping them to reason across data as they assess each case.

We’re building this in close collaboration with the Netherlands Cancer Institute (NKI) and a growing network of hospitals and research centers. We’ve raised significant long-term funding and have nearly doubled our team over the past year. We’re now 80+ people representing 25 nationalities, based across our offices in Zurich and Amsterdam.

About the role

You will be joining our Core Infrastructure Team as a key player in scaling the AI systems that help oncologists make better treatment decisions. This role sits at the heart of our technical stack - ensuring our GPU clusters can train models efficiently, our data pipelines can handle petabytes of medical imaging data, and our hybrid infrastructure scales seamlessly as we expand across European hospitals.

Working at the intersection of healthcare AI and cutting-edge infrastructure, you will manage our back-end services: hybrid fleet management, the components that drive our advancements in AI, core services used by every team at Kaiko, networking systems, and everything in between. Your work directly impacts cancer care by maintaining the platforms that power kaiko.w, our AI assistant used by hospitals across Europe.

This is a platform-first role: you'll spend 70% of your time building and optimizing core infrastructure, including developer-friendly infrastructure tools, and 30% ensuring reliability and performance, with a target of 99.9%+ uptime for our workloads. Systems under your responsibility have company-wide reach, so expect to interact with teams across the company.

Day to day, you will cooperate with the wider Infrastructure Team and the rest of the Platform Organization, mentor junior engineers and help them grow, and act as a force multiplier. You will be based in either the Netherlands or Switzerland, with the expectation of spending at least 50% of your time at the office.

Some areas of responsibility

  • You will provision and optimize containerized compute environments for ML workloads using container orchestration systems (AKS, K8s, RKE2, etc.), bare-metal GPU servers and cloud compute resources across on-premises and hybrid environments
  • You will create, influence and review ongoing network topologies, standards and processes for systems that support healthcare compliance requirements (network isolation, encrypted data flows) through software-defined networking solutions for hybrid cloud and on-premises integration
  • You will develop Infrastructure-as-Code (Terraform) and configuration-as-code (Ansible) modules for reproducible infrastructure deployment, and create self-service tools and APIs that abstract infrastructure complexity for our engineering teams
  • You will optimize high-speed interconnects for multi-GPU training clusters (InfiniBand, RoCE, etc.), which rely on technologies such as RDMA, SR-IOV and GPUDirect
  • You will help our data teams to improve petabyte-scale storage systems for medical imaging data (DICOM, pathology slides, etc.) by supporting the integration and management of distributed AI storage systems (WEKA, VAST, Pure, etc.)

About you

Minimum requirements:

  • 3+ years of hands-on experience with medium-sized production infrastructures (50+ users), with proficiency in one of these areas and solid exposure to at least one other:
      • Network: hybrid network topologies and enterprise switching/routing, SDN technologies, high-performance computing interconnects;
      • Storage: database management, enterprise storage systems, data solutions (e.g., VAST), storage protocols;
      • Compute: different Kubernetes distributions and back-end components (e.g., CNIs, operators, CSI drivers, etc.), with a focus on cluster administration
  • Infrastructure-as-Code experience (Terraform or similar) and Linux systems administration
  • Automation scripting (Python, Bash, or Go) and understanding of CI/CD/GitOps workflows

Nice to have:

  • Exposure to GPU/HPC infrastructure and to database performance optimization (e.g., PostgreSQL, MySQL, etc.)
  • Exposure to GitOps workflows, advanced observability tools (Prometheus, Grafana, etc.), and API design for platform services
  • Experience mentoring engineers and partnering with data teams to enable centralized platforms and remove infrastructure bottlenecks
  • Understanding of medical data compliance requirements (HIPAA, GDPR) and AI/ML infrastructure patterns (distributed training, model serving)

We are excited to gather a broad range of perspectives in our team, as we believe it will help us build better products to support a broader set of people. If you’re excited about us but don’t fit every single qualification, we still encourage you to apply. We’ve had incredible team members join us who didn’t check every box!

Why kaiko

At kaiko, we believe the best ideas come from collaboration, ownership and ambition. We’ve built a team of international experts, and your work has a direct impact. Here’s what we value:

  • Ownership: You’ll have the liberty to set your own goals (in alignment with the organizational needs), make critical decisions, and see the direct impact of your work.
  • Collaboration: You’ll have to approach disagreement with curiosity, build on common ground and create solutions together.
  • Ambition: You’ll be surrounded by people who set high standards for themselves and others, who see obstacles as opportunities, and who are relentless in their work to create better outcomes for patients.


In addition, we offer:

  • An attractive and competitive salary, a good pension plan and 25 vacation days per year.
  • Great offsites and team events to strengthen the team and celebrate successes together.
  • A EUR 1000 learning and development budget to help you grow.
  • Autonomy to do your work the way that works best for you, whether you have a kid or prefer early mornings. We would still like to see you around for a few in-person collaborative touchpoints.
  • An annual commuting subsidy.

Category: Engineering Jobs

Tags: Ansible APIs CI/CD Data pipelines DICOM Engineering GPU Grafana HPC InfiniBand Kubernetes Linux Machine Learning ML infrastructure MySQL Pipelines PostgreSQL Python Research Terraform Weka

Perks/benefits: Career development Competitive pay Team events

Region: Europe
Country: Switzerland
