Lead Infrastructure Engineer

Bangalore


Full Time | Senior-level / Expert | USD 66K - 123K (est.)

Nirvana Insurance

Save up to 20% and improve fleet safety with Nirvana’s industry-leading Commercial Trucking Insurance and Active Safety Solutions.


Posted 8 hours ago

Who We Are

Nirvana is on a mission to harness the power of data to revolutionize commercial insurance and enable a safer world. We are bringing much-needed innovation into the legacy, trillion-dollar commercial insurance industry. We have developed cutting-edge predictive models that use real-time IoT data from billions of connected devices, allowing us to better understand and price risk. Our AI-driven platform fundamentally changes the way an insurance company operates with personalized risk scoring, faster underwriting, modernized claims, and proactive, data-driven insights to help customers prevent accidents.

We’ve already proven the scale—reaching well over $100 million in premiums and more than doubling year over year. Our data moat is growing exponentially with more than 20 billion miles of telematics data, leading to more predictive models and new insights into how we can better understand and reduce risk. Altogether, our loss ratio, efficiency, and customer experience are redefining what can be done in the industry.

With $170+ million raised, including an industry-leading Series C round in January 2025, we’re only accelerating our growth, with strong support from top-tier VCs including Lightspeed, General Catalyst, and Valor. Nirvana’s leadership team has previously helped scale multi-billion-dollar companies from scratch, including Samsara, Rubrik, and Flexport, and includes industry veterans from Hiscox, The Hartford, and RLI.

About the Role

Your work will power the infrastructure behind every product line and engineering team at Nirvana. By keeping our systems performant, reliable, secure, and cost‑efficient, you will enable faster delivery, smoother developer workflows, and a consistently excellent experience for customers company‑wide.

Set the strategy: Own the infra roadmap, make buy‑vs‑build decisions, and align investments with Nirvana’s product goals.
Lead the team: Grow and mentor a high‑performing group of builders, and foster a culture of ownership and experimentation.
Build critical systems: Design, ship, and operate the cloud‑native foundations that every engineering team relies on.
Champion best practices: Drive security, reliability, and cost efficiency through automation and clear standards.
Own Core Systems: Architect, build, and operate Nirvana’s cloud-native data and compute infrastructure, ensuring scalability, security, and reliability as we continue to integrate and process billions of events per day.
Drive Platform Excellence: Partner closely with product, data, and engineering teams to deliver a robust developer experience, high-availability services, automated deployments, and observability across the stack.
Elevate Team & Culture: Help shape a high-performing, collaborative engineering culture. Mentor engineers and champion best infra practices as we scale.
Build vs Buy decisions: You will take part in critical decisions that shape the future of our infrastructure, including whether to buy a solution or build it in-house.

What You’ll Work On

Platform Reliability & Automation: Build automation for cloud resource provisioning, CI/CD pipelines, end-to-end monitoring, and incident response. Lead efforts to improve reliability, latency, and system self-healing.
Cost Optimization & Observability: Develop strategies and tools to optimize for performance, cost, and resource efficiency. Define SLOs/SLAs, implement metrics & dashboards, and drive root-cause analysis.
Workflow Orchestration: Level up our in‑house Temporal‑style engine with rich features, frictionless devX, and zero‑downtime upgrades.
ML & Compute Foundations: GPU/CPU pools and one‑click model rollouts that let data scientists push to production in hours.
Observability & Reliability: Unified metrics/traces/logs, SLO dashboards, and automated chaos/self‑healing to keep everything fast and stable.
Developer Experience Tooling: Golden‑path templates, GitOps workflows, and bespoke CLI/IDE plugins that turn infrastructure into a superpower.

About You

You are an owner. You take end-to-end responsibility for mission-critical infra and are comfortable making architecture decisions with long-term impact.
You ship. You thrive on delivering reliable infra improvements into production and enabling others to move faster and safer.
You are a craftsman. You take pride in building high-quality, resilient systems and automating toil wherever you see it.
You love simplicity. You favor elegant, maintainable solutions to complex scaling challenges, and push for automation and standardization.
You’re a lifelong learner. You excel at picking up new infra technologies, cloud stacks, and approaches—and relish driving continued technical evolution.
You value clear, open communication—whether through code reviews, runbooks, architecture docs, or incident retrospectives.

Requirements

7+ years building large-scale backend and infrastructure systems (e.g., with Golang, Scala, Java, or C++), including significant time designing, operating, and improving distributed systems and stateful services.
Deep experience with cloud-native infrastructure (AWS, GCP, or Azure) and containerization/orchestration (Docker, Kubernetes, etc.).
Demonstrated ability to decompose complex systems into reusable modules, design APIs/interfaces, and drive adoption of platforms or shared tooling.
Experience with infrastructure-as-code, CI/CD, monitoring, and incident management for production systems at scale.
Passion for developer productivity, automation, and working cross-functionally to deliver robust, scalable platforms.

Nice-to-have

Security & Compliance: Experience implementing robust cloud security practices, automating compliance (SOC 2, etc.), managing identities/roles, and ensuring data privacy across multi-tenant environments.
Cloud‑native security chops—least‑privilege IAM, secrets management (Vault/KMS), container runtime hardening, OPA/Kyverno policy‑as‑code, and automated vulnerability scanning.
Experience building ML infrastructure (GPU scheduling, feature stores, model serving, experiment tracking).
Background in compliance automation (SOC 2, ISO 27001) and incident response/threat modeling.
Familiarity with high‑throughput streaming systems (Kafka, Kinesis) and time‑series databases for telemetry.

What You’ll Get from Us