Site Reliability Engineer - Data Infrastructure, AD/ADAS

London

Woven by Toyota

Woven by Toyota will help Toyota to develop next-generation cars and to realize a mobility society in which everyone can move freely, happily and safely.

Woven by Toyota is enabling Toyota’s once-in-a-century transformation into a mobility company. Inspired by a legacy of innovating for the benefit of others, our mission is to challenge the current state of mobility through human-centric innovation — expanding what “mobility” means and how it serves society.
Our work centers on four pillars: AD/ADAS, our autonomous driving and advanced driver assist technologies; Arene, our software development platform for software-defined vehicles; Woven City, a test course for mobility; and Cloud & AI, the digital infrastructure powering our collaborative foundation. Business-critical functions empower these teams to execute, and together, we’re working toward one bold goal: a world with zero accidents and enhanced well-being for all.

TEAM

Our data platform team is working on accelerating autonomous driving by providing access to petabytes of data collected by our fleet of autonomous and non-autonomous vehicles. Efficient, fast and cost-effective access to data at large scale is key to tackling the hardest problems in AD/ADAS, from developing the Machine Learning (ML) models for perception and prediction of human driving patterns, to increasing the sophistication of our validation and simulation by identifying rare and interesting real-world driving situations. The data ecosystem developed by the Data Infrastructure team is a key building block for developing and testing modern AD/ADAS products that will impact millions of customers.
Our ML and Data pipelines are built on top of the open-source Flyte orchestration framework and are deployed to AWS. Pipeline code is written in Python. We leverage AWS S3, GCP BigQuery and Elasticsearch for data storage and search. We schedule our workloads on AWS EKS. Our infrastructure is spread across multiple regions and multiple cloud providers. We believe strongly in automation and testing to ensure delivery of robust and correct systems. We are a distributed team, working in Japan, the UK and the US.

WHO ARE WE LOOKING FOR?

The Cloud and Data Infrastructure team is looking for engineers who are passionate about enabling the next generation of automotive software development. The right candidate will have excellent communication skills, solid coding skills, expertise in building scalable, reliable, highly available and fault-tolerant systems, and broad knowledge of software engineering and site reliability engineering in areas such as Large-Scale Data and Compute Infrastructure, Stream Processing, Kubernetes, High-Performance Networking, Observability and Infrastructure Automation.

RESPONSIBILITIES

  • Set the technology strategy for our cloud infrastructure, factoring in the AD/ADAS cloud development needs and our cloud cost optimisation targets.
  • Design, build, maintain, optimize and support large-scale, multi-region, multi-cloud compute and storage infrastructure powering our data platform and mission-critical services.
  • Work with fellow Data Infrastructure engineers and Site Reliability engineers to ensure our systems are scalable, reliable, fault-tolerant, highly available, highly performant, and observable.
  • Manage incidents, triage product or system issues, and debug, track and resolve them by analyzing their root cause and their impact on users and operations.
  • Work closely with other Data Infrastructure engineers, Site Reliability engineers, ML Platform engineers, Computer Vision and ML engineers on high-impact projects to create innovative solutions to problems in the self-drive space.
  • Mentor junior engineers in their day-to-day work and drive best engineering practices across the organization.
  • Contribute to the long-term strategy for several of our systems and products.

MINIMUM QUALIFICATIONS

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • Experience as a professional software engineer in one or more programming languages (e.g., Python, Go, Java, C, C++).
  • Experience as a Site Reliability Engineer, working with Terraform, Docker, cloud-native technologies, networking and Kubernetes in production.
  • Experience designing, deploying, monitoring and maintaining large-scale, fault-tolerant multi-region and/or multi-cloud distributed systems.
  • Ability to debug & optimize code, to troubleshoot distributed systems and to automate routine tasks.

NICE TO HAVES

  • Master’s degree in Computer Science.
  • Experience working as a Software Engineer on data-intensive applications, data platforms, data pipelines, workflow orchestration, batch processing, and/or distributed databases.
  • Previous experience in monitoring, tracking and optimising cloud compute and storage costs.
  • Experience working with RPC protocols and their formats, e.g., gRPC/protobuf, Apache Avro, etc.
  • Experience with cloud-based (e.g. AWS, GCP, Azure) microservice, event-driven and distributed architectures.
  • Experience working in a fast-paced environment, collaborating across teams and disciplines.
  • Experience with data governance, data privacy and security.

WHAT WE OFFER

We are committed to creating a modern work environment that supports our employees and their loved ones. We offer a wide choice of programs that allow you to do your most meaningful work and help you shape the future of mobility.

  • Excellent health, wellness, dental and vision coverage
  • A rewarding pension
  • Flexible vacation policy
  • Family planning and care benefits

OUR COMMITMENT

  • We are an equal opportunity employer and value diversity.
  • Any information we receive from you will be used only in the hiring and onboarding process. Please see our privacy notice for more details.

Tags: Architecture Autonomous Driving Avro AWS Azure BigQuery Computer Science Computer Vision Data governance Data pipelines Distributed Systems Docker Elasticsearch Engineering GCP Java Kubernetes Machine Learning Open Source Pipelines Privacy Python Security Terraform Testing

Perks/benefits: Career development Flex hours Flex vacation Health care Wellness

Region: Europe
Country: United Kingdom
