Data Platform Engineer, Enterprise AI

Tokyo

Full Time Mid-level / Intermediate USD 47K - 87K * ^est.

Woven by Toyota

Woven by Toyota will help Toyota to develop next-generation cars and to realize a mobility society in which everyone can move freely, happily and safely.

Posted 1 month ago

About Woven by Toyota
Woven by Toyota, a part of the Toyota Group, is challenging the current state of mobility through human-centric innovation and empowering mobility transformation. Through our AD/ADAS technology, our automotive software development platform Arene OS, our mobility test course Toyota Woven City, and Toyota’s growth fund, Woven Capital, we are pioneering the movement of people, goods, information, and energy, weaving a future of enhanced safety, connectivity and well-being for all.
=========================================================================
TEAM
The Enterprise AI team is dedicated to empowering Toyota and its affiliates with a robust platform for AI innovation. Our mission is to provide a comprehensive, end-to-end machine learning ecosystem that propels the development of groundbreaking projects, such as autonomous driving, within the Toyota Group. As a standardized machine learning platform under Woven by Toyota, we aim to streamline every facet of AI development, from training and inference to MLOps, thereby enhancing the safety, convenience, and autonomy of Toyota vehicles.
Within this dynamic environment, the Data Platform Engineering team plays a pivotal role. We design and implement scalable, globally distributed data delivery solutions tailored for Toyota and its partners. Our team is at the forefront of developing both human-assisted and automated data labeling services, and we engage collaboratively across various model development and AI solution initiatives. Through these efforts, we ensure that data is not only accessible but also actionable, driving innovation and efficiency across the enterprise.
WHO ARE WE LOOKING FOR?
As a backend engineer, you will help develop the labeling suites while working with ML engineers distributed across different regions. Expect to work with large datasets and to ship them globally. We aim to change how human- and machine-labeled data is acquired and delivered, in order to expedite the global development of machine learning projects.
You will have both technical and communication skills. As part of the team, you believe in healthy, constructive, and optimistic feedback, as we encourage each other to improve our development practices: refactoring, rewriting legacy code, profiling, code style, and code reviews.

RESPONSIBILITIES

Collaborate with the team lead and software engineers to develop the backend of labeling suites, ensuring both functional and non-functional requirements of the product are met
Design, implement, and deploy features from inception to completion
Enable support for multiple machine learning training data formats and facilitate on-the-fly conversions
Integrate with various data sinks, including machine learning data visualization solutions, to manage datasets owned by the Data Annotation Engineering team
Work closely with frontend developers to establish and maintain API contracts
Report directly to the manager overseeing the Data Annotation Engineering team

MINIMUM QUALIFICATIONS

A minimum of 2 years of experience in Python development, with at least 1 year dedicated to asynchronous Python programming, and a foundational understanding of machine learning
Proficiency in working with large datasets, including databases with extensive rows or documents, and a solid grasp of concurrency, distributed computing, and blob storage
Familiarity with event-driven architectures utilizing multiple message queues (channels)
Knowledge of major RDBMS and NoSQL databases, such as PostgreSQL and MongoDB
Hands-on experience with Kubernetes
Ability to work in the office 3 days per week in accordance with our hybrid work model
Proficiency in English at a business level

NICE TO HAVES

Contributions to open-source projects and the ability to analyze open-source software
Familiarity with spatial/geometry information or experience with vector databases
Experience with PyTorch data loaders and working with 2D/3D-based machine learning training data formats
Understanding of machine learning, with a focus on deep learning
Knowledge of image and point cloud processing techniques
Proficiency in one or more programming languages commonly used in machine learning or massively parallel computing environments

=========================================================================
Important Points
・All interviews will be arranged via Google Meet, unless otherwise stated.
・The same job descriptions are available in both English and Japanese; therefore, we kindly ask that you apply to only one version.
・We kindly request that you submit your resume in English, if possible. However, Japanese resumes are also acceptable. Please note that, depending on the English proficiency requirements of the role, we may request an English version of your resume later in the process.
WHAT WE OFFER
・Competitive Salary - Based on experience
・Work Hours - Flexible working time
・Paid Holiday - 20 days per year (prorated)
・Sick Leave - 6 days per year (prorated)
・Holiday - Sat & Sun, Japanese National Holidays, and other days defined by our company
・Japanese Social Insurance - Health Insurance, Pension, Workers’ Comp, Unemployment Insurance, and Long-term Care Insurance
・Housing Allowance
・Retirement Benefits
・Rental Car Support
・In-house Training Program (software study/language study)
Our Commitment
・We are an equal opportunity employer and value diversity.
・Any information we receive from you will be used only in the hiring and onboarding process. Please see our privacy notice for more details.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Categories: Deep Learning Jobs Engineering Jobs

Tags: APIs Architecture Autonomous Driving Data visualization Deep Learning Engineering Kubernetes Machine Learning ML models MLOps MongoDB NoSQL Open Source PostgreSQL Privacy Python PyTorch RDBMS