Senior Data Scientist
Los Altos, CA
Full Time Senior-level / Expert USD 201K - 302K
Toyota Research Institute
The Discover, Nurture, and Adopt (DNA) division at TRI focuses on enabling innovation and transformation at Toyota by building a bridge between TRI research and Toyota products, services, and needs. We achieve this through partnership, collaboration, and shared commitment. DNA is leading a new cross-organizational project between TRI and Woven by Toyota to research and develop a fully end-to-end learned automated driving / ADAS stack. This project aligns with the work of TRI’s robotics divisions on Diffusion Policy and Large Behavior Models (LBM).
We are looking for a data scientist to contribute to end-to-end model development using Diffusion Policy and Large Behavior Models (LBM) for automated driving. The role focuses on analyzing large-scale data, evaluating models, curating datasets, and optimizing performance to improve automated driving systems. You will also develop algorithms, pipelines, and key strategies to streamline and scale the handling of large fleet and vendor datasets. You will work closely with machine learning researchers, software engineers, and roboticists to drive execution, manage dependencies, and accelerate the transition of research models into real-world deployment.
This role offers a unique opportunity to drive innovative AI research into real-world deployment, shaping the future of automated driving. If you are passionate about building transformative AI-driven mobility solutions, we invite you to apply and help us advance the next generation of automated driving technology.
Responsibilities
- Design and implement data strategies for collecting, sampling, labeling, and using large-scale datasets to train and validate Large Behavior Models (LBM) in automated driving scenarios.
- Develop metrics and evaluation frameworks for anomaly detection and trend analysis in data from various sources, including real-world vehicle platforms and simulations.
- Analyze model performance metrics, model failure modes, and the statistical relevance of datasets, collaborating with machine learning engineers to fine-tune, debug, and optimize models for automated driving tasks.
- Design and improve scalable data pipelines and automation for machine learning and performance evaluation.
- Support the integration of multi-modal data sources (e.g., vision, radar, language, maps) into end-to-end driving models.
- Work closely with cross-functional teams to bridge the gap between research models and production deployment.
Qualifications
- MS or PhD in a related field.
- 5+ years of experience in data science, machine learning, or a related field, with a focus on large-scale AI applications.
- Strong background in statistical analysis, data mining, and machine learning techniques.
- Proficiency in Python and SQL for data manipulation, analysis, and visualization.
- Experience with theoretical aspects of data science and machine learning (deep learning, statistical analysis, and mathematical modeling).
- Experience in building machine learning algorithms and infrastructure, including data pre- and post-processing, sampling and curation, ablation studies, and evaluation.
- Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and ML model evaluation.
- Familiarity with large-scale dataset management, including handling high-dimensional sensor data.
- Strong analytical and problem-solving skills, with the ability to interpret complex datasets and derive actionable insights.
- Experience working in cross-functional AI research and engineering teams.
Bonus Qualifications
- Experience in automated driving technologies, including perception, prediction, planning, or sensor simulation.
- Experience building or managing infrastructure with tools such as Docker, Kubernetes, Jenkins, and GitHub Actions.
- Experience working with temporal/sequential and/or spatial data.
- Hands-on experience with sensor data processing (e.g., camera, LiDAR, radar, IMU).
- Familiarity with automotive simulation tools and real-world autonomous vehicle testing.
Please refer to this Candidate Privacy Notice, which describes the categories of personal information that we collect from individuals who inquire about and/or apply to work for Toyota Research Institute, Inc. or its subsidiaries, including Toyota A.I. Ventures GP, L.P., and the purposes for which we use such personal information.
TRI is fueled by a diverse and inclusive community of people with unique backgrounds, education and life experiences. We are dedicated to fostering an innovative and collaborative environment by living the values that are an essential part of our culture. We believe diversity makes us stronger and are proud to provide Equal Employment Opportunity for all, without regard to an applicant’s race, color, creed, gender, gender identity or expression, sexual orientation, national origin, age, physical or mental disability, medical condition, religion, marital status, genetic information, veteran status, or any other status protected under federal, state or local laws.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability. Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records for employment.
Tags: Data Mining Data pipelines Deep Learning Docker Engineering GitHub Jenkins Kubernetes Lidar Machine Learning ML models PhD Pipelines Privacy Python PyTorch Radar Research Robotics SQL Statistics TensorFlow Testing
Perks/benefits: Career development Medical leave Parental leave Salary bonus