2025 Summer Intern, PhD, Behavior Research - Vision Language Action Models

Mountain View, CA, USA

Internship Entry-level / Junior USD 120K

Waymo

Waymo, formerly the Google self-driving car project, makes it safe and easy for people and things to get around with autonomous vehicles.



Posted 1 month ago

Waymo is an autonomous driving technology company with the mission to be the most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver—The World's Most Experienced Driver™—to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo One, a fully autonomous ride-hailing service, and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over one million rider-only trips, enabled by its experience autonomously driving tens of millions of miles on public roads and tens of billions in simulation across 13+ U.S. states.

The mission of the Waymo Research team is to develop machine learning solutions addressing open problems in autonomous driving, towards the goal of safely operating Waymo vehicles in dozens of cities and under all driving conditions. As part of our work, we also initiate and foster collaborations with other research teams in Alphabet. Research areas that we are currently focusing on include reinforcement learning, learning from demonstration, generative modeling, Bayesian inference, hierarchical learning, and robust evaluation.

Waymo interns work alongside leaders in the industry on projects that deliver significant impact to the company. We believe learning is a two-way street: we leverage your knowledge while providing you with opportunities to expand your skill set. Interns are an important part of our culture and our recruiting pipeline. Join us at Waymo for a fun and rewarding internship!

You will:

Extend existing Vision Language Models to Vision Language Action Models
Establish scaling laws for foundation models in the AV space
Explore leveraging language models' reasoning capabilities to improve end-to-end driving and various other driving tasks
Work at the intersection of Large Foundation Models and Robotics for scaling embodied AI domains

You have:

Current enrollment in a PhD program in computer science, statistics, applied mathematics, physics, or a related technical field of study
Basic software programming or scripting skills (Python, C/C++)
Experience with machine learning, deep learning, Foundation Models, and/or LLMs
Familiarity with one of the modern deep learning frameworks (e.g. PyTorch, JAX, TensorFlow)

We prefer:

Hands-on experience with VLMs and other multimodal models
Publications at top peer-reviewed conferences (e.g. CVPR, ECCV, ICCV, NeurIPS, ICLR)
Prior industry experience (including internships) in software engineering and computer vision / machine learning

Note: This will be a hybrid onsite internship position. We will accept resumes on a rolling basis until the role is filled. To be considered for multiple roles, you will need to apply to each one individually. Please apply to the top 3 roles you are interested in.


The expected hourly rate for this full-time position is listed below. Interns are also eligible to participate in the Company’s generous benefits programs, subject to eligibility requirements.

Hourly PhD Pay: $60.10 USD


