Engineering Manager, ML Training Platform
Foster City, CA
Zoox
Zoox is a purpose-built autonomous vehicle designed for riders, not drivers. Learn more about the Zoox robotaxi and the future of ride-hailing.
Zoox is on a mission to reimagine transportation and build, from the ground up, autonomous robotaxis that are safe, reliable, clean, and enjoyable for everyone. We are still in the early stages of deploying our robotaxis on public roads, and it is a great time to join Zoox and have a significant impact in executing this mission. The ML Platform team at Zoox plays a crucial role in enabling innovations in ML and CV to make autonomous driving as seamless as possible.
The Opportunity
Are you excited to manage our ML Training Platform that enables autonomous driving? You will get to work across all ML teams within Zoox (Perception, Prediction, Planner, Simulation, Collision Avoidance, Data Science, etc.) and have the opportunity to significantly push the boundaries of how ML is practiced within Zoox.
This team builds and operates the core part of the ML platform that powers model training at scale. We are responsible for developing and operating ML tools, deep learning frameworks, and distributed model training infrastructure to support foundational models and reinforcement learning. This team also owns the model repository and model lifecycle management tools used by our applied research teams for in- and off-vehicle ML use cases. You will lead a team of strong software engineers and act as a force multiplier for our internal customers. This team has a lot of growth opportunities as we expand our robotaxi deployments and venture into new ML domains. If you want to learn more about our stack behind autonomous driving, please look here.
Vaccine Mandate
Employees working in this position will be required to have received a vaccine approved by the U.S. Food and Drug Administration and/or the World Health Organization. In addition, employees who are eligible for a COVID-19 booster vaccine (“Booster”) will be required to receive a Booster. Employees will be required to show proof of vaccination status upon receipt of a conditional offer of employment. That offer of employment will be conditioned upon, among other things, an Applicant’s ability to show proof of vaccination status. Please note the Company provides reasonable accommodations in accordance with applicable state, federal, and local laws.
In this role, you will:
- Vision: Develop and execute a strategic vision for our ML training platform, ensuring scalability, reliability, and performance to support large-scale Foundation and RL models.
- Technical acumen: Lead the design, implementation, and operation of a robust and efficient ML training platform to enable the training, experimentation, validation, and monitoring of ML models.
- Hiring: Attract, hire, and inspire a diverse world-class engineering team, fostering a culture of innovation, collaboration, and excellence.
- Partnership: Collaborate closely with cross-functional teams, including ML researchers, software engineers, data engineers, and hardware engineers to define requirements and align on architectural decisions.
- Mentorship: Enable the engineers in the team to grow their careers by providing the right opportunities along with clear and timely feedback.
Qualifications
- 8+ years of total experience, including 3+ years of engineering management experience.
- Excellent leadership skills with a demonstrated ability to build and manage high-performing engineering teams.
- Experience enabling large-scale, cost-efficient distributed model training and ML compute infrastructure.
- Experience with training frameworks such as PyTorch, Hugging Face, Ray, DeepSpeed, JAX, etc., leveraging GPUs, TPUs, or Trainium.
- Experience building model lifecycle management tools and managing AWS costs for ML workloads.
Tags: Autonomous Driving AWS Deep Learning Engineering JAX Machine Learning ML models Model training PyTorch Reinforcement Learning Research
Perks/benefits: Career development Equity / stock options Health care Insurance Salary bonus Signing bonus Startup environment
Region: North America
Country: United States