Senior Software Engineer, HPC Platform Modernization
Foster City, CA
Zoox
Zoox is reinventing personal transportation with an all-electric, fully autonomous ride-hail vehicle, built for riders, not drivers.
Zoox is looking for an experienced Software Engineer to work on key new frameworks and infrastructure modernization for our custom High-Performance Computing (HPC) infrastructure and its supporting ecosystem of tools and services. Zoox HPC services combine industry-best scheduling and workload orchestration technologies, such as Ray.io and SLURM, with value-add workflows built specifically for Autonomous Vehicle development. These HPC services form the backbone of development workflows across all Zoox software teams, from data engineering, to training our AI models in Perception, Planner, and Prediction, to simulation, and more. You will take on a breadth of end-to-end responsibilities, including distributed system design, algorithmic job scheduling, and adaptive cloud scaling, in support of all of Zoox’s computational needs.
The position comes with a high degree of independence and the opportunity to help define Zoox’s compute scaling strategy, both technically and organizationally. You will work closely with stakeholders in Autonomy and Software teams to iterate on world-class developer experiences, incorporating the latest industry tools and best practices.
In this role, you will:
- Evaluate new distributed system paradigms and technologies to meet Zoox’s ever-growing computational and storage needs.
- Strike a balance between incremental improvements to Zoox’s existing in-house HPC infrastructure and greenfield services and abstractions.
- Create production-grade web service APIs, SDKs, and other tools to provide a world-class developer experience for all of Zoox’s software teams.
Qualifications
- 7+ years of experience
- Experience with Ray.io, particularly Ray Core and Ray Data
- Experience with Kubernetes, particularly for heterogeneous workloads and clusters
- Experience with Ray.io and Kubernetes deployed on Amazon Web Services (AWS) or other similar cloud providers such as Azure or GCP
- Proficiency with Python
Bonus Qualifications
- Exposure to machine learning workloads (training, inference, data generation, etc.) from the perspective of a compute infrastructure service provider
- Experience with Kubernetes or SLURM at scale (10k+ nodes)
- Experience with the SLURM workload manager
Category:
Engineering Jobs
Tags: APIs AWS Azure Engineering GCP HPC Kubernetes Machine Learning Python
Perks/benefits: Career development Equity / stock options Health care Insurance Salary bonus Signing bonus
Region:
North America
Country:
United States