Senior Data Scientist, Machine Learning

Los Angeles, CA

Serve Robotics

Why move a 2-pound burrito in a 2-ton car? Meet Serve, the future of sustainable, self-driving delivery.

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses.

The Serve fleet has been delighting merchants, customers, and pedestrians while making commercial deliveries in Los Angeles. We’re looking for talented individuals who will grow robotic deliveries from surprising novelty to efficient ubiquity.

Who We Are

We are tech industry veterans in software, hardware, and design who are pooling our skills to build the future we want to live in. We are solving real-world problems leveraging robotics, machine learning and computer vision, among other disciplines, with a mindful eye towards the end-to-end user experience. Our team is agile, diverse, and driven. We believe that the best way to solve complicated dynamic problems is collaboratively and respectfully.

What you'll be doing

Serve Robotics aims to develop dependable, high-performing sidewalk autonomy software. We are looking for a talented Senior Data Scientist who bridges the gap between ML infrastructure and ML engineers. The ideal candidate has strong machine learning fundamentals and can prototype and train learning-based models using data-centric techniques, along with expertise in ETL processes, SQL, and building scalable data pipelines that make data accessible for model training.

Responsibilities

  • Prototype and train learning-based models using a data-centric approach, applying techniques such as automated feature engineering, active learning, and fine-tuning on curated datasets (see the active-learning sketch after this list).

  • Design, develop, and maintain efficient data and feature extraction pipelines to support ML engineers in accessing high-quality data for model training.

  • Design an auto-labeling system that uses an ensemble of models able to reason over multi-modal data for different use cases, for example image semantic labeling with vision-grounded models, or generating ground truth for intent and path prediction (an ensemble-consensus sketch follows this list).

  • Perform complex data extraction, transformation, and loading (ETL) processes, ensuring data is clean, accessible, and well-documented. Write and optimize high-quality SQL queries for data analysis and ingestion from various sources (a parameterized query sketch follows this list).

  • Partner with data infrastructure and ML engineers to ensure seamless integration of data and machine learning workflows.

  • Produce high-quality, maintainable code and participate in peer code reviews to share knowledge and uphold team standards.
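
As a concrete illustration of the data-centric loop mentioned above, here is a minimal uncertainty-sampling sketch in PyTorch. The model, pool, and labeling budget are placeholders for illustration, not Serve Robotics code.

```python
# Hypothetical active-learning step: score an unlabeled pool by predictive
# entropy and surface the most uncertain samples for human labeling.
import torch
import torch.nn as nn

def entropy(probs: torch.Tensor) -> torch.Tensor:
    # Shannon entropy per sample; the clamp avoids log(0).
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

def select_for_labeling(model, pool, batch_size=256, budget=100):
    """Return indices of the `budget` most uncertain pool samples."""
    model.eval()
    scores = []
    with torch.no_grad():
        for start in range(0, len(pool), batch_size):
            probs = torch.softmax(model(pool[start:start + batch_size]), dim=1)
            scores.append(entropy(probs))
    scores = torch.cat(scores)
    return torch.topk(scores, k=min(budget, len(scores))).indices

# Toy usage: a linear classifier over random features standing in for a
# real perception model and its unlabeled pool.
model = nn.Linear(16, 3)
pool = torch.randn(1000, 16)
print(select_for_labeling(model, pool, budget=10))
```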
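
The auto-labeling responsibility can be pictured as an ensemble-consensus step: accept labels the ensemble agrees on with high confidence and route the rest to human review. The function name and threshold below are assumptions, not a description of Serve's actual system.

```python
# Illustrative consensus auto-labeler over stacked ensemble outputs.
import numpy as np

def auto_label(prob_stack: np.ndarray, agree_thresh: float = 0.9):
    """prob_stack: (n_models, n_samples, n_classes) softmax outputs.
    Returns (labels, mask); mask marks samples whose averaged ensemble
    confidence clears the threshold, the rest go to human review."""
    mean_probs = prob_stack.mean(axis=0)    # average across the ensemble
    labels = mean_probs.argmax(axis=1)      # consensus class per sample
    confidence = mean_probs.max(axis=1)     # strength of the agreement
    return labels, confidence >= agree_thresh

# Toy usage: 3 models, 5 samples, 4 classes.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=(3, 5))
labels, keep = auto_label(probs, agree_thresh=0.5)
print(labels[keep])   # auto-accepted labels
print((~keep).sum(), "samples routed to human review")
```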
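
And for the ETL and SQL side, a hedged sketch of a parameterized extraction step using the google-cloud-bigquery client; the project, dataset, and column names are hypothetical.

```python
# Hypothetical extraction step: pull unlabeled camera frames for a date
# range. The table and columns are invented for illustration.
from google.cloud import bigquery

QUERY = """
SELECT run_id, frame_ts, camera_uri
FROM `example_project.robot_logs.frames`   -- hypothetical table
WHERE DATE(frame_ts) BETWEEN @start AND @end
  AND label_status = 'UNLABELED'
"""

def extract_unlabeled_frames(start: str, end: str):
    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start", "DATE", start),
            bigquery.ScalarQueryParameter("end", "DATE", end),
        ]
    )
    # Parameterized queries keep the SQL injection-safe and reusable.
    return client.query(QUERY, job_config=job_config).result()
```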

Qualifications

  • Master’s in Computer Science, Data Science, or a related technical field and 5+ years of industry experience in data engineering, machine learning, or a similar domain.

  • Strong proficiency in Python and SQL, with demonstrated experience building data pipelines at scale and ETL workflows that cater to multi-modal data (e.g., images, point clouds, time-series data).

  • Proven ability to work with petabyte-scale datasets, including structured, semi-structured, and unstructured data.

  • Hands-on experience working with ML frameworks such as TensorFlow, PyTorch, or similar.

  • Solid understanding of ML fundamentals and data-centric techniques for model training.

  • Experience with cloud platforms (GCP, AWS, or Azure) and tools like Kubernetes, Docker, and Airflow (a minimal Airflow sketch follows this list).

  • Excellent communication skills and the ability to collaborate with cross-functional teams.
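
For orientation on the pipeline tooling above, a minimal Airflow sketch of the extract-then-transform shape such workflows often take; the DAG id, schedule, and task bodies are assumptions for illustration only.

```python
# Minimal two-task Airflow DAG; every name here is a placeholder.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    print("pull new sensor logs")      # placeholder extract step

def transform(**_):
    print("clean and featurize logs")  # placeholder transform step

with DAG(
    dag_id="example_feature_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_extract >> t_transform   # run transform only after extract succeeds
```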

What makes you stand out

  • Experience optimizing ML workflows using MLOps tools such as MLflow, TFX, Kubeflow, or similar platforms.

  • Strong understanding of transformer-based models and their application in data-centric AI workflows.

  • Knowledgeable in advanced SQL query optimization and ETL pipeline performance tuning.

  • Familiarity with tools for scalable data engineering, such as Apache Beam, Dask, or BigQuery.
