Staff Software Engineer, Metrics - US (Remote)
San Francisco, California
- Remote-first
Weights & Biases
Weights & Biases is a Series C company with $250M in funding and over 200 employees. We proudly serve over 1,000 customers and more than 30 foundation model builders, including OpenAI, NVIDIA, Microsoft, and Toyota.
As a Staff Engineer, you'll lead the effort to scale our metrics and storage systems, ensuring they meet the complex demands of our most advanced customers. You’ll play an instrumental role in the evolution of our platform as we grow our capability to ingest and query petabytes of data, making critical technical decisions that optimize the performance, reliability, and cost-effectiveness of our systems.
You will set the technical direction for the team, guiding the organization to balance short-term deliverables with strategic, long-term architectural improvements. You'll partner closely with product management, revenue teams, and other engineering groups to shape and deliver the future of W&B’s flagship Models product, supporting experiment tracking and analytics utilized by over 2,500 leading machine learning and AI teams worldwide.
Responsibilities:
- Design and implement infrastructure that is scalable, efficient, and tailored to customer needs.
- Lead the maintenance and monitoring of existing services, identifying and executing necessary improvements to ensure ongoing performance and reliability.
- Participate in team-wide rotations to respond to customer support issues and site outages.
- Communicate and collaborate effectively with internal and external stakeholders to achieve optimal outcomes.
- Lead and mentor junior engineers, supporting their professional growth and development within the company.
Requirements:
- 8+ years of experience in software engineering, with a focus on data platforms and/or distributed systems.
- Strong software engineering fundamentals and proficiency in at least one modern programming language (e.g., Python, Go, TypeScript).
- Extensive experience designing and scaling customer-facing APIs in production environments, ideally leveraging systems like MySQL, Postgres, ClickHouse, Bigtable, Pub/Sub, Kafka, etc.
- Hands-on experience with Kubernetes, Terraform, and major cloud providers (e.g., GCP, AWS, Azure).
Our benefits:
- 🏝️ Flexible time off
- 🩺 Medical, dental, and vision coverage for employees and their families
- 🏠 Remote-first culture with in-office flexibility in San Francisco
- 💵 Home office budget with a new high-powered laptop
- 🥇 Truly competitive salary and equity
- 🚼 12 weeks of parental leave (U.S. specific)
- 📈 401(k) (U.S. specific)
- Supplemental benefits may be available depending on your location
Tags: APIs AWS Azure Bigtable Deep Learning Distributed Systems Engineering GCP Generative AI Kafka Kubernetes Machine Learning MySQL OpenAI PostgreSQL Python Terraform TypeScript Weights & Biases
Perks/benefits: Career development Competitive pay Equity / stock options Flex vacation Gear Health care Medical leave Parental leave Startup environment