Staff Software Engineer — Infrastructure
Hybrid / San Francisco, CA or Redwood City, CA
Full Time Senior-level / Expert USD 200K - 270K
Snorkel AI
Unlock the power of programmatic AI data development to build production AI applications with Snorkel Flow, 100x faster! We’re on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has gone through incredible change from 2016, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of today. But one thing has remained constant: the data you use to build AI is the key to achieving differentiation, high performance, and production-ready systems. We work with some of the world’s largest organizations to empower scientists, engineers, financial experts, product creators, journalists, and more to build custom AI with their data faster than ever before. Excited to help us redefine how AI is built? Apply to be the newest Snorkeler!
As a Staff Software Engineer on the Infrastructure team, you'll accelerate the Snorkel AI team and our customers by improving our developer platform and services for user and data management across the stack. You’ll work closely with other engineers, researchers, and product management to align on the highest-leverage improvements for CI/CD, cloud infrastructure, deployment, security, authentication/authorization, and more.
Main Responsibilities
- Deploy and maintain CI/CD and software release pipelines across multiple environments, and continuously improve testing frameworks, development tooling, and deployment best practices
- Define and build our deployment strategy, internal and external, for SaaS-hosted, on-prem, and managed service offerings
- Build and maintain Snorkel’s production and staging infrastructure, and own our Kubernetes and cloud strategy
- Design, develop, and maintain observability, alarms, and monitoring tools
- Participate in on-call responsibilities in rotation with the engineering team
- Work a hybrid schedule: three days per week in our Redwood City HQ or SF office, with remote work on "No Meeting" Tuesdays and Thursdays
Required Qualifications
- Bachelor's degree in Computer Science or related field, or equivalent demonstrated experience
- 8+ years of experience in distributed systems and cloud-native applications
- Strong experience with cloud platforms and infrastructure as code (Terraform, CloudFormation, Helm)
- Deep expertise building services on Kubernetes (EKS, GKE, etc.)
- Regularly follow software engineering best practices and hold a high bar for the team by leading design, code, and test plan reviews
- Proven ability to lead and mentor teams of engineers
Preferred Qualifications
- Strong development experience in Python or another language such as Java, Go, or Scala
- Extremely well versed in building and managing cloud infrastructure for enterprise platforms on AWS, GCP, or Azure, using services such as EC2, EKS, and VPC
- Experience with one or more build tools such as Bazel, Gradle, or Make. Extra points for hands-on experience building and managing large codebases with these tools
- Designed and implemented developer-friendly APIs or tools to boost developer productivity
- Familiarity with the deployment, monitoring, and maintenance of large-scale enterprise software products
- Follow software development best practices and hold a high engineering bar for the team by regularly leading design, code, and test plan reviews
- Experience working cross-functionally across teams including product, design, customer success and support
- Familiarity with developing and releasing infrastructure software for SaaS and on-prem platforms
- Have a voracious and intrinsic desire to learn and fill in missing skills—and an equally strong talent for sharing learnings clearly and concisely with others
- [Nice to have]: Hands-on experience setting up and operating Kubernetes clusters in production at scale
- [Nice to have]: Experience leading teams building large-scale distributed computing systems for ML training or serving, e.g., Ray, Spark, or TensorFlow
- [Nice to have]: Hands-on experience in creating and maintaining metrics and dashboards on observability platforms such as New Relic, DataDog, Chronosphere, or similar tools
- [Nice to have]: Experience building services and infrastructure for machine learning and AI systems
The salary range for this position based in the San Francisco Bay Area is $200,000.00 - $270,000.00. All offers include equity compensation in the form of employee stock options.
Be Your Best At Snorkel
Snorkel AI is on a mission to make machine learning practical for everyone, and it starts with building a team that welcomes, represents and gives opportunity to all. We work at the frontier of AI and software engineering, and believe that underrepresented communities need to play a part in shaping the future of these fields. At Snorkel AI, we actively work to create an environment that values end-to-end ownership, diverse forms of impact, and opportunities for personal growth.
Snorkelers are supported by an amazing team and an amazing set of benefits. For full-time employees, we offer comprehensive medical, dental, and vision plans for Snorkelers and their families, plus a yearly wellness stipend. Our 401k program lets Snorkelers plan for their future and our parental leave program lets new parents take up to 20 weeks of paid time off. Learn more about these and other benefits, like our workstation setup allowance, on our Careers page.
Snorkel AI is proud to be an Equal Employment Opportunity employer and is committed to building a team that represents a variety of backgrounds, perspectives, and skills. Snorkel AI embraces diversity and provides equal employment opportunities to all employees and applicants for employment. Snorkel AI prohibits discrimination and harassment of any type on the basis of race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local law. All employment is decided on the basis of qualifications, performance, merit, and business need. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.
Please contact us to request accommodation.
Perks/benefits: 401(k) matching Career development Equity / stock options Health care Medical leave Parental leave Wellness