DevOps Engineer
Remote (US)
Tecton
Tecton makes it simple to activate data for smarter AI. Our platform abstracts away the complex data engineering needed to get data to models. Tecton’s founders developed the first Feature Store when they created Uber’s Michelangelo ML platform, and we’re now bringing those same capabilities to every organization in the world.
Tecton is funded by Sequoia Capital, Andreessen Horowitz, and Kleiner Perkins, along with strategic investments from Snowflake and Databricks. We have a fast-growing team that’s distributed around the world, with offices in San Francisco and New York City. Our team has years of experience building and operating business-critical machine learning systems at leading tech companies like Uber, Google, Meta, Airbnb, Lyft, and Twitter.
As a member of Tecton’s Infrastructure Engineering DevOps team, you will contribute to and own the foundation for building, automating, and scaling Tecton. You will draw on your experience with cloud architectures, distributed systems, containerization technologies (Kubernetes), and Linux system internals to design, build, and maintain our multi-cloud deployments and to ensure our systems are secured in depth. You will also work closely with the rest of Tecton’s Infrastructure Engineering team to scale and optimize our core compute and online serving systems.
Prior experience with machine learning is not required. We are looking for exceptional DevOps, infrastructure, and software engineers who are driven to find simple solutions to complex challenges. You'll be at the intersection of design, engineering, and operational processes.
Responsibilities
- Own the complete lifecycle of Tecton’s cloud infrastructure development from design through automation, deployment, and operation
- Engage with other engineering and solutions teams to build tools that will accelerate engineering and deployment efficiency
- Develop and maintain infrastructure and tooling to monitor Tecton’s health, availability, and latency
- Share ownership of building and managing Tecton’s CI/CD system, which reliably deploys production components using a GitOps model, including the multi-language, multi-platform build system based on Bazel
- Participate in an on-call rotation, triaging and resolving major incidents on the Tecton platform
Qualifications
- Engineer with 5+ years of experience in DevOps, SRE, or Software Engineering
- Experience with infrastructure-as-code tools such as Terraform
- Fluent in one or more programming languages such as Python or Golang
- Expertise in cloud providers such as AWS, Google Cloud, and/or Microsoft Azure
- Experience building and troubleshooting robust and secure networks
- Experience with microservices and container orchestration platforms such as Kubernetes
- Expertise in an observability stack (Prometheus, ELK, Chronosphere, Datadog, etc.)
- A passion for excellence and high developer productivity
- Strong and effective verbal and written communication skills
- In-depth experience with Linux systems administration and troubleshooting
Nice to have
- Experience building reliable CI/CD pipelines (GitHub, CircleCI, Buildkite, etc.)
- Experience with Kubernetes configuration management tools (Helm, Kustomize, etc.)
- Experience with GitOps tools (Flux CD, Argo CD)
- Experience with on-call rotation and support of production environments
- Experience working with complex build systems (Bazel in particular)
- Experience working with large-scale data infrastructure or batch/streaming data pipelines
This employer participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S.