Site Reliability Engineer
USA - Remote
Close
Close is the inside sales CRM of choice for startups and SMBs. Make more calls, send more emails and close more deals starting today.
Close is a bootstrapped, profitable, 100% remote, ~100 person team of thoughtful individuals who prioritize taking ownership and making a meaningful impact. We're eager to make a product our customers fall in love with over and over again.
We ❤️ small scaling businesses. Since 2013, we've been building a CRM that focuses on better communication, without the hassle of manual data entry or a complex UI. We are out to supercharge sales productivity with the most modern, thoughtfully designed, all-in-one, communication-focused CRM.
Our backend tech stack consists primarily of Python Flask web apps, with our TaskTiger scheduler handling much of the backend asynchronous task processing. Our data stores include MongoDB, PostgreSQL, Elasticsearch, and Redis. The underlying infrastructure runs on AWS using a combination of managed services like EKS, MSK, RDS and ElastiCache and non-managed services running on EC2 instances. We have CI/CD pipelines that build Docker images, run automated tests and deploy to Kubernetes clusters. We also use these images in our local development environment, which lets us code locally against all of our services. We have a well-documented public API that is consumed by our front-end JavaScript app as well as numerous integrations. Our infrastructure is heavily automated using Terraform, Ansible and other AWS tools.
Our product development process is inspired by Shape Up. We love open sourcing our code and ideas on our GitHub and on The Making of Close, our behind-the-scenes Product & Engineering blog. Check out our open source projects like close-mongo-ops-manager, SocketShark, TaskTiger, LimitLion and ciso8601.
About the Role
You will be joining the Infrastructure Team at Close. This team builds and maintains the platform that runs all Close systems (and do we have a lot of those). Work with us and you'll be working with:
Multi-terabyte MongoDB, PostgreSQL, and Elasticsearch clusters
Telemetry systems built on Grafana's LGTM stack and ClickHouse processing over 130 TB per month
Multiple Kubernetes clusters running tens of thousands of pods
GitHub Actions & ArgoCD powered CI/CD that can go from merged, to production, to rolled back in 10 minutes
A system that is stable, up to date, and hasn't needed scheduled downtime in 4 years
You are a rock in the storm. With your hard-won expertise, gained through battles won and lost, you consistently build robust systems from quality components fit to underpin mission-critical applications. You value simplicity over familiarity. You value resilience over speed. You take pride in building composable and maintainable tools.
You've worked with a diverse array of infrastructure tools and systems, including:
CI/CD (CircleCI, GitHub Actions, ArgoCD)
Configuration Management (Ansible, Terraform)
Databases (Elasticsearch, MongoDB, PostgreSQL, ClickHouse)
Cloud Computing (Kubernetes, AWS)
Telemetry (Loki, Tempo, Grafana, Mimir/Prometheus)
You're comfortable working in a fast-paced environment with a small and talented team where you're supported in your efforts to grow professionally. You're able to manage time well, communicate effectively, and collaborate in a fully distributed team.
Fully automating our databases' lifecycles with Argo Workflows
Eliminating all static credentials, wherever they may be
Reducing downtime and disruption due to maintenance or disaster to new lows
Helping us improve our multi-region disaster recovery system
Senior 1 & 2 level candidates should have 5+ years of experience building modern infrastructure systems.
Staff level candidates should have 8+ years of experience.
The buck stops with you! You are the kind of person who is respected as an expert on the systems you run.
You have been the final point of escalation in the support of mission-critical production systems
You are familiar with some of the following technologies: AWS, Terraform, Kubernetes, Ansible, MongoDB, PostgreSQL, Elasticsearch
You have a strong grasp of common networking and data transfer protocols such as DNS, HTTP, TCP
You are able to speak and write in English
You are located in the USA (ET, CT, MT, PT)
Have contributed open source code related to our tech stack
Have experience maintaining very large databases
Have been through a successful disaster response
Have experience with multi-region architectures
Have run MLOps systems
Experience scaling Temporal
Competitive compensation including an organization-wide goal-based bonus
Paid Time Off: 5 Weeks PTO upon joining + Winter Holiday Break and Summer Holiday Break. Each year with the company, you'll receive 2 additional PTO days
80% Work Option: Work with your manager to choose between working 5 day weeks (standard full-time) or 4 day weeks @ 80% pay
Paid Parental Leave for primary and secondary caregivers
Sabbatical: After 5 years with the team, you're eligible for a 1 month paid sabbatical
Healthcare (US residents): Medical, Dental, Vision with HSA option (US residents), Dependent care FSA (US residents)
401k (US residents): We match 6% contributions with immediate vesting
Build a house you want to live in - Examine long-term thinking and action
No BS - Practice transparency and honesty, especially when it's hard
Invest in each other - Build successful relationships with your coworkers and customers
Discipline equals freedom - Keep your word to yourself and others
Strive for greatness - Constantly challenge yourself and others
Learn More
Listen to our CEO and Founder, Steli Efti, tell the story of Close's journey in the $0-30m Blueprint.
Watch our culture video from our 2023 team retreat in Milan. Every year our entire team gathers in person to build connection, foster cross-functional collaboration, and have fun. In 2025, we're headed to Paris, France.
Explore our product. You can watch a ten-minute video demo on our home page.
Our Hiring Process
We ask a few role-specific questions as part of our application process. These questions are designed to help us learn more about you from the start, so please answer each question thoughtfully. We see this as an opportunity to get to know you beyond your resume.
While we are excited by all the opportunities that generative AI has unlocked, we request that you refrain from relying on AI tools when completing an application. Every application is read closely by humans and any obviously AI generated applications will be disregarded.
Regardless of fit, you can expect to hear back from our team with an update on the status of your candidacy.
If you progress to the interview process, you'll receive a full outline of the role-specific interview process in your first touchpoint with us. We do our best to make the hiring process clear and human.