Senior Site Reliability Engineer

Argentina

Full Time | Senior-level / Expert | USD 68K - 126K *

Clarifai Inc.

Clarifai’s full-stack AI Lifecycle Platform helps you build, train, and deploy AI faster. Scale AI securely with LLMs, RAG, and Generative AI.

Posted 20 hours ago

About the Company

Clarifai is a leading full-lifecycle deep-learning AI platform for computer vision, natural language processing, LLMs, and audio recognition. We help organizations transform unstructured image, video, text, and audio data into structured data significantly faster and more accurately than humans could on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow, with remote employees based throughout the United States, Canada, Argentina, India, and Estonia.

We have raised $100M in funding to date, with $60M coming from our most recent Series C, and are backed by industry leaders like Menlo Ventures, Union Square Ventures, Lux Capital, New Enterprise Associates, LDV Capital, Corazon Capital, Google Ventures, NVIDIA, Qualcomm and Osage.

Clarifai is proud to be an equal opportunity workplace dedicated to pursuing, hiring, and retaining a diverse workforce.

Your Impact

Clarifai’s platform is a Kubernetes-native distributed system that requires the orchestration of many components. Efficiently serving and training large neural networks presents unique design and infrastructure challenges.

You will be critical to solving these challenges both in the cloud and in on-premises environments. You will also be responsible for our broader cloud infrastructure as well as our development tools and environments.

The Opportunity

Ensure the smooth operation and high availability of Clarifai's core services
Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
Develop Kubernetes resources and custom tooling for seamless cloud and on-premises deployments
Design and implement scalable, secure, and cost-effective infrastructure solutions
Partner with teams across the organization to identify and solve engineering challenges

Requirements

BS/BA in Computer Science or a related field
Good knowledge of cloud providers (AWS, GCP or similar)
Expertise with Kubernetes (EKS, GKE, self-hosted) and Infrastructure as Code using Terraform and Helm
Solid understanding of web and networking fundamentals (HTTP, TLS, DNS, certificates, etc.)
Experience with CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and Atlantis
Strong interpersonal skills working with teams across different time zones and regions