Senior DevOps Engineer
Herzliya, IL
Dataloop AI
Drive your AI to production with end-to-end data management, automation pipelines, and a quality-first data labeling platform.
Description
What does Dataloop do?
Dataloop AI is on a mission to provide an all-in-one platform for AI/GenAI lifecycle management, specializing in unstructured data. Our platform offers advanced data management, annotation tools, MLOps workflows, and data pipelines for seamless development and production. Dataloop empowers organizations with a unique ML applications marketplace to easily deploy AI/GenAI solutions.
About the position:
We’re growing, and we need people with DevOps experience to help us grow faster and bigger.
We expect this role to be filled by someone who is ready to tackle large production systems and wants to learn what it takes to scale a system substantially, manage it, maintain it, and keep it operational at all times.
This includes, but is not limited to, suggesting, planning and executing tasks that will help achieve this goal.
Key Responsibilities:
- Cloud Infrastructure Management: Provision, manage, and optimize resources across Azure (preferred), AWS, and GCP, ensuring high availability, cost-efficiency, and performance.
- Kubernetes Orchestration: Architect and operate advanced Kubernetes environments, both managed (AKS, EKS, GKE) and on-prem (RKE, Rancher, OpenShift).
- CI/CD Engineering: Develop and manage robust CI/CD pipelines using Bitbucket Pipelines, Argo Workflows, and Jenkins, automating build-test-deploy workflows.
- Monitoring & Logging: Implement and manage monitoring systems with Prometheus, Grafana, and centralized logging with ELK/EFK stacks.
- Infrastructure Automation: Leverage Terraform, Ansible, and scripting (Python/Bash) to build and manage infrastructure as code (IaC).
- On-Premise and Air-Gapped Deployments: Architect and support isolated environments using local DNS (PowerDNS), registries (Harbor), GitOps, and secure deployment practices.
- Security & Compliance: Implement IAM policies, secrets management (Vault), encryption, and secure software delivery pipelines.
- Documentation & System Design: Author detailed technical documentation including architectural blueprints, SOPs, and disaster recovery plans.
- Collaboration & Mentorship: Work cross-functionally with developers, product, and QA teams; mentor junior DevOps engineers; and drive a culture of excellence.
- Customer Engagement: Participate in technical discussions and workshops with enterprise clients to support onboarding and production success.
Requirements
Minimum Qualifications:
- 5+ years of experience in a DevOps, Site Reliability, or Platform Engineering role
- Strong command of Linux system administration, cloud networking, and container orchestration
- Proven experience with Azure, AWS, and GCP cloud services, with a strong preference for Azure
- Advanced skills in Kubernetes, with expertise in OpenShift, RKE, and Rancher
- Familiarity with GitOps, Bitbucket Pipelines, Helm, and ArgoCD
- Experience with observability using Prometheus, Grafana, Elasticsearch, Fluentd/Filebeat, and Kibana
- Hands-on expertise with Terraform, Ansible, and scripting languages like Bash or Python
- Knowledge of secure deployment practices, disaster recovery, and high availability designs
Nice to Have:
- Red Hat, Kubernetes, or cloud certifications (e.g., RHCA, CKA, Azure DevOps Expert)
- Experience with GPU-enabled Kubernetes clusters
- Familiarity with DNS, image registries, or service mesh technologies
- Exposure to hybrid infrastructure environments
- Understanding of DevSecOps and compliance standards
Tags: Ansible AWS Azure Bitbucket CI/CD Data management Data pipelines DevOps Elasticsearch ELK Engineering GCP Generative AI GPU Grafana Helm Jenkins Kibana Kubernetes Linux Machine Learning MLOps Pipelines Python Security Terraform Unstructured data