Staff Backend Engineer – AI Algorithm Platform
Israel
Cloudinary
Streamline media management and improve user experience by automatically delivering images and videos, enhanced and optimized for every user.
Cloudinary empowers companies to deliver exceptional digital experiences by managing the entire media lifecycle at scale. Within Cloudinary's R&D, the Research Group leads the development of cutting-edge algorithms for media understanding, generation, and optimization.
We are seeking an experienced Staff Backend Engineer to lead the engineering efforts behind our homegrown platform for serving and operating production-grade AI models and AI-based algorithms. This is a mission-critical role for someone passionate about building highly scalable, GPU-aware, cloud-native systems that act as the connective tissue between algorithm research and product innovation. You will play a pivotal part in redesigning and evolving the platform, while supporting both research and application teams across the organization and contributing to MLOps initiatives.
Key Responsibilities
- Own the architecture, stability, scalability, and performance of the system.
- Design and implement platform features that support both synchronous low-latency and asynchronous compute-heavy algorithm execution.
- Enhance GPU management, scheduling, and resource allocation for optimal performance and cost-efficiency.
- Ensure robust Kubernetes-based deployment and observability for a highly dynamic system.
Cross-Team Collaboration
- Act as the technical bridge between Research and Application teams by translating requirements into scalable system designs.
- Collaborate closely with algorithm developers to streamline model deployment processes.
- Partner with backend engineers (primarily working in Ruby and Go) to integrate the research group algorithms into Cloudinary services.
Engineering Excellence
- Advocate for high standards in code quality, observability, testing, and security.
- Guide engineering integration efforts when consuming the different platform APIs.
- Provide mentorship, support, and best practices to other engineers interacting with the platform.
- Take part in general R&D efforts, supporting a broader production environment.
Platform Extension and MLOps
- Contribute to the evolution of MMS to support a wider range of algorithmic workloads and model types.
- Help shape tooling and infrastructure for model versioning, rollout, monitoring, and testing.
- Collaborate with DevOps and Infrastructure teams to maintain operational excellence, system observability, and robust infrastructure support.
Your Qualifications
- 8+ years of experience in software engineering, with 3+ years working on infrastructure/platforms involving ML/AI, GPU, or data-heavy systems.
- Proficiency in Python and familiarity with backend languages such as Ruby and/or Go.
- Strong understanding of Kubernetes internals and experience running GPU workloads in production environments.
- In-depth knowledge of AWS services.
- Experience architecting systems that support both real-time and asynchronous processing pipelines.
- Familiarity with the ML lifecycle and MLOps practices, including CI/CD for models, monitoring, and rollback strategies.
Bonus Qualifications
- Experience working in research-driven environments or alongside data scientists, algorithm research teams, and ML engineers.
- Contributions to open-source projects related to model serving, Kubernetes operators, or ML platforms.
- Experience supporting systems with diverse user groups across engineering and research disciplines.
Why Join Us?
- Opportunity to build and scale a one-of-a-kind platform powering state-of-the-art media algorithms.
- Collaborate with world-class research, engineering, and product teams.
- Have a direct impact on product experiences used by millions of developers and end-users.
- Be part of a culture that values creativity, autonomy, and continuous improvement.