Senior Machine Learning OPS
Remote
Job Title: Senior Machine Learning Operations Engineer
Location: Remote in any Latin American country
Shift: Day shift
Experience: 5+ years
Job description
Job Overview:
We are seeking a highly skilled and experienced Senior Machine Learning Operations Engineer to join our team. In this role, you will work at the intersection of data science, engineering, and operations, ensuring the seamless deployment, monitoring, and scaling of machine learning models in production. You will collaborate closely with data scientists, engineers, and product teams to build and maintain robust MLOps pipelines, optimize model performance, and drive automation for model lifecycle management.
Key Responsibilities:
- Model Deployment & Automation: Design, implement, and manage scalable MLOps pipelines for continuous deployment of machine learning models into production environments.
- Model Monitoring & Optimization: Develop and maintain real-time monitoring systems to track the performance and health of machine learning models in production. Implement automation for model retraining, versioning, and rollback as needed.
- Infrastructure Management: Architect and manage the infrastructure required to deploy machine learning models at scale, ensuring efficient resource usage and high availability.
- Collaboration with Cross-functional Teams: Work closely with data scientists, software engineers, and business stakeholders to translate machine learning models into operational systems, ensuring smooth transitions from development to production.
- CI/CD Integration: Build and optimize Continuous Integration/Continuous Deployment (CI/CD) pipelines for machine learning applications, ensuring smooth and efficient updates to production environments.
- Model Performance Tuning: Lead efforts to optimize and fine-tune models for better performance, scalability, and cost efficiency in production.
- Documentation & Knowledge Sharing: Create and maintain clear documentation on model deployment pipelines, infrastructure, and best practices. Share knowledge with the team and foster a culture of collaboration and learning.
- Research & Implementation of New Tools: Stay up to date with the latest tools, frameworks, and technologies in the MLOps ecosystem, and drive innovation to enhance the operational capabilities of machine learning models.
Requirements:
- Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field. Relevant work experience may substitute for educational qualifications.
- Experience:
  - 5+ years of relevant experience in MLOps (e.g., CI/CD for ML models, deployment strategies, infrastructure automation).
  - Expertise in containerization, cloud, and ML framework tooling (e.g., Kubernetes, Docker, AWS/GCP/Azure, TensorFlow, PyTorch).
  - Understanding of ML workflows and lifecycle management.
  - Experience with scalable, production-grade ML systems.
- Soft Skills:
  - Strong problem-solving and analytical skills.
  - Excellent communication skills with the ability to collaborate with cross-functional teams.
  - Leadership experience or a proven track record of mentoring junior engineers is a plus.
What We Offer:
- Competitive salary and benefits package.
- Opportunities for professional growth and development.
- A collaborative, innovative, and inclusive work environment.
- The chance to work with cutting-edge technologies and solve complex challenges in the ML space.