MLOps Engineer

Santa Clara, CA, United States

Applied Materials

Applied Materials’ IT organization has a long-standing reputation as a great place to work. The IT team has been recognized as one of Computerworld's 100 Best Places to Work in IT nine times. In addition, numerous Applied IT leaders have been honored as CIO Magazine's Ones to Watch or Computerworld Premier 100 IT Leaders.

Role Overview: As an MLOps Engineer, you will be responsible for ensuring the smooth operation of ML pipelines, from model development to deployment and monitoring. You will work across the full lifecycle of ML systems, including CI/CD, model versioning, orchestration, performance tuning, and automation. You will collaborate with cross-functional teams to design and implement scalable, reliable, and efficient ML infrastructure solutions that enable rapid experimentation and deployment of machine learning models.

 

Key Responsibilities:

 

Act as a liaison with a sub-group within a business unit or a GIS Domain area for business and MLOps technology strategy alignment, solution discovery, service management, and project portfolio management.

Analyze Business Requirements:

  • Convert business requirements and/or issues into functional and technical specifications.

  • Assist in the design and technical development of complex MLOps solutions to meet business needs.

  • Perform and document application and platform configuration.

  • Prepare and execute test scenarios and scripts (unit, integration, performance, regression, acceptance) and data integration.

  • Participate in new technology evaluations.

Adhere to GIS Processes:

  • Guide junior staff and contingent workers to adhere to GIS project management, software application development, testing, service management, change management, RCA, and other relevant processes, standards, governance, and controls.

  • Manage the execution of SOX controls and testing, and support internal and external audits.

Plan and Manage MLOps Projects:

  • Plan and manage small to medium-sized MLOps projects to ensure effective and efficient execution in line with guardrails of scope, timeline, budget, and quality.

  • Serve as an MLOps team lead on medium to large cross-functional application processes.

Oversee Contingent Workers:

  • Manage contingent workers performing MLOps project and/or support services.

  • Handle the selection, onboarding, and offboarding of contingent workers in a timely manner.

  • Manage contingent worker project/task assignments and ensure work product quality.

  • Approve contingent worker timesheets/costs.

Model Deployment & Orchestration:

  • Collaborate with Data Scientists to deploy machine learning models into production environments.

  • Design and implement CI/CD pipelines to automate the training, validation, and deployment of models.

  • Ensure seamless integration of models with backend systems and cloud infrastructure.

Infrastructure Management:

  • Build and maintain scalable infrastructure for ML workflows using cloud platforms (AWS/GCP/Azure).

  • Manage containerized environments (Docker, Kubernetes) for model deployment and scaling.

  • Optimize model serving environments for low-latency and high-availability needs.

Monitoring & Optimization:

  • Implement and maintain monitoring and logging systems to track model performance and identify issues in real time.

  • Ensure model performance is aligned with business goals and continually improve model retraining cycles.

  • Implement auto-scaling and fault-tolerant mechanisms to ensure high availability of ML services in production.

Collaboration & Communication:

  • Work closely with data scientists, software engineers, and product teams to ensure alignment on model requirements, performance, and deployment strategy.

  • Provide guidance and best practices for maintaining model quality and optimizing deployment processes.

  • Assist in creating documentation for the deployment pipeline, model versioning, and infrastructure.

Automation & Tooling:

  • Develop scripts and tools to automate repetitive tasks within the ML lifecycle (data collection, preprocessing, retraining).

Security and Compliance:

  • Implement security best practices for data privacy, access control, and model integrity in production environments.

  • Ensure compliance with relevant industry regulations for ML operations.

 

Qualifications and Experience:

 

  • Technical Skills:

    • Proficient in Python or other scripting languages.

    • Experience with ML model deployment frameworks such as TensorFlow Serving or custom REST APIs.

    • Strong knowledge of containerization technologies (Docker, Kubernetes) and cloud platforms (AWS, GCP, Azure).

    • Familiarity with CI/CD tools and version control systems (Git).

    • Understanding of ML lifecycle tools (Kubeflow, MLflow) is a plus.

    • Experience with monitoring, logging, and alerting tools (Prometheus, Grafana, Datadog).

 

  • Experience:

    • 4+ years of experience working in MLOps, DevOps, or related roles in a production environment.

    • Demonstrated experience deploying and maintaining machine learning models at scale in production.

    • Knowledge of model performance monitoring, A/B testing, and model retraining strategies.

  • Education:

    • Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field, or equivalent practical experience.

    • Relevant certifications (e.g., AWS Certified Machine Learning, Google Cloud Professional Machine Learning Engineer) are a plus.

 

Functional Knowledge:

  • Demonstrates conceptual and practical expertise in MLOps and basic knowledge of related disciplines.

Business Expertise:

  • Has knowledge of best practices and how own area integrates with others.

  • Is aware of the competition and the factors that differentiate them in the market.

Leadership:

  • Acts as a resource for colleagues with less experience.

  • May lead small projects with manageable risks and resource requirements.

Problem Solving:

  • Solves complex problems.

  • Takes a new perspective on existing solutions.

  • Exercises judgment based on the analysis of multiple sources of information.

Impact:

  • Impacts a range of customer, operational, project, or service activities within own team and other related teams.

  • Works within broad guidelines and policies.

Interpersonal Skills:

  • Explains difficult or sensitive information.

  • Works to build consensus.

Qualifications

Education:

Bachelor's Degree (Required)

Skills:

DevOps, Machine Learning Operations, Performance Modeling

Certifications:

Languages:

Years of Experience:

4 - 7 Years

Work Experience:

Additional Information

Time Type:

Full time

Employee Type:

Assignee / Regular

Travel:

Yes, 10% of the Time

Relocation Eligible:

Yes

U.S. Salary Range:

$152,000.00 - $209,000.00

The salary offered to a selected candidate will be based on multiple factors including location, hire grade, job-related knowledge, skills, experience, and with consideration of internal equity of our current team members. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation such as participation in a bonus and a stock award program, as applicable.

For all sales roles, the posted salary range is the Target Total Cash (TTC) range for the role, which is the sum of base salary and target bonus amount at 100% goal achievement.

Applied Materials is an Equal Opportunity Employer committed to diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, color, national origin, citizenship, ancestry, religion, creed, sex, sexual orientation, gender identity, age, disability, veteran or military status, or any other basis prohibited by law. 
