AI/ML Operations

Pune, Maharashtra, India

InfraCloud

InfraCloud helps companies build GPU Cloud and modernize their applications and infrastructure, drawing on our expertise in cloud native technologies.

Location: Pune, Maharashtra, India

Job Overview:

We are seeking an MLOps Engineer to support the development, deployment, and operationalization of AI-based models. The ideal candidate will have strong experience in both machine learning and DevOps, with a deep understanding of model deployment, monitoring, and scaling. In this role, you will be responsible for migrating our AI-based application from a cloud-hosted solution to a self-hosted model, ensuring robust model integration, and managing the lifecycle of machine learning models, from development and training to deployment and monitoring.

Key Responsibilities:

  1. Model Development & Training:

    • Develop and implement efficient pipelines for training, validating, and fine-tuning machine learning models.

    • Ensure that models are trained on relevant datasets and can handle a wide variety of use cases.

  2. Deployment and Infrastructure:

    • Design and implement the architecture required for self-hosted AI models, both on-premises and in the cloud, depending on project requirements.

    • Work closely with cloud infrastructure teams (AWS, GCP, Azure) to deploy and scale models in a production environment.

    • Set up automated deployment pipelines for the continuous integration and delivery (CI/CD) of machine learning models.

  3. Model Monitoring and Performance Tuning:

    • Monitor model performance and identify areas for improvement, including latency, resource consumption, and prediction accuracy.

    • Develop strategies for model retraining, continuous learning, and tuning to maintain the best performance over time.

    • Implement logging and monitoring systems to track model health and trigger alerts for potential issues.

  4. Integration & Automation:

    • Integrate AI models with the existing platform, ensuring smooth interaction between models and front-end systems.

    • Automate the end-to-end ML pipeline, from data collection and preprocessing to model deployment and feedback loops.

    • Collaborate with DevOps and infrastructure teams to ensure smooth operation of models in production environments.

  5. Collaboration & Documentation:

    • Document model architectures, training processes, deployment pipelines, and system configurations for internal use.

    • Provide technical support and troubleshooting for model-related issues during integration or in production.

  6. Security and Compliance:

    • Ensure compliance with data privacy and security regulations in the AI model deployment process.

    • Implement robust access controls, encryption, and audit logging for model and data access.

Qualifications:

  • Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field.

  • Experience:

    • 3+ years of experience in MLOps, with a focus on machine learning model deployment and management.

    • Hands-on experience with both on-premises and cloud-based AI model deployment (AWS, GCP, Azure).

    • Experience with containerization technologies (Docker, Kubernetes) for model deployment and scaling.

    • Familiarity with continuous integration/continuous delivery (CI/CD) tools (e.g., Jenkins, GitLab CI, CircleCI).

    • Experience with monitoring tools (e.g., Prometheus, Grafana) and with managing model performance at scale.



Perks/benefits: Career development

Region: Asia/Pacific
Country: India
