Lead Data Scientist

Bengaluru

About Team

The Myntra Data Science team is at the forefront of innovation, delivering cutting-edge solutions that drive significant revenue and enhance customer experiences across various touchpoints. Our models impact millions of customers, leveraging real-time, near-real-time, and offline solutions with diverse latency requirements. These models are built on massive datasets, allowing for deep learning and growth opportunities within a rapidly expanding organization. By joining our team, you'll gain hands-on experience with an extensive e-commerce platform, learning to develop models that handle millions of requests per second with sub-second latency.

We take pride in deploying solutions that not only utilize state-of-the-art machine learning techniques—such as graph neural networks, diffusion models, transformers, representation learning, optimization methods, and Bayesian modeling—but also contribute to the research community with multiple peer-reviewed publications.

Roles and Responsibilities

Design, Develop, and Deploy: Develop, deploy, and maintain machine learning models that are not only theoretically sound but also practical and scalable. Our team places a strong emphasis on rapid, trustworthy experimentation for validating models and features. 

Model Maintenance: Design and build machine learning pipelines optimized for scalability, ensuring seamless model training, evaluation, and deployment. Monitor the performance of machine learning models in real time using statistical methods, ensuring their efficiency and effectiveness. Implement and manage real-time data systems to handle large data streams efficiently.

Technical Expertise: Conduct R&D in innovative techniques such as Recommender Systems, Computer Vision, NLP, Generative AI and Causal Inference, pushing the boundaries of practical machine learning applications.

Software Development: Develop robust, scalable, and maintainable software solutions for seamless model deployment.

CI/CD Pipelines: Set up and manage Continuous Integration/Continuous Deployment (CI/CD) pipelines for automated testing, deployment, and model integration.

Collaboration: Work closely with the Product Managers, Platforms and Engineering teams to ensure smooth deployment and integration of ML models into Myntra production systems. 

Data Management: Utilize big data technologies and data lakes to preprocess and shape raw data for machine learning applications.

Code Quality: Write clean, efficient, and maintainable code following best practices.

Performance Optimization: Conduct performance testing, troubleshooting, and tuning to ensure optimal model performance.

Continuous Learning: Stay up-to-date with the latest advancements in machine learning and technology, sharing insights and knowledge across the organization.

Experience

  • Industry Experience: Master's degree in a related technical field with 4+ years of relevant industry experience, or Bachelor's degree in Computer Science, Data Science, Machine Learning, Statistics, or a related technical field with 6+ years of relevant industry experience

OR

Ph.D. in a related field with a thesis in a domain relevant to Myntra's needs (e.g., Recommender Systems, Natural Language Processing).

  • Machine Learning Expertise: At least 4 years of hands-on experience as a Machine Learning Engineer or a similar role. Solid understanding of statistics, particularly as it applies to machine learning, including probability theory, hypothesis testing, and statistical inference.
  • Production Deployment: Proven track record of implementing and scaling machine learning models and pipelines in a production environment.
  • Programming Skills: Strong proficiency in Python or equivalent programming languages for model development.
  • ML Frameworks: Familiarity with leading machine learning frameworks (Keras, TensorFlow, PyTorch) and libraries (scikit-learn).
  • CI/CD Tools: Experience with CI/CD tools and practices.
  • Communication: Excellent verbal and written communication skills.
  • Teamwork & Independence: Ability to work collaboratively in a team environment or independently as needed, and to mentor team members on the technical design and deployment of ML pipelines and services.
  • Workload Management: Strong organizational skills to manage and prioritize tasks, supporting your manager effectively. 

Preferred Qualifications

  • Strong emphasis on rapid, trustworthy experimentation for validating machine learning models and hypotheses.
  • Hands-on experience with Search and Recommender Systems, Computer Vision, or Forecasting is strongly desired. We value candidates who emphasize practical implementation and scaling of machine learning solutions.
  • Experience with real-time systems and databases like Kafka, Cassandra, Vector Databases, or Bigtable is highly valued.
  • Prior experience with Generative AI techniques earns brownie points.
  • Advanced understanding of and experience in Causal Inference earn a lot of brownie points.
  • Strong communication skills, especially in conveying complex technical and statistical concepts to a non-technical audience.
  • Experience with big data technologies like Spark or other distributed computing frameworks.

We encourage exceptional candidates to apply even if they don't meet every listed qualification. We're open to hiring individuals who demonstrate outstanding potential.

Nice to Have

  • Research Contributions: Publications or presentations in recognized Machine Learning and Data Science journals/conferences.
  • Cloud Services: Proficiency in cloud platforms (AWS, Google Cloud) and an understanding of distributed systems.
  • Generative AI Exposure: Familiarity with Generative AI models.
  • Database Management: Experience with SQL and/or NoSQL databases.
  • ML Orchestration: Knowledge of ML orchestration tools (Airflow, Kubeflow, MLflow).


Perks/benefits: Career development Conferences

Region: Asia/Pacific
Country: India
