Senior Data and Applied Scientist

Hyderabad, Telangana, India

Full Time Senior-level / Expert USD 93K - 174K *

Microsoft

Discover Microsoft products and services for your home or business. Shop Microsoft 365, Copilot, Teams, Xbox, Windows, Azure, Surface, and more.

Posted 18 hours ago

Microsoft is a company where passionate innovators come to collaborate, envision what can be, and take their careers further. This is a world of more possibilities, more innovation, more openness, and sky-is-the-limit thinking in a cloud-enabled world.

The AI Platform organization at Microsoft builds the end-to-end Azure AI stack/PaaS and is core to Azure’s innovation and differentiation, as well as to all of Microsoft’s flagship products, from Office to Teams to Xbox. We are the team building Azure OpenAI, Azure ML, Cognitive Services, and the global Azure AI infrastructure for running the largest AI workloads on the planet.

Within the AI Platform, our Evaluation AI team specializes in building evaluation frameworks for cutting-edge deep learning models, including Large Language Models, Small Language Models, RAG, fine-tuned, and distilled models across NLP, vision, multimodal, Copilot, and agentic frameworks. We build the next-generation model evaluation platform for generative applications, leveraging state-of-the-art (SOTA) open-source (OSS) and OpenAI models.

We are looking for a passionate, creative, analytical Data Scientist who loves NLP and deep learning and wants to ship products quickly at massive scale. We will provide many opportunities for you to learn, grow, and contribute.

Responsibilities

Design and build sophisticated evaluation frameworks for advanced deep learning models, including LLMs, SLMs, RAG, fine-tuned, and distilled models, and for Generative AI applications such as copilots and agentic frameworks
Research, implement, and refine AI evaluation frameworks to assess model performance across diverse metrics, such as quality, robustness, fairness, and efficiency
Conduct extensive experiments to evaluate model performance, robustness, and generalization capabilities, applying advanced prompt engineering techniques to enhance outcomes
Collaborate with cross-functional teams, including researchers, data scientists, software engineers, and product managers
Work with large-scale datasets, ensuring proper preprocessing, feature selection, and data representation
Develop, evaluate, and fine-tune large language models (LLMs) using transformer-based architectures and state-of-the-art Generative AI techniques
Design and build end-to-end ML pipelines covering model training, data analysis, model serving, and model evaluation
Drive innovation by implementing novel techniques from published literature and industry best practices to enhance evaluation capabilities for the Azure Evaluation platform
Embody our culture and values

Qualifications

Required Qualifications:

Master's degree with 7+ years of experience, or Bachelor's degree with 9+ years of experience, or 10+ years of relevant experience in Computer Science, Computer Engineering, Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, or a related field
Proficiency in Generative AI, large language modeling, and natural language processing / computer vision / multimodal analysis / deep learning / machine learning
Proven experience in managing structured and unstructured data, leveraging complex datasets, applying advanced statistical techniques, developing sophisticated algorithms, and executing large-scale A/B testing
Proficiency in one or more programming or scripting languages, such as Python, R, or C#, to develop production-grade products
Proficiency in open-source frameworks such as TensorFlow and PyTorch, and in transformer-based and diffusion-based models (e.g., BERT, GPT, T5, Llama, Stable Diffusion) and LLMs
Experience with creating metrics, analyzing and predicting trends, and assessing experimentation results
Experience with fine-tuning, distillation, or RAG-based AI models on large datasets
Experience in building ML pipelines covering model training, data analysis, model serving, and model evaluation
Deep understanding of statistics, linear algebra, and probability theory
Excellent verbal and written communication skills and ability to work independently and collaboratively

Preferred Qualifications:

Publication(s) in top-tier conferences or journals in related fields (e.g., ACL, EMNLP, CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI)
Familiarity with cloud platforms (e.g., Azure, AWS)

#IDCAIPlatformHiring

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Category: Data Science Jobs

Tags: A/B testing Architecture AWS Azure BERT Computer Science Computer Vision Data analysis Deep Learning Econometrics Economics EMNLP Engineering Generative AI GPT ICLR ICML Linear algebra LLaMA LLMs Machine Learning Mathematics ML infrastructure Model training NeurIPS NLP OpenAI Open Source Pipelines Probability theory Prompt engineering Python PyTorch R RAG Research Security Stable Diffusion Statistics TensorFlow Testing Unstructured data