Senior Manager - Data Science
Mumbai - Hiranandani, India
Marsh McLennan
Marsh McLennan is the world’s leading professional services firm in risk, strategy and people. We bring together experts from across our four global businesses (Marsh, Guy Carpenter, Mercer and Oliver Wyman) to help make organizations more...
Company:
Marsh
Description:
We are seeking a talented individual to join our Data Science team at Marsh. This role will be based in Mumbai. This is a hybrid role that requires working in the office at least three days a week.
Senior Manager - Data Science and Automation
We will count on you to:
- Identify opportunities that add value to the business and make processes more efficient.
- Invest in understanding the core business, including products, processes, documents, and data points, with the objective of identifying efficiency and value-addition opportunities.
- Design and develop end-to-end NLP/LLM solutions for document parsing, information extraction, and summarization from PDFs and scanned text.
- Develop AI applications to automate manual and repetitive tasks using generative AI and machine learning.
- Fine-tune open-source LLMs (like LLaMA, Mistral, Falcon, or similar) or build custom pipelines using APIs (OpenAI, Anthropic, Azure OpenAI).
- Build custom extraction logic using tools like LangChain, Haystack, and Hugging Face Transformers, and OCR tools like Tesseract or Azure Form Recognizer.
- Create pipelines to convert outputs into formatted Microsoft Word or PDF files using libraries like docx, PDFKit, ReportLab, or LaTeX.
- Collaborate with data engineers and software developers to integrate AI models into production workflows.
- Ensure model performance, accuracy, scalability, and cost-efficiency across business use cases.
- Stay updated with the latest advancements in generative AI, LLMs, and NLP research to identify innovative solutions.
- Design, develop, and maintain robust data pipelines for extracting, transforming, and loading (ETL) data from diverse sources.
- As operations scale up, design and implement scalable data storage solutions and integrate them with existing systems.
- Utilize cloud platforms (AWS, Azure, Google Cloud) for data storage and processing.
- Conduct code reviews and provide mentorship to junior developers.
- Stay up to date with the latest technology trends and best practices in data engineering and cloud services.
- Lead initiatives and deliver results by engaging with cross-functional teams and resolving data ambiguity issues.
- Be responsible for the professional development of your projects and institute a succession plan.
What you need to have:
- Bachelor's degree in Engineering, Analytics, Computer Applications, IT, Business Analytics, or a related field (or any other discipline), or an MBA.
- 8-12 years of proven experience in Python development.
- Hands-on experience with frameworks and libraries like Transformers, LangChain, PyTorch/TensorFlow, spaCy, Hugging Face, and Haystack.
- Strong expertise in document parsing, OCR (Tesseract, AWS Textract, Azure Form Recognizer), and entity extraction.
- Proficiency in Python and familiarity with cloud-based environments (Azure, AWS, GCP).
- Experience deploying models as APIs/microservices using FastAPI, Flask, or similar.
- Familiarity with PDF parsing libraries (PDFMiner, PyMuPDF, Apache PDFBox) and Word and PDF generation libraries (python-docx, PDFKit).
- Solid understanding of prompt engineering and prompt-tuning techniques.
- Proven experience with data automation and building data pipelines.
- Proven track record in building and maintaining data pipelines and ETL processes.
- Strong knowledge of Python libraries such as Pandas, NumPy, PySpark, and Camelot.
- Familiarity with database management systems (SQL and NoSQL databases).
- Experience in designing and implementing system architecture.
- Ability to operate in a multi-layered technology architecture and shape the technology maturity of the organization.
- Solid understanding of software development best practices, including version control (Git), code reviews, and testing frameworks (PyTest, UnitTest).
- Strong attention to detail and ability to work with complex data sets.
- Effective communication skills to present findings and insights to both technical and non-technical stakeholders, including superior listening, verbal, and written communication skills.
- Excellent project management and organization skills.
- Superlative stakeholder management skills – ability to positively influence stakeholders.
- Synthesis skills: the ability to connect the dots and answer the business question.
- Excellent problem-solving, structuring and critical-thinking skills.
- Ability to work independently and collaboratively in a fast-paced environment.
What makes you stand out?
- Master’s degree in Computer Science, Engineering, or related fields.
- Experience in working with large-scale data sets and real-time data processing.
- Familiarity with additional programming languages like Java, C++, or R.
- Strong problem-solving skills and ability to work in a fast-paced environment.
Why join our team:
- We help you be your best through professional development opportunities, interesting work and supportive leaders.
- We foster a vibrant and inclusive culture where you can work with talented colleagues to create new solutions and have impact for colleagues, clients and communities.
- Our scale enables us to provide a range of career opportunities, as well as benefits and rewards to enhance your well-being.
Tags: Amazon Textract Anthropic APIs Architecture AWS Azure Business Analytics Computer Science Data pipelines Engineering ETL FastAPI Flask GCP Generative AI Git Google Cloud Haystack Java LangChain LLaMA LLMs Machine Learning Microservices NLP NoSQL NumPy OCR OpenAI Open Source Pandas Pipelines Prompt engineering PySpark Python PyTorch R Research spaCy SQL TensorFlow Testing Transformers
Perks/benefits: Career development Flex hours Insurance