Senior Software Engineer – Research Data (RAG Engineer)
Hyderabad
Bristol Myers Squibb
Bristol Myers Squibb is a global biopharmaceutical company committed to discovering, developing and delivering innovative medicines to patients with serious diseases.
Working with Us
Challenging. Meaningful. Life-changing. Those aren’t words that are usually associated with a job. But working at Bristol Myers Squibb is anything but usual. Here, uniquely interesting work happens every day, in every department. From optimizing a production line to the latest breakthroughs in cell therapy, this is work that transforms the lives of patients, and the careers of those who do it. You’ll get the chance to grow and thrive through opportunities uncommon in scale and scope, alongside high-achieving teams rich in diversity. Take your career farther than you thought possible.
Bristol Myers Squibb recognizes the importance of balance and flexibility in our work environment. We offer a wide variety of competitive benefits, services and programs that provide our employees with the resources to pursue their goals, both at work and in their personal lives. Read more: careers.bms.com/working-with-us.
Overview: Join our Semantic Data Products team within the Research Data group and become a pivotal force in revolutionizing our understanding and utilization of research data to drive drug discovery. As a RAG Engineer, you’ll be at the intersection of advanced AI and domain expertise, employing retrieval-augmented generation and semantic modeling principles to create meaningful connections within our research data platform. This role is instrumental in transforming raw data into actionable insights, accelerating scientific breakthroughs through intelligent data integration and querying.
Your Impact in Drug Discovery: In this role, you will directly contribute to the enhancement of our research data products, enriching the platform with data that empowers machine learning models and supports evidence-based decision-making in drug discovery. Your work will enable the development of RAG-powered data products, providing deep insights and actionable knowledge to support and speed up our journey to new scientific advancements.
Key Responsibilities:
- Develop RAG workflows, including prompt engineering and retrieval models, to answer R&D High Value Questions (HVQ) through the creation of advanced semantic data models.
- Design and support ETL processes on AWS to align data from multiple sources with unified data models, leveraging tools like Glue, Athena, and DataZone.
- Innovate and implement methods for validating data products against smart data contracts, ensuring quality and consistency.
- Build techniques to enable self-service querying and derive insights from our data products, making complex information accessible to researchers.
- Contribute methods that facilitate and automate normalization, linking, and augmentation of R&D data, ensuring a smooth data flow for research use cases.
- Expand expertise in collaborating with life sciences and early discovery stakeholders to align data products with research goals.
- Prototype and leverage AI agents to automate tasks such as biomarker extraction from research papers, information gathering from internal and external documents, and summarizing key research concepts.
Qualifications:
- Strong expertise in RAG workflows and data consumption/data exposure using AWS tools like Glue, Athena, and DataZone.
- Solid understanding of findable, accessible, interoperable, and reusable (FAIR) data principles.
- Knowledge of semantic data theory and practical applications, with experience in or willingness to learn RDF/SPARQL/SKOS/Knowledge Graphs.
- Proficiency in Python for automation, data processing, and natural language processing (NLP).
- Excellent problem-solving and analytical skills, with an ability to work independently and collaboratively.
- Strong communication skills, both written and verbal, with a passion for continuous learning.
Preferred Experience:
- Previous experience in a life sciences environment or drug discovery setting is advantageous.
- Familiarity with semantic data modeling tools (e.g., TopBraid EDG, SciBite) and visualization tools (e.g., GraphDB, relevant Python libraries).
Join our team and make a transformative impact on the future of scientific research and drug discovery. Your expertise in RAG and data engineering will unlock new frontiers in data analysis and semantic data understanding.
If you come across a role that intrigues you but doesn’t perfectly line up with your resume, we encourage you to apply anyway. You could be one step away from work that will transform your life and career.
Uniquely Interesting Work, Life-changing Careers
With a single vision as inspiring as “Transforming patients’ lives through science™”, every BMS employee plays an integral role in work that goes far beyond ordinary. Each of us is empowered to apply our individual talents and unique perspectives in an inclusive culture, promoting diversity in clinical trials, while our shared values of passion, innovation, urgency, accountability, inclusion and integrity bring out the highest potential of each of our colleagues.
On-site Protocol
BMS has a diverse occupancy structure that determines where an employee is required to conduct their work. This structure includes site-essential, site-by-design, field-based and remote-by-design jobs. The occupancy type that you are assigned is determined by the nature and responsibilities of your role:
Site-essential roles require 100% of shifts onsite at your assigned facility. Site-by-design roles may be eligible for a hybrid work model with at least 50% onsite at your assigned facility. For these roles, onsite presence is considered an essential job function and is critical to collaboration, innovation, productivity, and a positive Company culture. For field-based and remote-by-design roles, the ability to physically travel to visit customers, patients or business partners and to attend meetings on behalf of BMS as directed is an essential job function.
BMS is dedicated to ensuring that people with disabilities can excel through a transparent recruitment process, reasonable workplace accommodations/adjustments and ongoing support in their roles. Applicants can request a reasonable workplace accommodation/adjustment prior to accepting a job offer. If you require reasonable accommodations/adjustments in completing this application, or in any part of the recruitment process, direct your inquiries to adastaffingsupport@bms.com. Visit careers.bms.com/eeo-accessibility to access our complete Equal Employment Opportunity statement.
BMS cares about your well-being and the well-being of our staff, customers, patients, and communities. As a result, the Company strongly recommends that all employees be fully vaccinated for Covid-19 and keep up to date with Covid-19 boosters.
BMS will consider for employment qualified applicants with arrest and conviction records, pursuant to applicable laws in your area.
If you live in or expect to work from Los Angeles County if hired for this position, please visit this page for important additional information: https://careers.bms.com/california-residents/
Any data processed in connection with role applications will be treated in accordance with applicable data privacy policies and regulations.