Full Stack Data Engineer
Bengaluru Luxor North Tower, India
GSK
At GSK, we unite science, technology and talent to get ahead of disease together.
About Us: GSK is one of the world's pre-eminent pharmaceutical and healthcare companies. We take pride in leading a healthcare revolution. Through disruptive approaches in R&D and commercial business processes, our Decision Science & AI team applies advanced capabilities to drive innovation, augment decision making, and allow us to better serve our patients, healthcare professionals, and consumers.
We are seeking a highly skilled and experienced full stack data engineer to join our team to help support the US Specialty Business Unit. If you are passionate about leveraging cutting-edge cloud technologies and data engineering best practices to drive insights in the healthcare sector, we encourage you to apply.
Responsibilities
Design & Build Robust Data Pipelines: Develop, implement, and maintain end-to-end data pipelines that efficiently ingest, process, and transform large-scale pharmaceutical datasets, with a focus on anonymized patient-level datasets (APLD) and prescriber/account level datasets (e.g. Xponent, Drug Distribution Data).
Data Integration & ETL Processes: Collaborate with cross-functional teams to architect and optimize ETL processes, ensuring seamless integration of various data sources into centralized Azure environments (e.g., Azure Data Lake Gen2, Azure Data Factory).
Cloud & Microservices Implementation: Leverage Azure Cloud services to build scalable, secure, and high-performance data solutions. Develop RESTful APIs and microservices to facilitate data access and support business intelligence needs.
Quality Assurance & Data Governance: Ensure data accuracy, consistency, and security through rigorous testing, monitoring, and adherence to data governance standards. Collaborate with data governance teams to implement and maintain data cataloging practices.
Collaboration & Continuous Improvement: Work closely with data scientists and Insights & Analytics teams to support advanced analytics and machine learning initiatives. Participate in code reviews, CI/CD pipeline development, and continuous process improvements to enhance overall system performance.
Basic Qualifications
Experience: 5-7 years of recent, hands-on experience in data engineering or a related field, with a proven track record of building robust data pipelines.
Education: Bachelor’s Degree (or higher) in Computer Science, Engineering, or a related discipline.
Technical Proficiency:
Demonstrated experience with distributed computing platforms (e.g., Databricks) to process and analyze large datasets.
Proficient in programming languages and frameworks such as Python, SQL, and PySpark.
Strong experience with Azure Cloud environments, including Azure Data Lake Gen2, Azure Data Factory, and SQL Data Warehouse.
Solid understanding of ETL processes and data integration best practices.
Domain Experience: Understanding of pharmaceutical industry datasets (e.g., APLD) or similar large-scale healthcare datasets (e.g., Electronic Health Records).
Development Practices: Experience with microservices architecture, REST API development, version control (Git), and CI/CD pipelines (e.g., Azure DevOps).
Problem-Solving & Collaboration: Demonstrated ability to troubleshoot complex data issues, optimize system performance, and work effectively within a team environment.
Preferred Qualifications
Industry Expertise: Prior experience within the pharmaceutical or healthcare industry is highly desirable.
Data Visualization & Reporting: Exposure to traditional visualization/reporting tools (e.g., Power BI, Tableau) and GenAI tools (e.g. Databricks Genie) to support data-driven decision-making. Experience with developing semantic data models for visualization tools is a plus.
Machine Learning & MLOps: Basic understanding of machine learning workflows and experience with ML pipelines or MLOps practices is a plus.
Additional Data Technologies: Knowledge of other data storage and processing technologies (e.g., Vector databases, Cosmos DB, Redis, Elasticsearch) is beneficial, though not the primary focus.
Cloud & Container Technologies: Experience with container orchestration platforms (e.g., Kubernetes) and additional cloud services to further enhance scalable deployments is beneficial, though not the primary focus.
As an employer committed to Inclusion, we encourage you to reach out if you need any adjustments during the recruitment process. Please contact our Recruitment Team at IN.recruitment-adjustments@gsk.com to discuss your needs.
Why GSK?
Uniting science, technology and talent to get ahead of disease together.
GSK is a global biopharma company with a special purpose – to unite science, technology and talent to get ahead of disease together – so we can positively impact the health of billions of people and deliver stronger, more sustainable shareholder returns – as an organisation where people can thrive. We prevent and treat disease with vaccines, specialty and general medicines. We focus on the science of the immune system and the use of new platform and data technologies, investing in four core therapeutic areas (infectious diseases, HIV, respiratory/immunology and oncology).
Our success absolutely depends on our people. While getting ahead of disease together is about our ambition for patients and shareholders, it’s also about making GSK a place where people can thrive. We want GSK to be a place where people feel inspired, encouraged and challenged to be the best they can be. A place where they can be themselves – feeling welcome, valued, and included. Where they can keep growing and look after their wellbeing. So, if you share our ambition, join us at this exciting moment in our journey to get Ahead Together.
Important notice to Employment businesses/Agencies
GSK does not accept referrals from employment businesses and/or employment agencies in respect of the vacancies posted on this site. All employment businesses/agencies are required to contact GSK's commercial and general procurement/human resources department to obtain prior written authorization before referring any candidates to GSK. The obtaining of prior written authorization is a condition precedent to any agreement (verbal or written) between the employment business/agency and GSK. In the absence of such written authorization being obtained, any actions undertaken by the employment business/agency shall be deemed to have been performed without the consent or contractual agreement of GSK. GSK shall therefore not be liable for any fees arising from such actions or any fees arising from any referrals by employment businesses/agencies in respect of the vacancies posted on this site.
It has come to our attention that the names of GlaxoSmithKline or GSK or our group companies are being used in connection with bogus job advertisements or through unsolicited emails asking candidates to make payments for recruitment opportunities and interviews. Please be advised that such advertisements and emails are not connected with the GlaxoSmithKline group in any way.
GlaxoSmithKline does not charge any fee whatsoever for the recruitment process. Please do not make payments to any individuals / entities in connection with recruitment with any GlaxoSmithKline (or GSK) group company at any worldwide location, even if they claim that the money is refundable.
If you come across unsolicited email from email addresses not ending in gsk.com or job advertisements which state that you should contact an email address that does not end in “gsk.com”, you should disregard the same and inform us by emailing askus@gsk.com, so that we can confirm to you if the job is genuine.
Perks/benefits: Career development