Senior Data Engineer (m/f/d)

Heidelberg

GSK

At GSK, we unite science, technology and talent to get ahead of disease together

At GSK, we have bold ambitions for patients, aiming to positively impact the health of 2.5 billion people by the end of the decade. R&D is committed to discovering and delivering transformational vaccines and medicines to prevent and change the course of disease. Science and technology are coming together in a way they never have before, and we have strong tech-enabled capabilities that allow us to build a deeper understanding of the patient, human biology and disease mechanisms, and transform medical discovery. We are revolutionising the way we do R&D. We’re uniting science, technology and talent to get ahead of disease together.

Senior Data Engineer (m/f/d)

Delivering the data that matters requires the design and implementation of data flows and data products that leverage internal and external data assets and tools to drive discovery and development. This is a key objective for the Quality Engineering and Lab Data Engineering team within GSK's R&D Tech organisation. There are five key drivers for this approach, which are closely aligned with GSK's corporate priorities of Innovation, Performance and Trust:

  • Automation of end-to-end data flows: faster and more reliable ingestion of data from high-throughput biomedical techniques, such as flow cytometry, sequencing or imaging, to extract value from investments in new technologies or approaches (instrument to analysis-ready data in <12h).
  • Innovative, domain-expert-specific data products: enabling rapid, agile optimization and a view into data streams that gives scientists faster key insights into experimental setups, novel approaches or data acquisition parameters, leading to faster biopharmaceutical development cycles.
  • Enabling governance by design of external and internal data: engineered, practical solutions for controlled use and monitoring.
  • Supporting end-to-end code traceability and data provenance: increasing assurance of data integrity through automation and integration.
  • Improving engineering efficiency: extensible, reusable, scalable, maintainable, testable, deployable, traceable data and code in a cloud-native context.

The Data Streams and Operation Engineering team accelerates biopharmaceutical drug discovery and development by designing and developing orchestrated pipelines, using existing microservices or developing novel ones on k8s, aimed at surfacing QC- and analytics-ready data. These orchestrated pipelines and products provide automated, scalable, complete end-to-end processing of data from instruments or external data sources to analysis-ready data in order to drive the drug discovery and development process.

Key responsibilities

We are looking for a highly skilled and experienced Senior Data Engineer (m/f/d) to join us and help make this vision a reality. This software practitioner will work with a team of talented data, cloud and software engineers focused on knowledge and data systems for drug development and discovery. The team works with the Staff Engineer and is accountable for designing, building and testing new end-to-end data flows and creating data products in the cloud.

A Senior Data Engineer is a highly technical individual contributor whose responsibilities are:

  • to develop, test and deploy cloud-native processing nodes, including parameter harmonization for hybrid cloud pipelines.
  • to contribute to the development of design patterns for our processing node framework and the microservices supporting it.
  • to contribute to the development of cloud-native workflows for high-throughput biology and chemistry, from instrument to analysis-ready data in under 12h.
  • to contribute to the harmonization between processing nodes, workflows and microservices in order to use the processing node framework to its fullest extent.
  • to practice “systems”-level thinking in sync with senior staff and architects, and to manage interfaces across packages, data structures and microservices.
  • to exercise judgement and perform evaluations of external packages, pipeline orchestration tools and microservices in order to accelerate development of end-to-end data pipelines.
  • to play an active role in a development team, driving our culture.

Specific requirements

  • Competitive candidates will have a proven track record of excellent coding and software development within high-performing engineering teams, and of shipping products; extensive experience in collaborative coding is very important.
  • Strong software development record in Python or at least one other common programming language (e.g. JavaScript, Rust or C++). Candidates should follow software best practices and, starting with low-level design documentation, produce clean, readable code that is well documented and appropriately tested.
  • Comfortable working in an Agile software development environment using e.g. ADO, JIRA and Confluence.
  • Prior cloud development experience (e.g. AWS, Google Cloud, Azure) is a plus.
  • She/he/they should have knowledge of k8s; k8s on different clouds, e.g. AKS or GKE, is a plus.
  • She/he/they should be very comfortable with GitOps processes, ranging from automated testing on different scales and automated deployment of packages and containers to infrastructure as code, using e.g. Argo CD or Flux CD.
  • She/he/they should have background knowledge in life science and the tools for high-throughput data analysis in that field, and thus be able to lead discussions with scientists and record acceptance criteria accordingly.
  • A team player, eager to invest in personal and team growth.
  • Excellent communication and writing skills in English.

Basic Qualifications

  • BS in Computer Science, Software Engineering, Biomedical Engineering, Engineering, or Bioinformatics/Computational Biology, with extensive experience (or MS with multiple years of experience, or PhD) in the biotech/pharmaceutical/healthcare/diagnostics/health insurance space.
  • Extensive architecture, coding and testing experience, and excellent teamwork.

Preferred Qualifications

If you have the following characteristics, it would be a plus:

  • Background in biomedical data processing, especially in (but not limited to) the fields of flow cytometry, proteomics or imaging.
  • Experience in GenAI.
  • AI/ML experience using e.g. TensorFlow, PyTorch, Keras or scikit-learn.
  • Experience in building products with modern cloud architectures, platforms, and back-end systems.

Find out more:  

Annual Report 2023 

Product Pipeline 

Why GSK?

Uniting science, technology and talent to get ahead of disease together.

GSK is a global biopharma company with a special purpose – to unite science, technology and talent to get ahead of disease together – so we can positively impact the health of billions of people and deliver stronger, more sustainable shareholder returns – as an organisation where people can thrive. We prevent and treat disease with vaccines, specialty and general medicines. We focus on the science of the immune system and the use of new platform and data technologies, investing in four core therapeutic areas (infectious diseases, HIV, respiratory/immunology and oncology).

Our success absolutely depends on our people. While getting ahead of disease together is about our ambition for patients and shareholders, it’s also about making GSK a place where people can thrive. We want GSK to be a place where people feel inspired, encouraged and challenged to be the best they can be. A place where they can be themselves – feeling welcome, valued, and included. Where they can keep growing and look after their wellbeing. So, if you share our ambition, join us at this exciting moment in our journey to get Ahead Together.

If you require an accommodation or other assistance to apply for a job at GSK, please contact the GSK Service Centre at 1-877-694-7547 (US Toll Free) or +1 801 567 5155 (outside US).

GSK is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive equal consideration for employment without regard to race, color, national origin, religion, sex, pregnancy, marital status, sexual orientation, gender identity/expression, age, disability, genetic information, military service, covered/protected veteran status or any other federal, state or local protected class.

Important notice to Employment businesses/Agencies

GSK does not accept referrals from employment businesses and/or employment agencies in respect of the vacancies posted on this site. All employment businesses/agencies are required to contact GSK's commercial and general procurement/human resources department to obtain prior written authorization before referring any candidates to GSK. The obtaining of prior written authorization is a condition precedent to any agreement (verbal or written) between the employment business/agency and GSK. In the absence of such written authorization being obtained any actions undertaken by the employment business/agency shall be deemed to have been performed without the consent or contractual agreement of GSK. GSK shall therefore not be liable for any fees arising from such actions or any fees arising from any referrals by employment businesses/agencies in respect of the vacancies posted on this site.

Please note that if you are a US Licensed Healthcare Professional or Healthcare Professional as defined by the laws of the state issuing your license, GSK may be required to capture and report expenses GSK incurs, on your behalf, in the event you are afforded an interview for employment. This capture of applicable transfers of value is necessary to ensure GSK’s compliance with all federal and state US Transparency requirements. For more information, please visit GSK’s Transparency Reporting For the Record site.

Category: Engineering Jobs

Region: Europe
Country: Germany
