Principal / Senior Data Scientist

Wellcome Genome Campus, United Kingdom

Wellcome Sanger Institute

We are a world-leading genomics research institute in Cambridge. Our work helps improve human health and understand life on Earth.


Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenges.

We are looking for a highly motivated and experienced Data Scientist to join a collaborative project between the Generative and Synthetic Genomics and Cellular Genetics Programmes at the Wellcome Sanger Institute. The position is offered on a 2-year fixed-term contract with possible extension.

About the Role:

This is a joint position between the Taipale and Lotfollahi groups with an overarching goal of understanding and generating sequence data using generative AI models. The Taipale group has pioneered various methods to curate data at a scale unmatched anywhere else, primarily focussing on transcription factor binding sites, and determinants of gene expression and cell growth, whereas the Lotfollahi group is known for large-scale generative modelling in single-cell analysis.

This project aims to leverage datasets generated internally by the Taipale group to create large-scale generative models for biology, enhancing our understanding of transcription and the rules of gene expression. You will work within an interdisciplinary team of life scientists and computer/ML scientists, with a shared objective of advancing biological research through these generative models.

You will be supported in your personal and professional development. Besides a stimulating environment, both groups offer good exposure within and outside academia through their networks. For example, the Taipale group offers consultancy to DeepMind and holds various grants with CZI, whereas the Lotfollahi group has provided consultancy to various large pharma companies and is a cofounder of AI-VIVO.

You will be responsible for:

  • Independently drive machine learning projects and write up the outcomes as scientific publications for submission to journals or machine learning conferences (ICLR, ICML, CVPR, etc.).

  • Collaborate with team members to propose, develop, and evaluate new machine learning models that enable the understanding of single-cell data and its application in drug discovery.

  • Work with Ph.D. students and postdocs in collaborating teams on developing solutions for interdisciplinary scientific problems in biology.

  • Contribute to writing scientific papers on biotechnology and biology.

  • Distill your developed solutions into open-source and easy-to-install packages with documentation that facilitates the usage of your solution for downstream users, including biologists and bioinformaticians.

  • Present your research and analysis pipelines to internal and external audiences.


Essential Technical Skills:

  • MSc and/or Ph.D. or equivalent experience in a relevant quantitative discipline (e.g., Computer Science, Computational Biology, Genetics, Bioinformatics, Physics, Engineering, or Applied Statistics/Mathematics)

  • Experience using advanced statistical techniques, machine learning, and modern deep learning techniques

  • Previous ML work experience in a scientific/academic environment (RA positions and internships are considered work experience)

  • Knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch

  • Knowledge of good software development practices and collaboration tools, including Git-based version control, Python package management, and code reviews

  • Experience working with cloud environments and tools, such as AWS S3 and EC2

  • Evidence of related work experience as a researcher in the area of machine learning

  • Strong publication record

  • Ability to quickly understand scientific, technical, and process challenges and break down complex problems into actionable steps

Essential Competencies and Behaviours:

  • Excellent communication skills, with the ability to explain complex machine learning algorithms and statistical methods to non-technical stakeholders (g1/g2)

  • Ability to work in a frequently changing environment with the capability to interpret management information to amend plans (g1/g2)

  • Ability to prioritise, manage workload, and deliver agreed activities consistently on time (g1/g2)

  • Demonstrate good networking, influencing and relationship building skills (g1)

  • Strategic thinking: the ability to see the ‘bigger picture’ (g1)

  • Ability to build collaborative working relationships with internal and external stakeholders at all levels (g1/g2)

  • Demonstrates inclusivity and respect for all (g1)

Other Information

Link to relevant publication of the groups

Application Process:

Please apply with your CV and a cover letter explaining your motivation for applying and how your skills and experience meet the above essential criteria.

Salary range (dependent on skills and experience):

Grade 1 Principal Data Scientist: £53,717 to £63,000. Role profile:

  • Proven experience using advanced statistical techniques, machine learning, and modern deep learning techniques

  • Strong knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch.

  • Demonstrate good networking, influencing and relationship building skills

  • Strategic thinking: the ability to see the ‘bigger picture’

  • Demonstrates inclusivity and respect for all

  • Leading and managing machine learning projects

Grade 2 Senior Data Scientist: £44,905 to £52,000. Role profile:

  • Skilled in advanced statistical techniques, machine learning, and modern deep learning techniques

  • Proven experience of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch.

  • Contributing to machine learning projects

Closing Date: 13th July 2025

Recruitment Process: Shortlisting w/c 14th July, Zoom interviews w/c 21st or w/c 28th July

Contract Type: 2-year fixed-term contract with possible extension

Hybrid Working at Wellcome Sanger:

We recognise that there are many benefits to hybrid working, including an improved work-life balance with more focused time, as well as the ability to organise working time so that collaborative opportunities and team discussions are facilitated on campus. The hybrid working arrangement will vary for different roles and teams. The nature of your role and the type of work you do will determine if a hybrid working arrangement is possible.

Equality, Diversity and Inclusion:

We aim to attract, recruit, retain and develop talent from the widest possible talent pool, thereby gaining insight and access to different markets to generate a greater impact on the world. We have a supportive culture with the following staff networks: LGBTQ+, Parents and Carers, Disability, and Race Equity. These bring people together to share experiences, offer specific support and development opportunities, and raise awareness. The networks are also a place for allies to provide support to others.

We want our people to be whoever they want to be because we believe people who bring their best selves to work do their best work. That’s why we’re committed to creating a truly inclusive culture at Sanger Institute. We will consider all individuals without discrimination and are committed to creating an inclusive environment for all employees, where everyone can thrive.

Our Benefits:

We are proud to deliver an award-winning campus-wide employee wellbeing strategy and programme. The importance of good health and adopting a healthier lifestyle, and the commitment to reduce work-related stress, are strongly acknowledged and recognised at Sanger Institute.

Sanger Institute became a signatory of the International Technician Commitment initiative in March 2018. The Technician Commitment aims to empower and ensure visibility, recognition, career development and sustainability for technicians working in higher education and research, across all disciplines.
