Principal / Senior Data Scientist
Wellcome Genome Campus, United Kingdom
Full Time Senior-level / Expert GBP 44K - 63K
Wellcome Sanger Institute
We are a world-leading genomics research institute in Cambridge. Our work helps improve human health and understand life on Earth. Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenges.
We are looking for a highly motivated and experienced Data Scientist to join a collaborative project between the Generative and Synthetic Genomics and Cellular Genetics Programmes at the Wellcome Sanger Institute. The role is offered on a 2-year fixed-term contract with possible extension.
About the Role:
This is a joint position between the Taipale and Lotfollahi groups with an overarching goal of understanding and generating sequence data using generative AI models. The Taipale group has pioneered various methods to curate data at a scale unmatched anywhere else, primarily focussing on transcription factor binding sites, and determinants of gene expression and cell growth, whereas the Lotfollahi group is known for large-scale generative modelling in single-cell analysis.
This project aims to leverage datasets internally generated by the Taipale group to create large-scale generative models for biology, enhancing our understanding of transcription and rules for gene expression. You will work within an interdisciplinary team of life scientists and computer/ML scientists, with a shared objective of advancing biological research through these generative models.
You will be supported in your personal and professional development. Besides a very stimulating environment, both groups offer very good exposure within and outside academia through their networks: for example, the Taipale group offers consultancy to DeepMind and holds various grants with CZI, whereas the Lotfollahi group has previously provided consultancy to various large pharma companies and is a cofounder of AI-VIVO.
You will be responsible for:
Independently drive machine learning projects and write up the outcomes as scientific publications for submission to journals or machine learning conferences (ICLR, ICML, CVPR, etc.).
Collaborate with team members to propose, develop, and evaluate new machine learning models that enable the understanding of single-cell data and its application in drug discovery.
Work with Ph.D. students and postdocs in collaborating teams on developing solutions for interdisciplinary scientific problems in biology.
Contribute to writing scientific papers on biotechnology and biology.
Distill your developed solutions into open-source and easy-to-install packages with documentation that facilitates the usage of your solution for downstream users, including biologists and bioinformaticians.
Present your research and analysis pipelines to internal and external audiences.
Essential Technical Skills:
MSc and/or Ph.D. or equivalent experience in a relevant quantitative discipline (e.g., Computer Science, Computational Biology, Genetics, Bioinformatics, Physics, Engineering, or Applied Statistics/Mathematics)
Experience using advanced statistical techniques, machine learning, and modern deep learning techniques
Previous ML work experience in a scientific/academic environment (RA roles and internships are considered work experience)
Knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch
Knowledge of good software development practices and collaboration tools, including Git-based version control, Python package management, and code reviews
Experience working with cloud environments and tools, such as Amazon AWS S3, EC2
Evidence of related work experience as a researcher in the area of machine learning
Strong publication record
Ability to quickly understand scientific, technical, and process challenges and break down complex problems into actionable steps
Essential Competencies and Behaviours:
Excellent communication skills, with the ability to explain complex machine learning algorithms and statistical methods to non-technical stakeholders (g1/g2)
Ability to work in a frequently changing environment with the capability to interpret management information to amend plans (g1/g2)
Ability to prioritise, manage workload, and deliver agreed activities consistently on time (g1/g2)
Demonstrate good networking, influencing and relationship building skills (g1)
Strategic thinking and the ability to see the ‘bigger picture’ (g1)
Ability to build collaborative working relationships with internal and external stakeholders at all levels (g1/g2)
Demonstrates inclusivity and respect for all (g1)
Other Information
Link to relevant publication of the groups
Application Process:
Please apply with your CV and a cover letter explaining your motivation for applying and how your skills and experience meet the above essential criteria.
Salary range (dependent on skills and experience):
Grade 1: Principal Data Scientist, £53,717 to £63,000. Role Profile:
Proven experience using advanced statistical techniques, machine learning, and modern deep learning techniques
Strong knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch.
Demonstrate good networking, influencing and relationship building skills
Strategic thinking and the ability to see the ‘bigger picture’
Demonstrates inclusivity and respect for all
Leading and managing machine learning projects
Grade 2: Senior Data Scientist, £44,905 to £52,000. Role Profile:
Skilled in advanced statistical techniques, machine learning, and modern deep learning techniques
Proven experience of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch.
Contributing to machine learning projects
Closing Date: 13th July 2025
Recruitment Process: Shortlisting w/c 14th July, Zoom interviews w/c 21st or w/c 28th July
Contract Type: 2-year fixed-term contract with possible extension
Hybrid Working at Wellcome Sanger:
We recognise that there are many benefits to hybrid working, including an improved work-life balance with more focused time, as well as the ability to organise working time so that collaborative opportunities and team discussions are facilitated on campus. The hybrid working arrangement will vary for different roles and teams. The nature of your role and the type of work you do will determine if a hybrid working arrangement is possible.
Equality, Diversity and Inclusion:
We aim to attract, recruit, retain and develop talent from the widest possible talent pool, thereby gaining insight and access to different markets to generate a greater impact on the world. We have a supportive culture with the following staff networks: LGBTQ+, Parents and Carers, Disability, and Race Equity. These networks bring people together to share experiences, offer specific support and development opportunities, and raise awareness. The networks are also a place for allies to provide support to others.
We want our people to be whoever they want to be because we believe people who bring their best selves to work do their best work. That’s why we’re committed to creating a truly inclusive culture at the Sanger Institute. We will consider all individuals without discrimination and are committed to creating an inclusive environment for all employees, where everyone can thrive.
Our Benefits:
We are proud to deliver an award-winning campus-wide employee wellbeing strategy and programme. The importance of good health, of adopting a healthier lifestyle, and of reducing work-related stress is strongly acknowledged and recognised at the Sanger Institute.
The Sanger Institute became a signatory of the International Technician Commitment initiative in March 2018. The Technician Commitment aims to empower and ensure visibility, recognition, career development and sustainability for technicians working in higher education and research, across all disciplines.