Data Scientist

Hinxton, United Kingdom

European Molecular Biology Laboratory (EMBL)

With 29 member states, laboratories at six locations across Europe and thousands of scientists and engineers working together, the European Molecular Biology Laboratory is a powerhouse of biological expertise. The intergovernmental...



We are looking for a Data Engineer to join the Literature Services Team at EMBL-EBI. The team works in an agile environment, develops Europe PMC (http://europepmc.org), and contributes to a variety of literature-related and text analytics projects. It enjoys a collegial and supportive atmosphere in which to deliver top-quality software for our users.

Europe PMC is a database of research publications, encompassing around 45M records, made up of the abstracts and full text of research articles, preprints and other documents. This is enhanced by text mining for biological entities and data accessions, and supplemented by additional data and persistent identifiers enabled through integrations with other services. The Europe PMC website receives millions of visitors per month, and an increasing amount of data is downloaded programmatically. Europe PMC has set trends in the adoption of new infrastructure developments such as data integration with publications and preprint discovery and indexing.

The EBI is a world-leading bioinformatics centre providing biological data to the scientific community, with expertise in data storage, analysis and representation. This biomolecular information is made available through extensive services accessed via its web pages (www.ebi.ac.uk). Optimising the usefulness of these services to the user community is an ongoing and challenging task.

Your role 

  • Work with ELIXIR/EMBL-EBI database resources to identify the most efficient and robust methods to link data outputs to literature, in order to make data FAIRer (in particular, more findable and more accessible) and to facilitate research assessment that rewards FAIR data

  • Work with database curators at EMBL-EBI and other ELIXIR core data resources to understand their requirements and provide meaningful and robust solutions to their challenges, such as identifying and linking to research articles that are curatable for their resources

  • Work with the Europe PMC team to develop state-of-the-art search capabilities within Europe PMC, including AI developments and use of the existing infrastructure and annotations to improve user experience

  • Work with new and varied data in both structured and unstructured forms (e.g. ontologies, text-mined outputs) to improve article-to-data links and enable targeted search of the literature

  • Act as the life sciences subject matter expert within the Literature Services Team and seek funding opportunities to expand work in this area, enhancing the capabilities of Europe PMC to support researchers

  • Coordinate with ELIXIR/EMBL-EBI work package partners to gather requirements and feedback on the proposed scientific programme strategy and proposals, and translate them into project implementation

  • Work closely with the project coordinator and UX architect to address user needs for data links delivery and enhancement, and develop solutions for implementation within Europe PMC

  • Work with Europe PMC funders to understand their requirements and communicate the full capabilities of the service, to ensure their needs are met

  • Assist in the development of Europe PMC’s Open Science strategy, incorporating elements such as ORCID, ROR, funding information, non-traditional research outputs and other emerging developments in metrics and metadata

  • Test and rapidly prototype novel applications of Europe PMC and other open data to determine user need and the feasibility of full implementation

  • Periodically write reports/papers on the latest activities of the group, and attend conferences and workshops to promote the activities of the Europe PMC and Literature Services team

You have 

  • PhD/MSc in a related bioinformatics field

  • 4+ years of research and development experience

  • Strong background in data integration

  • Experience with rapid-prototyping languages (e.g. Python) and Unix skills

  • Experience with linked data, RDF, MongoDB, graph databases (e.g. Neo4j) and triple store technologies

  • Relational databases and SQL (Oracle and PostgreSQL experience desirable)

  • R, Python, Java, JavaScript, and common libraries for data science and visualisation (Pandas, ggplot2 etc.)

  • Familiarity with conventional machine learning and deep learning, including training models and developing pipelines

  • Familiarity with natural language processing concepts and techniques

  • Experience developing and working in collaboration with other academic partners

  • Experience in academic communication and presenting at conferences

You may also have

  • Familiarity with the academic funding model and UK and European funding bodies

  • Knowledge of Open Science practices, FAIR, preprints, post-publication peer review

  • Familiarity with molecular biology and bioinformatics concepts

  • Familiarity with PubMed, MeSH, JATS

  • Front end development

  • Familiarity with the academic publishing model, and with publishing in academic venues

Contract length: 3-year fixed-term (renewable up to 9 years), contract starting on 1st November 2025.

Salary: Grade 5 or Grade 6 depending on qualifications and experience, with a monthly salary of £3,229 or £3,612 after tax but excluding pension and health insurance contributions. Plus generous benefits.

Why join us

Do something meaningful
At EMBL-EBI you can apply your talent and passion to accelerate science and tackle some of humankind's greatest challenges. EMBL-EBI, part of the European Molecular Biology Laboratory, is a worldwide leader in the storage, analysis and dissemination of large biological datasets. We provide the global research community with access to publicly available databases and tools which are crucial for the advancement of healthcare, food security, and biodiversity.
 

Join a culture of innovation
We are located on the Wellcome Genome Campus, alongside other prominent research and biotech organisations, and surrounded by beautiful Cambridgeshire countryside. This is a highly collaborative and inclusive community where our employees enjoy a relaxed atmosphere. We are committed to ensuring our employees feel valued, supported and empowered to reach their professional potential.  Watch this video to see how EMBL-EBI makes an impact.

Enjoy lots of benefits:

  • Financial incentives: Monthly family, child and non-resident allowances, annual salary review, pension scheme, death benefit, long-term care, accident-at-work and unemployment insurances

  • Flexible working arrangements, including hybrid working patterns

  • Private medical insurance for you and your immediate family (including all prescriptions and generous dental & optical cover)

  • Generous time off: 30 days annual leave per year, in addition to public holidays

  • Relocation package including installation grant (if required)

  • Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely)

  • Family benefits: On-site nursery, 10 days of child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances

  • Benefits for non-UK residents: Visa exemption, education grant for private schooling, financial support to travel back to your home country every second year and a monthly non-resident allowance.

For detailed information please visit our employee benefits page here

What else you need to know

  • International applicants: We recruit internationally and successful candidates are offered visa exemptions. Please take a look at our International Applicants page for further information.  

  • EMBL is a signatory of DORA. Find out how we apply DORA principles to our recruitment and performance assessment processes here.

  • Diversity and inclusion: At EMBL, we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ+ individuals and individuals of all nationalities.

  • How to apply: To apply please submit a cover letter and a CV through our online system. Applications will close at 23:59 CET on the date shown below. We aim to provide a response within two weeks after the closing date.

Closing Date

13/07/2025


 

Category: Data Science Jobs

Tags: Agile Bioinformatics Biology Deep Learning Elixir ggplot2 Java JavaScript Machine Learning MongoDB Neo4j NLP Oracle Pandas PhD Pipelines PostgreSQL Prototyping Python R RDBMS RDF Research Security SQL UX

Perks/benefits: Career development Conferences Fitness / gym Flex hours Flex vacation Health care Medical leave Parental leave Relocation support

Region: Europe
Country: United Kingdom
