Data Scientist (R&D Project)

US-Remote

The position is anticipated to be short-term and is not expected to extend beyond December 31, 2025. Please note that there is no guaranteed duration of the position, and your employment will be at will and for no fixed duration. As such, it can be terminated by you or the organization at any time, with or without notice or cause, for any reason not otherwise prohibited by law.

*This position is fully remote/home based. Applications will be accepted from candidates based in the UK and the following US states: FL, MA, MD, NY, PA, TX, VA.

PLOS is a nonprofit organization on a mission to drive open science forward with measurable, meaningful change in research publishing, policy, and practice. We believe in a better future where science is open to all, for all.

Role Summary

Use data science to provide insight into the nature and structure of our data and content, both published content and internal data sets, and lead on developing models to improve processing, access, understanding and use of that data.

Working closely with Subject Matter Experts, Product Managers, Software Engineers, and Product Designers, you will play a key role in improving understanding of our content and data, and in improving how we manage, process, and use that data in support of PLOS’s goals.

You will be tasked with the large-scale analysis of our broad and varied collection of scholarly content, which includes research articles and associated data sets, as well as line-of-business data and information. This will require working with structured and unstructured data, including a large corpus of scholarly articles, using programmatic techniques such as statistical analysis, natural language processing, information retrieval, and machine learning. You will also work with the rest of the team to turn your insights and software prototypes into production services that improve the utility of this data for both our end users and internal stakeholders.

Responsibilities

  • Create and use machine learning models, statistical analysis, and natural language processing to improve scientific content workflows, enhance discoverability, and support Open Science initiatives.
  • Collect, clean, and analyze large datasets of scientific content and related information from various sources, ensuring data quality and integrity.
  • Build and test predictive models and machine learning algorithms for tasks such as entity extraction, workflow automation, and enhancing the understanding of scientific content.
  • Visualize and present findings in a clear, concise, and compelling manner to both technical and non-technical audiences.
  • Work as part of a cross-functional team, contributing insights, models, and code, and deploying production services that improve our use of data.
  • Collaborate with editorial, marketing, and product teams, and with other colleagues across PLOS, to understand data needs and translate business requirements into analytical solutions that enable new open science capabilities.
  • Contribute to the development of data strategies and best practices within the organization and identify opportunities for workflow optimization and automation.
  • Engage with the latest research and trends in data science, Open Science, and scholarly publishing, proactively identifying opportunities to apply innovative techniques and refine best practices.
  • Consider the ethical implications of all data techniques as applied to our data, always ensuring that they are appropriate, take into account the potential for negative impact, and do not bias research.

Knowledge and Skills

  • Extensive experience in statistical modeling, machine learning, and data mining techniques, with a focus on applications in text analysis or scientific data, including knowledge of forecasting, A/B testing, entity extraction, and feature engineering.
  • Proficiency in programming languages such as Python, R, and SQL, and data analysis libraries (e.g., Pandas, NumPy, SciPy, Tidyverse).
  • Strong knowledge of machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, scikit-learn, NLTK).
  • Experience with NLP techniques, such as named entity recognition (NER), topic modeling, semantic similarity, and knowledge graph construction.
  • Demonstrated ability to communicate complex technical findings clearly and effectively, both verbally and in writing, through reports and presentations to diverse audiences.
  • Strong analytical and problem-solving skills, with a high degree of attention to detail and accuracy in handling scientific data.
  • Experience working with large datasets and database systems, and ideally with scientific content repositories or publishing platforms.
  • Familiarity with the scientific research environment, scholarly literature, and open science principles is an advantage.
  • Ability to develop hypotheses based on quantitative and qualitative evidence.
  • Experience with sound software development practices, such as version control with Git and continuous integration (CI).
  • Ability to work effectively both independently and collaboratively within a remote, agile team environment.

Qualifications

  • A Master's degree in a relevant field such as Data Science, Statistics, Computer Science, Bioinformatics, or a related quantitative discipline with a focus on scientific applications is preferred.
  • Relevant work experience in a data science role within scientific publishing, research, or a related field is desirable.

Physical Requirements and Work Environment

  • Prolonged periods stationary at a desk and working on a computer
  • Some national and international travel will be required
  • Some flexibility to work across time zones

 

The base salary range we’ve established for this position is (US) $105,000 - $145,000. PLOS also offers a comprehensive benefits package summarized below. 

BENEFITS: 

US: 

  • 401k with employer match 
  • Employer-sponsored health, dental, and vision insurance (dental and vision 100% employer paid)
  • Paid vacation, 12 public holidays, and sick leave
  • Parental leave
  • Birthday and three winter holiday days off
  • Short-term and long-term disability insurance
  • 2 days of paid time off per year for volunteering
  • Fully remote work environment, with a home office stipend on joining

 

To learn more about how PLOS protects your privacy, see our Employee Privacy Notice.
