Data Scientist - Materials

Cambridge, MA USA

Flagship Pioneering, Inc.

🚀 About Lila Sciences

Lila Sciences is the world’s first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are pioneering a new age of boundless discovery by building the capabilities to apply AI to every aspect of the scientific method. We are introducing scientific superintelligence to solve humankind's greatest challenges, enabling scientists to bring forth solutions in human health, climate, and sustainability at a pace and scale never experienced before. Learn more about this mission at www.lila.ai.

At Lila, we are uniquely cross-functional and collaborative. We are actively reimagining the way teams work together and communicate. Therefore, we seek individuals with an inclusive mindset and a diversity of thought. Our teams thrive in unstructured and creative environments. All voices are heard because we know that experience comes in many forms, skills are transferable, and passion goes a long way.

If this sounds like an environment you’d love to work in, please apply, even if you have only some of the experience listed below.

🌟 Your Impact at Lila

As a Data Scientist in our Physical Sciences organization, you will transform complex experimental and testing datasets into actionable insights that drive our autonomous lab’s decision-making. You’ll partner with electrochemists, synthesis chemists, characterization specialists, and automation engineers to ensure data quality, build predictive models, and inform scientific campaigns across materials and device development.

🛠️ What You'll Be Building

  • Data Infrastructure: Design and maintain robust ETL pipelines to ingest, validate, and preprocess data from diverse sources—electrochemical tests, materials characterization, and automated lab instruments.
  • Feature Engineering & Modeling: Perform domain-relevant data transformations, extract meaningful descriptors from raw data (e.g., voltage curves, spectroscopic signatures, image-based measurements), and develop statistical or machine learning models that relate independent variables (time, composition, etc.) to performance metrics and failure modes.
  • Analytics & Visualization: Create interactive dashboards and reports to communicate trends, anomalies, and key insights to scientific and engineering teams.
  • Active Learning Support: Collaborate with ML scientists to integrate your analytical outputs into active learning loops, helping to prioritize experiments and optimize resource allocation.
  • Cross-Functional Partnership: Work closely with R&D leadership, Product Managers, and automation specialists to translate scientific questions into data requirements and modeling strategies.
  • Reproducibility & Documentation: Establish best practices for code versioning, data provenance, and analysis notebooks; contribute to internal knowledge bases and publications.

🧰 What You’ll Need to Succeed

  • Master’s or Ph.D. in Data Science, Statistics, Materials Science, Chemistry, Physics, or a related quantitative field.
  • 2+ years of experience in data analysis, statistical modeling, or machine learning—ideally applied to physical sciences or engineering datasets.
  • Proficiency in Python (pandas, NumPy, scikit-learn) and SQL for data manipulation and analysis.
  • Hands-on experience building ETL workflows using tools like Airflow, Prefect, or similar.
  • Strong foundation in experimental design, statistical inference, and multivariate analysis.
  • Familiarity with data visualization libraries (Plotly, Dash, or similar) and dashboard frameworks.

✨ Bonus Points For

  • Experience working with electrochemical or materials characterization data (e.g., impedance spectroscopy, X-ray diffraction, electron microscopy).
  • Experience with materials-specific Python libraries (e.g., pymatgen).
  • Exposure to cloud-based data platforms (AWS, GCP, or Azure) and scalable storage solutions.
  • Knowledge of containerization (Docker, Singularity) and workflow orchestration (Snakemake, Nextflow).
  • Prior contributions to open-source data tools or scientific software.
  • Understanding of active learning, Bayesian optimization, or uncertainty quantification in experimental contexts.

🌈 We’re All In

Lila Sciences is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.

🤝 A Note to Agencies

Lila Sciences does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Lila Sciences or its employees is strictly prohibited unless the agency has been contacted directly by Lila Sciences’ internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Lila Sciences, and Lila Sciences will not owe any referral or other fees with respect thereto.

Category: Data Science Jobs

Region: North America
Country: United States
