Data Scientist - Materials
Cambridge, MA USA
Flagship Pioneering, Inc.
About Lila Sciences
Lila Sciences is the world's first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are pioneering a new age of boundless discovery by building the capabilities to apply AI to every aspect of the scientific method. We are introducing scientific superintelligence to solve humankind's greatest challenges, enabling scientists to bring forth solutions in human health, climate, and sustainability at a pace and scale never experienced before. Learn more about this mission at www.lila.ai.
At Lila, we are uniquely cross-functional and collaborative. We are actively reimagining the way teams work together and communicate. Therefore, we seek individuals with an inclusive mindset and a diversity of thought. Our teams thrive in unstructured and creative environments. All voices are heard because we know that experience comes in many forms, skills are transferable, and passion goes a long way.
If this sounds like an environment you'd love to work in, even if you only have some of the experience listed below, please apply.
Your Impact at Lila
As a Data Scientist in our Physical Sciences organization, you will transform complex experimental and testing datasets into actionable insights that drive our autonomous lab's decision-making. You'll partner with electrochemists, synthesis chemists, characterization specialists, and automation engineers to ensure data quality, build predictive models, and inform scientific campaigns across materials and device development.
What You'll Be Building
- Data Infrastructure: Design and maintain robust ETL pipelines to ingest, validate, and preprocess data from diverse sources: electrochemical tests, materials characterization, and automated lab instruments.
- Feature Engineering & Modeling: Perform domain-relevant data transformations, extract meaningful descriptors from raw data (e.g., voltage curves, spectroscopic signatures, image-based measurements) and develop statistical or machine learning models to relate independent variables (time, composition, etc.) to performance metrics and failure modes.
- Analytics & Visualization: Create interactive dashboards and reports to communicate trends, anomalies, and key insights to scientific and engineering teams.
- Active Learning Support: Collaborate with ML scientists to integrate your analytical outputs into active learning loops, helping to prioritize experiments and optimize resource allocation.
- Cross-Functional Partnership: Work closely with R&D leadership, Product Managers, and automation specialists to translate scientific questions into data requirements and modeling strategies.
- Reproducibility & Documentation: Establish best practices for code versioning, data provenance, and analysis notebooks; contribute to internal knowledge bases and publications.
What You'll Need to Succeed
- Master's or Ph.D. in Data Science, Statistics, Materials Science, Chemistry, Physics, or a related quantitative field.
- 2+ years of experience in data analysis, statistical modeling, or machine learning, ideally applied to physical sciences or engineering datasets.
- Proficiency in Python (pandas, NumPy, scikit-learn) and SQL for data manipulation and analysis.
- Hands-on experience building ETL workflows using tools like Airflow, Prefect, or similar.
- Strong foundation in experimental design, statistical inference, and multivariate analysis.
- Familiarity with data visualization libraries (Plotly, Dash, or similar) and dashboard frameworks.
Bonus Points For
- Experience working with electrochemical or materials characterization data (e.g., impedance spectroscopy, X-ray diffraction, electron microscopy).
- Familiarity with materials-specific Python libraries (e.g., pymatgen).
- Exposure to cloud-based data platforms (AWS, GCP, or Azure) and scalable storage solutions.
- Knowledge of containerization (Docker, Singularity) and workflow orchestration (Snakemake, Nextflow).
- Prior contributions to open-source data tools or scientific software.
- Understanding of active learning, Bayesian optimization, or uncertainty quantification in experimental contexts.
We're All In
Lila Sciences is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status.
A Note to Agencies
Lila Sciences does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Lila Sciences or its employees is strictly prohibited unless contacted directly by Lila Sciences' internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Lila Sciences, and Lila Sciences will not owe any referral or other fees with respect thereto.