Principal Data Engineer, R&D Informatics
Cambridge, MA USA
Flagship Pioneering, Inc.
What if... we could harness the power of Flagship’s scientific platforms and create novel treatment options that benefit more patients, sooner?
Pioneering Medicines, an initiative of Flagship Pioneering, is building a world-class biopharmaceutical R&D capability focused on conceiving and developing life-changing treatments for patients by harnessing the power of Flagship's scientific platforms and applying those innovative approaches to serious diseases with unmet medical need. Unique to Pioneering Medicines’ approach is the opportunity to combine platforms to create truly novel and potentially transformative treatments.
Position Summary
We are seeking a Principal Data Engineer to lead the design, development, and implementation of modern, scalable data solutions that enable R&D, bioinformatics, and informatics across the enterprise. You will be the technical authority and primary owner of data architecture within the R&D Informatics team—setting direction, solving complex integration challenges, and building platforms that interoperate with our broader digital and scientific infrastructure.
This role goes beyond building pipelines: it requires strategic thinking, deep technical fluency, and strong cross-functional communication. You will work closely with stakeholders across Infrastructure & Operations (I&O), Lab IT, PI Tech, and the scientific community to shape data strategy, establish standards, and deliver high-impact solutions that accelerate scientific discovery.
Key Responsibilities
- Lead the design, implementation, and evolution of scalable, secure data architectures in AWS that support the full spectrum of R&D workflows, from exploratory data analysis to production-scale pipelines.
- Serve as the technical lead and subject matter expert on complex data engineering projects involving interdependent systems, legacy environments, and emerging cloud-first platforms.
- Collaborate across I&O, Lab IT, PI Tech, data science, and bioinformatics teams to ensure architectural alignment, reuse of components, and shared understanding of system states and dependencies.
- Architect and build robust data lakes, marts, and pipelines using AWS-native services (e.g., Glue, Redshift, Athena), open-source tools (e.g., Spark, dbt, Airflow), and custom development in Python/SQL.
- Define and implement best practices for data governance, lineage, quality, and lifecycle management in alignment with security and compliance policies.
- Champion automation and DevOps practices for continuous integration, testing, deployment, and monitoring of data pipelines.
- Leverage generative AI tools and frameworks to accelerate pipeline development, data transformation, and metadata enrichment at scale.
- Translate ambiguous scientific and operational requirements into elegant, maintainable technical solutions with a long-term vision.
- Mentor junior engineers and establish engineering standards that drive excellence, reproducibility, and innovation across the team.
Required Qualifications
- 7+ years of hands-on experience in data engineering, preferably in life sciences, biotech, or healthcare environments.
- Proven expertise in designing and operating cloud-native data architectures, particularly within AWS (e.g., S3, Glue, Redshift, RDS, Athena).
- Advanced proficiency in Python and SQL for data manipulation, pipeline development, and system automation.
- Familiarity with modern frameworks such as Airflow, dbt, and Spark, plus experience supporting both structured and unstructured data (CSV, JSON, Parquet, imaging, etc.).
- Demonstrated success in leading complex, cross-functional projects involving technical and scientific stakeholders.
- Strong understanding of DevOps practices, CI/CD pipelines, and Git-based collaboration models.
- Practical experience using generative AI tools (e.g., GitHub Copilot, LangChain, Claude, GPT-based tools) to boost engineering velocity and reduce boilerplate.
Preferred Qualifications
- Experience supporting bioinformatics, cheminformatics, or clinical data workflows.
- Familiarity with scientific notebook environments (e.g., Jupyter, RStudio) and/or visualization platforms (e.g., Tableau, Spotfire).
- Exposure to Agile or Scrum-based development methodologies.
- Relevant certifications (e.g., AWS Solutions Architect Associate, Data Analytics Specialty).
What We Look For
- A strategic builder who balances innovation with pragmatism.
- A systems thinker with the ability to architect solutions that scale across research and operational domains.
- A collaborative partner who thrives in cross-disciplinary settings and communicates with clarity and empathy.
- A relentless learner driven by curiosity, purpose, and a desire to impact human health.
About Flagship Pioneering:
Flagship Pioneering is a biotechnology company that invents and builds platform companies that change the world. We bring together the greatest scientific minds with entrepreneurial company builders and assemble the capital to allow them to take courageous leaps. Those big leaps in human health and sustainability exponentially accelerate scientific progress in areas ranging from cancer detection and treatment to nature-positive agriculture. What sets Flagship apart is our ability to advance biotechnology by uniting life science innovation, company creation, and capital investment under one roof in a way that is largely without precedent. Our scientific founders, entrepreneurial leaders, and professional capital managers are each aligned around an institutionalized process that enables us to innovate and transform for the benefit of people and planet. Many of the companies Flagship has founded have addressed humanity’s most urgent challenges: vaccinating billions of people against COVID-19, curing intractable diseases, improving human health, preempting illness, and feeding the world by improving the resiliency and sustainability of agriculture.
Flagship has been recognized twice on FORTUNE’s “Change the World” list, an annual ranking of companies that have made a positive social and environmental impact through activities that are part of their core business strategies, and has been twice named to Fast Company’s annual list of the World’s Most Innovative Companies.
Flagship Pioneering and our ecosystem companies are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status.
At Flagship, we recognize there is no perfect candidate. If you have some of the experience listed above but not all, please apply anyway. Experience comes in many forms, skills are transferable, and passion goes a long way. We are dedicated to building diverse and inclusive teams and look forward to learning more about your unique background.
Recruitment & Staffing Agencies: Flagship Pioneering and its affiliated Flagship Lab companies (collectively, “FSP”) do not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to FSP or its employees is strictly prohibited unless contacted directly by Flagship Pioneering’s internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of FSP, and FSP will not owe any referral or other fees with respect thereto.
Perks/benefits: Career development Team events