Senior Data Engineer @ Entalpic

Paris, France

Breega

Breega propels pioneering and purpose-driven founders from idea into impact. Crafted for founders by founders, we built Breega to provide start-ups with the help we wish we'd had.


Senior Data Engineer


Our company: Entalpic 

We are a dedicated team at the forefront of AI and chemistry, working to accelerate the energy transition. Our focus is on discovering new chemicals and materials that can lead to more sustainable practices in sectors where the need for change is most urgent. Specifically, we develop a modern generative AI platform to discover new catalysts that optimize chemical reactions, significantly reducing CO2 emissions and thus making a substantial impact on the environment.

As an early-stage AI-driven startup backed by significant funding (>5m), we base our approach on state-of-the-art academic research to drive practical business solutions. We value clear communication and simplicity in our approaches, promoting a constant optimization mindset.

Join Entalpic to be part of a growing team, eager to learn and adapt, united by the belief that our technology can make a significant positive impact and contribute to transforming carbon-intensive industries for a sustainable future.

Co-founders: Mathieu Galtier, Victor Schmidt, Alexandre Duval

Entalpic is dedicated to equal opportunity employment and fosters an environment that is open and respectful of diversity. All applicants are encouraged to apply, even those who do not meet every requirement listed in this posting. If you have a passion for our mission and believe you can contribute, we want to hear from you.


Reporting & Job Location

You will report to the CTO of Entalpic and will be located in our Paris offices.


Mission Highlights

As a central member of Entalpic’s team, you will undertake two core missions:


  1. Data Infrastructure Development: Design, develop, and maintain our data infrastructure, ensuring seamless integration of diverse data sources, including textual data, simulation outputs, and experimental results, in order to support machine learning and LLM applications.

  2. Data Platform Enhancement: Lead the development of our internal data platform, facilitating efficient data access and interaction (including through AI agents), and promoting a data-centric culture within the organization.


Role & responsibilities

  • Data Engineering: Build and optimize scalable data pipelines to process and integrate multimodal data from simulations (e.g., DFT outputs), textual sources (e.g., research papers, patents), and experimental data (e.g., time series or imagery from acquisition hardware).

  • Data Storage Solutions: Implement and manage robust data storage systems, ensuring data integrity, accessibility, and scalability to support various analytical and machine learning tasks.

  • Automation and Scripting: Develop scripts and tools to automate data ingestion, processing, and transformation tasks, enhancing workflow efficiency and reducing manual intervention.

  • Data Governance and Lineage: Establish and enforce data governance policies, ensuring proper data lineage tracking, quality control, and compliance with relevant data protection standards.

  • Infrastructure Support: Collaborate with DevOps and infrastructure teams to ensure that data engineering solutions are well-integrated into the overall system architecture, leveraging cloud platforms like AWS or GCP for deployment and scalability.

  • Collaboration and Support: Work closely with cross-functional teams, including data scientists and domain experts, to understand data requirements and provide tailored solutions that facilitate data-driven decision-making.

  • Open Source Engagement: Contribute to open-source projects and initiatives, sharing knowledge and tools developed internally to foster community engagement and collaboration.


Profile

  • M.S. in Computer Science, Data Engineering, or a related field.

  • 7+ years of experience in data engineering, with a proven track record of handling diverse data modalities and building scalable data infrastructures.

  • Proficiency in at least two programming languages (e.g., Python, Scala, Rust, Go) and experience with both SQL (MySQL and PostgreSQL) and NoSQL (MongoDB) databases.

  • Strong understanding of data modeling, ETL processes, and data warehousing solutions.

  • Experience with cloud platforms, particularly AWS or GCP, and familiarity with infrastructure-as-code tools.

  • Excellent communication skills in English, with the ability to work collaboratively in interdisciplinary teams.

  • Thrives in a fast-paced, evolving startup environment.

Bonuses:

  • Experience with machine learning pipelines and deploying AI training infrastructures.

  • Contributions to open-source projects and active participation in relevant communities.

  • Familiarity with scientific data processing and analysis, particularly in the context of materials science.


Expertise

  • Programming: Strong software engineering skills in Python and at least one other programming language, with experience in software development best practices and version control systems such as Git.

  • Data Management: In-depth knowledge of data structures and database systems, both SQL and NoSQL, to manage and process large datasets efficiently.

  • Cloud Platforms: Experience with cloud services (AWS, GCP) and infrastructure-as-code tools (e.g., Terraform) to support data infrastructure deployment and management.

  • DevOps Collaboration: Familiarity with CI/CD pipelines and containerization tools (e.g., Docker, Kubernetes) to ensure seamless integration of data engineering solutions into the broader system architecture.

  • Open Source: Understanding of the challenges and best practices associated with developing and maintaining open-source code, libraries, and communities.


Compensation & benefits

We are a no-nonsense startup that favors a sustainable culture, work-life balance, and good compensation over foosball tables and free food. We offer:

  • A competitive salary

  • Equity (BSPCE), to reflect the value you bring to Entalpic and to foster a shared journey

  • Comprehensive health insurance (Alan blue)

  • French-level paid leave and time off

  • Dynamic work setting. Although our preference is for in-person collaboration, we will be flexible with occasional remote work arrangements.

  • and more to come as we grow

Category: Engineering Jobs

Tags: Architecture AWS Chemistry CI/CD Computer Science Data governance Data management Data pipelines Data Warehousing DevOps Docker Engineering ETL GCP Generative AI Git Kubernetes LLMs Machine Learning MongoDB MySQL NoSQL Open Source Pipelines PostgreSQL Python Research Rust Scala SQL Terraform

Perks/benefits: Career development Competitive pay Equity / stock options Flex hours Flex vacation Health care Startup environment

Region: Europe
Country: France
