Data Engineer
Beirut, Lebanon
Thales
From Aerospace, Space, Defence to Security & Transportation, Thales helps its customers to create a safer world by giving them the tools they need to perform critical tasks.
JOB OBJECTIVE
As a Data Engineer, you will participate in the different phases of Thales Digital Transformation Projects by collecting, modeling, and processing data on Google Cloud Platform (GCP) to enable end users to perform accurate analysis. Leveraging tools like BigQuery, Cloud Storage, Dataproc, and Airflow, you’ll build scalable data pipelines, design robust data models, and support the creation of dashboards.
If you have experience with Talend, it will be considered a plus.
ROLES AND RESPONSIBILITIES
- Assemble large, complex datasets that meet both functional and non-functional business requirements
- Integrate business data from diverse systems to build a unified, analytics-ready data foundation
- Support project qualification, including data discovery, scoping, and feasibility analysis for ETL/ELT activities
- Conduct technical framing: perform feasibility studies, define needs, estimate effort, and plan project timelines
- Develop reusable, scalable ETL/ELT pipelines using PySpark, SQL, and Python on Google Cloud Platform (GCP)
- Experience in developing Talend Jobs is considered a strong asset
- Define technical prerequisites and templates
- Develop reusable ETL job use cases
- Design and implement ingestion and transformation workflows using Cloud Composer (Airflow) and Dataproc (see the sketch after this list)
- Write clear and detailed technical specifications for data solutions and system components
- Ensure proper documentation of all work for operational maintainability and knowledge sharing
- Highly skilled in the Microsoft SQL Server stack (Database Engine)
- Advanced SQL and query performance tuning skills
- Contribute to and enrich a catalogue of reusable data solution components and templates
- Identify, design, and implement internal process improvements including infrastructure re-architecture, automation, and performance tuning
- Build and operate infrastructure for efficient extraction, transformation, and loading of data using BigQuery, GCS, and Dataproc
- Build analytical data pipelines that provide actionable insights into business KPIs such as operational performance and customer behavior
- Provide end-user support, assist stakeholders with data-related issues, and ensure customer satisfaction
- Customer-oriented mindset with a strong focus on solution quality and reliability
- Proactive, quality-driven approach to development, with a focus on best practices and continuous improvement
- Collaborate with business and technical teams (product, design, data, and executive) to support their data infrastructure needs.
- Build data warehouses and data marts
- Demonstrated experience working in cloud data engineering environments, especially with GCP data services: BigQuery, Google Cloud Storage, Dataproc, Pub/Sub, Looker, etc.
- Strong experience with distributed data processing using Apache Spark / PySpark
- Advanced SQL skills with a focus on performance and optimization in cloud-native warehouses
- Familiar with CI/CD pipelines and infrastructure as code tools (Cloud Build, Git, Terraform) for automated deployment and testing
- Proficient in row-level security, access control management, and secure data delivery
- Experienced in managing ETL feeding mechanisms: delta loads, full loads, and historical data backfills
- Skilled in building data lakes, data marts, and OLAP models using GCP-native tools
- Knowledge of Agile methodology and active participation in cross-functional teams and ceremonies
- Able to collaborate across multiple interfaces in an international context, handling complexity and scale
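For illustration, a minimal sketch of such an ingestion and transformation workflow on Cloud Composer might look as follows, assuming Airflow 2.x with the Google provider package installed; the project, region, cluster, bucket, and table names are hypothetical placeholders.

```python
# Minimal Cloud Composer (Airflow) DAG sketch: run a PySpark transformation on
# Dataproc, then load the resulting Parquet files from GCS into BigQuery.
# All resource names below are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

PROJECT_ID = "my-gcp-project"          # hypothetical
REGION = "europe-west1"                # hypothetical
CLUSTER_NAME = "etl-dataproc-cluster"  # hypothetical

with DAG(
    dag_id="daily_sales_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submit the PySpark transformation job to an existing Dataproc cluster.
    transform = DataprocSubmitJobOperator(
        task_id="transform_with_pyspark",
        project_id=PROJECT_ID,
        region=REGION,
        job={
            "placement": {"cluster_name": CLUSTER_NAME},
            "pyspark_job": {"main_python_file_uri": "gs://my-bucket/jobs/transform_sales.py"},
        },
    )

    # Load the transformed Parquet output from Cloud Storage into BigQuery.
    load = GCSToBigQueryOperator(
        task_id="load_to_bigquery",
        bucket="my-bucket",
        source_objects=["curated/sales/*.parquet"],
        destination_project_dataset_table=f"{PROJECT_ID}.analytics.sales",
        source_format="PARQUET",
        write_disposition="WRITE_TRUNCATE",
    )

    transform >> load
```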
QUALIFICATION, CERTIFICATION & EDUCATIONAL REQUIREMENTS
Tools:
- Google Cloud Platform (GCP): BigQuery, Cloud Storage, Cloud Composer (Airflow), Dataproc, Pub/Sub, Looker
- Python, PySpark, SQL, Jupyter Notebooks
- GitLab, Cloud Build, Terraform, dbt (data build tool)
- Visual Studio Code, BigQuery UI, Looker Studio
Expertise:
- Strong understanding of CI/CD pipelines and DevOps for data workflows using Cloud Build, Git, and Terraform
- Proficient in data modeling for analytical solutions in BigQuery and Looker
- Expertise in data pipeline development using Apache Airflow, PySpark, and SQL
- Hands-on experience with data lake and lakehouse architectures on GCP
- Familiarity with Agile methodologies (Scrum/Kanban) for iterative delivery
- Understanding of data governance, security (IAM, RLS), and monitoring on cloud platforms (see the example after this list)
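As a rough sketch of the data modeling and row-level security points above, the snippet below uses the google-cloud-bigquery Python client to create a partitioned, clustered fact table and attach a row access policy; the project, dataset, table, and group names are hypothetical.

```python
# Sketch: create a partitioned/clustered BigQuery table and restrict rows with a
# row access policy, using the google-cloud-bigquery client. Project, dataset,
# table, and group names are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # hypothetical project

# Date-partitioned, clustered table for analytical workloads (data mart layer).
client.query(
    """
    CREATE TABLE IF NOT EXISTS analytics.sales (
      order_id   STRING,
      order_date DATE,
      region     STRING,
      amount     NUMERIC
    )
    PARTITION BY order_date
    CLUSTER BY region
    """
).result()

# Row-level security: EMEA analysts only see EMEA rows.
client.query(
    """
    CREATE ROW ACCESS POLICY IF NOT EXISTS emea_only
    ON analytics.sales
    GRANT TO ("group:emea-analysts@example.com")
    FILTER USING (region = "EMEA")
    """
).result()
```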
Education: Master’s degree in computer science, statistics, or mathematics
PREFERRED SKILLS
- Fluency in English, both written and verbal.
- Excellent communication and presentation skills.
- Teamwork and a willingness to achieve results through cooperation.
- Strong attention to detail.
- Openness to change and problem solving.
- Passion for digital transformation.
- Strong analytical mindset.
- Data-oriented.
At Thales we provide CAREERS and not only jobs. With Thales employing 80,000 employees in 68 countries, our mobility policy enables thousands of employees each year to develop their careers at home and abroad, in their existing areas of expertise or by branching out into new fields. Together we believe that embracing flexibility is a smarter way of working. Great journeys start here, apply now!
Tags: Agile Airflow Architecture BigQuery CI/CD Computer Science Data governance Data pipelines Dataproc dbt DevOps ELT Engineering ETL GCP Git GitLab Google Cloud Jupyter Kanban KPIs Looker Mathematics OLAP Pipelines PySpark Python Scrum Security Spark SQL Statistics Talend Terraform Testing
Perks/benefits: Team events