Azure Data Engineering Lead - Calgary
CA-AB-Calgary
Capgemini
A global leader in consulting, technology services and digital transformation, we offer an array of integrated services combining technology with deep sector expertise.
Description
Title: Azure Data Engineer Lead
Location: Calgary
Required skills:
• Python - Strong proficiency, unit testing expertise and good knowledge of packages for data, like Pandas, SQLAlchemy and Alembic.
• SQL - Strong proficiency in DDL and DML, including window functions (e.g., lag), CTEs, sub-queries, joins, optimization, and performance profiling. Knowledge of how SQL behaves on different platforms, such as Spark and PostgreSQL.
• Spark - PySpark, Spark SQL, batch and streaming processing, partitioning/liquid clustering, Delta tables, Parquet.
• Databricks - Workflows/Jobs, Clusters, SQL Warehouse, Unity Catalog, Performance profiling, log analysis.
• Azure Event Hubs or a similar streaming solution - understanding of how best to consume data from streams for aggregations and parallelization/scaling.
• PostgreSQL - Queries, indexes (different types), performance profiling, JSON columns.
• Data Modeling - Dimensional modeling (including experience with dimensional models used in BI tools), normal forms.
• Containerization (e.g., Docker)
• DBT - models, seeds, multiple environments, parameters, macros, unit tests, data tests, incremental materialization, snapshots
• Infrastructure as code / CI/CD tools: Kubernetes, Argo, Crossplane, Terraform or similar.
• Migrations - database schema versioning, for example Alembic (preferred), Flyway, Liquibase, etc.
• SQL and NoSQL databases, and selecting the best fit for different use cases.
• Logging, and the ability to query logs using KQL (Azure) or similar.
Desired skills:
• SQL Server - Queries, indexes, performance analysis, table/index options and maintenance.
• DBT with Databricks
• Azure DevOps CI/CD
o CI/CD for data pipelines
o MLOps in general
• Power BI - Semantic models
• MLflow - track experiments, register models
Qualifications and experience:
• Computer Science degree or equivalent education and experience
• Professional experience with data ingestion, ETL, and ELT for structured and unstructured data.
• Strong proficiency in Python (especially data packages like pandas, NumPy, etc.) and SQL for analytics, database development, and data modelling
• Experience with DevOps and CI/CD for data.
• Experience working with one or more cloud platforms to implement data-intensive applications, Azure is preferred.
• Strong understanding of Agile methodologies and experience on an agile team
Responsibilities:
• Support, maintain, optimize and create ETL/ELT pipelines, both batch and streaming, in Databricks (PySpark, Databricks SQL), Python, SQL and/or DBT.
• Design, build and model data objects.
o Proficiency in dimensional modeling and database normalization (normal forms)
• Write and perform tests for data flow
• Work with a team with different skill sets and roles (back-end and front-end developers, data scientists, business specialists, DevOps, etc.) to deliver the best solution possible
• Collaborate with Platform teams to use the available infrastructure as efficiently as possible, including for CI/CD deployment.
• Willingness to work in the PST time zone
Life at Capgemini
Capgemini supports all aspects of your well-being throughout the changing stages of your life and career. For eligible employees, we offer:
- Flexible work
- Healthcare including dental, vision, mental health, and well-being programs
- Paid time off and paid holidays
- Paid parental leave
- Family building benefits like adoption assistance, surrogacy, and cryopreservation
- Social well-being benefits like subsidized back-up child/elder care and tutoring
- Mentoring, coaching and learning programs
- Employee Resource Groups
- Disaster Relief
About Capgemini
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong over 55-year heritage, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem. The Group reported 2024 global revenues of €22.1 billion.
Get the future you want | www.capgemini.com
Disclaimer
Capgemini is an Equal Opportunity Employer encouraging diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status or any other characteristic protected by law.
This is a general description of the Duties, Responsibilities and Qualifications required for this position. Physical, mental, sensory or environmental demands may be referenced in an attempt to communicate the manner in which this position traditionally is performed. Whenever necessary to provide individuals with disabilities an equal employment opportunity, Capgemini will consider reasonable accommodations that might involve varying job requirements and/or changing the way this job is performed, provided that such accommodations do not pose an undue hardship.
Capgemini is committed to providing reasonable accommodations during our recruitment process. If you need assistance or accommodation, please reach out to your recruiting contact.
Please be aware that Capgemini may capture your image (video or screenshot) during the interview process and that image may be used for verification, including during the hiring and onboarding process.
Click the following link for more information on your rights as an Applicant http://www.capgemini.com/resources/equal-employment-opportunity-is-the-law
Job: Programmer/Analyst
Schedule: Full-time
Primary Location: CA-AB-Calgary
Organization: I&D