Technical Lead - Databricks, Azure Data Lake, Python

Greenfield, Indiana, United States


We are looking for a highly experienced Data Engineering Specialist to join our team. The ideal candidate will have a strong background in cloud technologies, DevOps practices, and data engineering, and will support and enhance our RDAP initiatives.

Requirements

Key Responsibilities:

Databricks Lakehouse Solutions: Design, develop, and maintain Databricks-based lakehouse solutions on cloud platforms such as Azure (including Azure Synapse) and GCP.

DevOps & CI/CD: Implement and manage CI/CD pipelines with tools such as GitHub, ensuring best practices in test-driven development, code reviews, and branching strategies.

Python Development: Build, manage, and optimize Python packages using tools such as setuptools, Poetry, wheels, and artifact registries (see the sketch below).
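
For illustration, a minimal sketch of the packaging setup this involves, using setuptools; the package name, version, and dependency below are hypothetical placeholders:

```python
# Hypothetical setup.py using setuptools; all names and versions are
# illustrative placeholders, not a real project configuration.
from setuptools import setup, find_packages

setup(
    name="example-data-utils",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["pyspark>=3.4"],
)
```

Running `python -m build` against a project like this produces the wheel that would then be pushed to an artifact registry.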

Data Pipelines & Workflows: Develop and optimize workflows in Databricks (PySpark, Databricks Asset Bundles) for data ingestion, processing, and transformation.
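
For illustration, a minimal PySpark ingest-and-transform sketch of the kind this item describes; the paths, table, and column names are assumptions:

```python
# Hypothetical bronze-layer ingest in Databricks: read raw JSON, normalize
# a timestamp, deduplicate, and append to a Delta table.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw = spark.read.format("json").load("/mnt/landing/orders/")
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .dropDuplicates(["order_id"])
)
clean.write.format("delta").mode("append").saveAsTable("bronze.orders")
```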

Database Management: Work with SQL databases and catalogs, including Unity Catalog, SQL Server, Hive, and Postgres (as sketched below).
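
For illustration, a short query against Unity Catalog's three-level namespace (catalog.schema.table); the names are assumptions, and `spark` is the session Databricks notebooks predefine:

```python
# Hypothetical Spark SQL query using Unity Catalog's
# catalog.schema.table addressing.
recent = spark.sql(
    "SELECT order_id, order_ts FROM main.sales.orders "
    "WHERE order_ts >= '2024-01-01'"
)
recent.show()
```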

Orchestration: Implement data orchestration solutions using tools like Databricks Workflows, Airflow, and Dagster.
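
For illustration, a minimal Airflow DAG sketch (one of the orchestrators named above); the DAG id, schedule, and task body are assumptions:

```python
# Hypothetical daily-ingest DAG; the callable is a placeholder for real
# pipeline logic.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_ingest():
    print("ingest step goes here")

with DAG(
    dag_id="example_daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # `schedule` requires Airflow 2.4+; older versions use `schedule_interval`
    catchup=False,
):
    PythonOperator(task_id="ingest", python_callable=run_ingest)
```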

Event-Driven Architecture: Manage event streaming solutions using Kafka, Azure Event Hub, and Google Cloud Pub/Sub.
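
For illustration, a minimal Kafka producer sketch using the confluent-kafka client; the broker address and topic are assumptions, and Event Hub and Pub/Sub have analogous client libraries:

```python
# Hypothetical event publish; bootstrap server and topic are placeholders.
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce(
    "orders-events",
    key="order-123",
    value=b'{"status": "created"}',
)
producer.flush()  # block until outstanding messages are delivered
```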

Change Data Capture (CDC): Implement CDC strategies with tools like Debezium.
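
For illustration, a sketch of reading Debezium change events off Kafka; the topic name follows Debezium's server.schema.table convention, and all values are assumptions:

```python
# Hypothetical CDC reader: Debezium publishes one message per row change,
# with the operation type in payload.op ('c' create, 'u' update, 'd' delete).
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "cdc-reader",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["pg-server.public.orders"])

msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    change = json.loads(msg.value())
    print(change["payload"]["op"])
consumer.close()
```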

Data Migration: Design and execute data migration projects for Azure Synapse and Databricks Lakehouse.

Cloud Storage Management: Handle cloud storage solutions like Azure Data Lake Storage and Google Cloud Storage.

Identity & Access Management: Configure and manage Azure Active Directory (AD Groups, Service Principals, Managed Identities) for security and authentication.
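
For illustration, a minimal sketch of service-principal authentication against Azure Data Lake Storage with the azure-identity and azure-storage-file-datalake packages; the account and container names are assumptions:

```python
# Hypothetical ADLS Gen2 access via a service principal.
# DefaultAzureCredential picks up AZURE_CLIENT_ID, AZURE_TENANT_ID, and
# AZURE_CLIENT_SECRET from the environment when they are set.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

credential = DefaultAzureCredential()
client = DataLakeServiceClient(
    account_url="https://exampleaccount.dfs.core.windows.net",
    credential=credential,
)
fs = client.get_file_system_client("raw")  # hypothetical container
for path in fs.get_paths():
    print(path.name)
```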

---

Primary Skills (Must-Have):

Python Package Development (setuptools, Poetry, wheels, artifact registries)

Databricks & PySpark (Databricks Asset Bundles)

Open File and Table Formats (Delta Lake, Parquet, Iceberg, etc.)

SQL Databases and Catalogs (Unity Catalog, SQL Server, Hive, Postgres)

Orchestration Tools (Databricks Workflows, Airflow, Dagster)

Azure Data Lake Storage

Azure Active Directory (AD Groups, Service Principals, Managed Identities)

---

Secondary Skills (Good to Have):

Kafka, Azure Event Hub, Google Cloud Pub/Sub

Change Data Capture (Debezium)

Google Cloud Storage

---

Soft Skills & Leadership Responsibilities:

Communication Skills:

Ability to articulate complex technical concepts to both technical and non-technical stakeholders.

Strong documentation skills for process guidelines, technical workflows, and reports.

Problem-Solving & Analytical Thinking:

Strong troubleshooting skills and the ability to diagnose and resolve issues effectively.

Analytical mindset to optimize data workflows and system performance.

Leadership & Collaboration:

Client Interactions: Understand business requirements, contribute to design discussions, and translate requirements into actionable deliverables.

Team Collaboration: Work closely with cross-functional teams across development, operations, and business units.

Stakeholder Engagement: Build and maintain strong relationships with internal and external stakeholders.

