Technical Lead - Databricks, Azure Data Lake, Python

Greenfield, Indiana, United States


We are looking for a highly experienced Data Engineering Specialist to join our team. The ideal candidate will have a strong background in cloud technologies, DevOps practices, and data engineering, and will support and enhance our RDAP initiatives.

Requirements

Key Responsibilities:

Databricks Lakehouse Solutions: Design, develop, and maintain Databricks-based lakehouse solutions on cloud platforms such as Azure (including Azure Synapse) and GCP.

DevOps & CI/CD: Implement and manage CI/CD pipelines with tools such as GitHub, ensuring best practices in test-driven development, code reviews, and branching strategies.

Python Development: Build, manage, and optimize Python packages using tools such as setuptools, Poetry, wheels, and artifact registries (see the sketch below).
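
For illustration, a minimal sketch of the packaging setup this involves, using setuptools; the package name, version, and dependency below are hypothetical placeholders:

```python
# Hypothetical setup.py using setuptools; all names and versions are
# illustrative placeholders, not a real project configuration.
from setuptools import setup, find_packages

setup(
    name="example-data-utils",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["pyspark>=3.4"],
)
```

Running `python -m build` against a project like this produces the wheel that would then be pushed to an artifact registry.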

Data Pipelines & Workflows: Develop and optimize workflows in Databricks (PySpark, Databricks Asset Bundles) for data ingestion, processing, and transformation.
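
For illustration, a minimal PySpark ingest-and-transform sketch of the kind this item describes; the paths, table, and column names are assumptions:

```python
# Hypothetical bronze-layer ingest in Databricks: read raw JSON, normalize
# a timestamp, deduplicate, and append to a Delta table.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw = spark.read.format("json").load("/mnt/landing/orders/")
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .dropDuplicates(["order_id"])
)
clean.write.format("delta").mode("append").saveAsTable("bronze.orders")
```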

Database Management: Work with SQL databases and catalogs, including Unity Catalog, SQL Server, Hive, and Postgres (as sketched below).
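
For illustration, a short query against Unity Catalog's three-level namespace (catalog.schema.table); the names are assumptions, and `spark` is the session Databricks notebooks predefine:

```python
# Hypothetical Spark SQL query using Unity Catalog's
# catalog.schema.table addressing.
recent = spark.sql(
    "SELECT order_id, order_ts FROM main.sales.orders "
    "WHERE order_ts >= '2024-01-01'"
)
recent.show()
```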

Orchestration: Implement data orchestration solutions using tools like Databricks Workflows, Airflow, and Dagster.
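
For illustration, a minimal Airflow DAG sketch (one of the orchestrators named above); the DAG id, schedule, and task body are assumptions:

```python
# Hypothetical daily-ingest DAG; the callable is a placeholder for real
# pipeline logic.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_ingest():
    print("ingest step goes here")

with DAG(
    dag_id="example_daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # `schedule` requires Airflow 2.4+; older versions use `schedule_interval`
    catchup=False,
):
    PythonOperator(task_id="ingest", python_callable=run_ingest)
```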

Event-Driven Architecture: Manage event streaming solutions using Kafka, Azure Event Hub, and Google Cloud Pub/Sub.
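
For illustration, a minimal Kafka producer sketch using the confluent-kafka client; the broker address and topic are assumptions, and Event Hub and Pub/Sub have analogous client libraries:

```python
# Hypothetical event publish; bootstrap server and topic are placeholders.
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce(
    "orders-events",
    key="order-123",
    value=b'{"status": "created"}',
)
producer.flush()  # block until outstanding messages are delivered
```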

Change Data Capture (CDC): Implement CDC strategies with tools like Debezium.
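
For illustration, a sketch of reading Debezium change events off Kafka; the topic name follows Debezium's server.schema.table convention, and all values are assumptions:

```python
# Hypothetical CDC reader: Debezium publishes one message per row change,
# with the operation type in payload.op ('c' create, 'u' update, 'd' delete).
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "cdc-reader",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["pg-server.public.orders"])

msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    change = json.loads(msg.value())
    print(change["payload"]["op"])
consumer.close()
```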

Data Migration: Design and execute data migration projects for Azure Synapse and Databricks Lakehouse.

Cloud Storage Management: Handle cloud storage solutions like Azure Data Lake Storage and Google Cloud Storage.

Identity & Access Management: Configure and manage Azure Active Directory (AD Groups, Service Principals, Managed Identities) for security and authentication.
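
For illustration, a minimal sketch of service-principal authentication against Azure Data Lake Storage with the azure-identity and azure-storage-file-datalake packages; the account and container names are assumptions:

```python
# Hypothetical ADLS Gen2 access via a service principal.
# DefaultAzureCredential picks up AZURE_CLIENT_ID, AZURE_TENANT_ID, and
# AZURE_CLIENT_SECRET from the environment when they are set.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

credential = DefaultAzureCredential()
client = DataLakeServiceClient(
    account_url="https://exampleaccount.dfs.core.windows.net",
    credential=credential,
)
fs = client.get_file_system_client("raw")  # hypothetical container
for path in fs.get_paths():
    print(path.name)
```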

---

Primary Skills (Must-Have):

Python Package Development (setuptools, Poetry, wheels, artifact registries)

Databricks & PySpark (Databricks Asset Bundles)

Open File and Table Formats (Delta Lake, Parquet, Iceberg, etc.)

SQL Databases and Catalogs (Unity Catalog, SQL Server, Hive, Postgres)

Orchestration Tools (Databricks Workflows, Airflow, Dagster)

Azure Data Lake Storage

Azure Active Directory (AD Groups, Service Principals, Managed Identities)

---

Secondary Skills (Good to Have):

Kafka, Azure Event Hub, Google Cloud Pub/Sub

Change Data Capture (Debezium)

Google Cloud Storage

---

Soft Skills & Leadership Responsibilities:

Communication Skills:

Ability to articulate complex technical concepts to both technical and non-technical stakeholders.

Strong documentation skills for process guidelines, technical workflows, and reports.

Problem-Solving & Analytical Thinking:

Strong troubleshooting skills and the ability to diagnose and resolve issues effectively.

Analytical mindset to optimize data workflows and system performance.

Leadership & Collaboration:

Client Interactions: Understand business requirements, contribute to design discussions, and translate requirements into actionable deliverables.

Team Collaboration: Work closely with cross-functional teams across development, operations, and business units.

Stakeholder Engagement: Build and maintain strong relationships with internal and external stakeholders.

