Data Engineering Lead - Databricks
Pune, Maharashtra, India
Codvo.ai
Codvo AI delivers strategic enterprise solutions that transform your data into measurable value. We help businesses accelerate growth through custom AI implementations that adapt and scale with your needs.

Key Responsibilities:
- Team Leadership & Mentoring:
Lead, manage, and mentor a team of data engineers, fostering a culture of technical excellence, innovation, and collaboration.
Provide technical guidance, conduct code reviews, and establish best practices for data engineering within the team.
Manage team workload, project prioritization, and delivery timelines.
Contribute to hiring, onboarding, and professional development of team members.
- Architecture & Design:
Design and architect scalable, end-to-end data solutions on the Databricks Platform (including data ingestion, processing, storage, and serving layers).
Champion and implement best practices for data modeling (e.g., medallion architecture; see the sketch after this section), data warehousing, and ETL/ELT processes within Databricks (Delta Lake, Unity Catalog).
Evaluate and recommend new technologies, tools, and approaches to enhance our data platform capabilities.
Ensure solutions adhere to security, governance (leveraging Unity Catalog), and compliance standards.
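To make the medallion pattern mentioned above concrete, here is a minimal PySpark sketch of bronze-to-silver-to-gold Delta tables. The table names, paths, and columns are illustrative assumptions, not part of this role's actual platform, and it assumes an environment where the Delta format is available (a Databricks cluster, or a local Spark session with the delta-spark package configured).

```python
# Minimal medallion sketch: bronze (raw) -> silver (cleaned) -> gold (aggregated).
# All table names, paths, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw events as-is, preserving source fidelity.
raw = spark.read.json("/landing/orders/")  # hypothetical landing path
raw.write.format("delta").mode("append").saveAsTable("bronze.orders")

# Silver: deduplicate and enforce basic types and quality rules.
silver = (
    spark.table("bronze.orders")
    .dropDuplicates(["order_id"])
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_ts", F.to_timestamp("order_ts"))
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")

# Gold: business-level aggregate for downstream BI serving.
gold = (
    spark.table("silver.orders")
    .groupBy("customer_id")
    .agg(F.sum("amount").alias("lifetime_value"))
)
gold.write.format("delta").mode("overwrite").saveAsTable("gold.customer_ltv")
```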
- Development & Implementation:
Oversee and contribute hands-on to the development, testing, and deployment of data pipelines using Spark (PySpark/Scala/SQL), Databricks Workflows, Delta Live Tables, and other relevant tools.
Optimize data processing jobs for performance, cost-efficiency, and reliability on Databricks clusters.
Implement robust data quality checks, monitoring, and alerting mechanisms.
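As one concrete flavor of such checks, Delta Live Tables supports declarative expectations. The sketch below (with hypothetical table and column names) drops rows that violate a constraint and records violation counts in the pipeline event log; note it runs only inside a Databricks DLT pipeline, not as a standalone script.

```python
# Declarative data quality with Delta Live Tables expectations.
# Runs only inside a Databricks DLT pipeline; names are hypothetical.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Orders with basic quality constraints enforced")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
@dlt.expect_or_drop("positive_amount", "amount > 0")
@dlt.expect("recent_order", "order_ts >= '2020-01-01'")  # warn only; rows kept
def clean_orders():
    return dlt.read("raw_orders").withColumn("ingested_at", F.current_timestamp())
```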
- Collaboration & Strategy:
Collaborate closely with data scientists, analysts, BI developers, product managers, and other stakeholders to understand data requirements and deliver effective solutions.
Translate business needs into technical specifications and data architecture designs.
Contribute to the overall data strategy and roadmap for the organization.
Act as a subject matter expert for Databricks within the company.
- Operations & Support:
Ensure the operational health and performance of the data platform and pipelines.
Lead troubleshooting efforts for complex data-related issues.
Implement and refine CI/CD processes for data engineering workflows.
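On the CI/CD side, one common building block is unit-testing transformation logic against a local Spark session so it can run in any CI runner before deployment. The transformation, schema, and test data below are hypothetical; the test needs only pyspark and pytest installed.

```python
# Hypothetical pytest unit test for a pipeline transformation,
# runnable in CI with only pyspark and pytest installed.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

def dedupe_orders(df):
    """Transformation under test: drop duplicate and null order IDs."""
    return df.dropDuplicates(["order_id"]).filter(F.col("order_id").isNotNull())

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[2]").appName("ci-tests").getOrCreate()

def test_dedupe_orders(spark):
    df = spark.createDataFrame(
        [(1, "a"), (1, "a"), (None, "b")], ["order_id", "payload"]
    )
    result = dedupe_orders(df)
    assert result.count() == 1
    assert result.first()["order_id"] == 1
```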
Required Qualifications:
- Experience:
- Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
- 7+ years of overall experience in data engineering, data warehousing, or software engineering with a data focus.
- 2+ years of experience in a technical leadership or team lead role, mentoring junior engineers.
- 3+ years of hands-on experience designing, building, and optimizing data solutions using Databricks.
- Technical Skills:
- Deep understanding and extensive practical experience with the Databricks Platform (Delta Lake, Spark SQL, Structured Streaming, Workflows, Unity Catalog, DBFS).
- Proficiency in Spark programming (PySpark or Scala).
- Strong SQL skills and experience with data modeling techniques.
- Proven experience building and managing ETL/ELT pipelines.
- Experience with at least one major cloud platform (AWS, Azure, or GCP) and its core data services (e.g., S3/ADLS Gen2, Glue/Data Factory, IAM).
- Experience with version control systems (e.g., Git) and CI/CD practices.
- Leadership & Soft Skills:
- Excellent leadership, communication, and interpersonal skills.
- Ability to articulate complex technical concepts to both technical and non-technical audiences.
- Strong problem-solving and analytical skills.
- Proven ability to manage multiple priorities and deliver projects on time.
Preferred Qualifications:
- Databricks Certified Data Engineer Professional or Associate certification.
- Experience with Delta Live Tables (DLT).
- Experience implementing data governance solutions, particularly using Databricks Unity Catalog.
- Experience with streaming data technologies (e.g., Kafka, Kinesis, Databricks Structured Streaming); see the sketch after this list.
- Familiarity with MLOps concepts and tools (e.g., MLflow).
- Experience with infrastructure-as-code tools (e.g., Terraform, ARM Templates, CloudFormation).
- Experience working in an Agile/Scrum environment.
- Knowledge of containerization technologies (e.g., Docker, Kubernetes).
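To illustrate the streaming item above, a minimal Structured Streaming read from Kafka into a Delta table might look like the sketch below. The broker address, topic, and paths are hypothetical, and the spark-sql-kafka connector plus Delta support must be available on the cluster.

```python
# Minimal Structured Streaming sketch: Kafka -> Delta (bronze).
# Broker, topic, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "orders")                     # hypothetical topic
    .load()
    .select(
        F.col("value").cast("string").alias("raw_event"),
        F.col("timestamp").alias("event_ts"),
    )
)

(
    events.writeStream.format("delta")
    .option("checkpointLocation", "/chk/orders")  # hypothetical path
    .outputMode("append")
    .start("/bronze/orders_stream")               # hypothetical path
)
```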