Data Engineering Lead - Databricks

Pune, Maharashtra, India

Codvo.ai

Codvo AI delivers strategic enterprise solutions that transform your data into measurable value. We help businesses accelerate growth through custom AI implementations that adapt and scale with your needs.



About Codvo.ai

Codvo.ai is a next-gen AI and engineering company helping global enterprises transform through Generative AI, cloud-native platforms, and product engineering. With proprietary platforms like NeIO and Pulse, we enable faster, smarter, and scalable digital transformation for industries including Energy, Retail, Travel, BFSI, and Healthcare. As we gear up to launch new AI-powered products and expand our global presence, we are looking for leaders who can define how Codvo.ai influences the market and shapes perception.

Role Summary:

We are seeking an experienced and highly motivated Data Engineering Lead to spearhead our data engineering initiatives, with a primary focus on the Databricks Platform. You will lead a talented team of data engineers, setting technical direction and designing, building, and optimizing robust, scalable, and reliable data pipelines and infrastructure. Your expertise in Databricks and modern data engineering practices will be crucial in enabling advanced analytics, machine learning, and business intelligence across the organization.

Key Responsibilities:
  1. Team Leadership & Mentoring:

Lead, manage, and mentor a team of data engineers, fostering a culture of technical excellence, innovation, and collaboration.

Provide technical guidance, conduct code reviews, and establish best practices for data engineering within the team.

Manage team workload, project prioritization, and delivery timelines.

Contribute to hiring, onboarding, and professional development of team members.

  2. Architecture & Design:

Design and architect scalable, end-to-end data solutions on the Databricks Platform (including data ingestion, processing, storage, and serving layers).

Champion and implement best practices for data modeling (e.g., medallion architecture), data warehousing, and ETL/ELT processes within Databricks (Delta Lake, Unity Catalog); see the bronze-to-silver sketch after this list.

Evaluate and recommend new technologies, tools, and approaches to enhance our data platform capabilities.

Ensure solutions adhere to security, governance (leveraging Unity Catalog), and compliance standards.
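To make the medallion pattern above concrete, here is a minimal PySpark sketch of a bronze-to-silver flow on Delta Lake. All paths, table names, and columns (raw JSON orders landing in bronze.orders, cleaned into silver.orders) are hypothetical placeholders, not part of this posting:

```python
# Minimal medallion sketch; all paths, table names, and columns are
# hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

# Bronze: land the raw feed as-is, tagged with ingestion metadata.
bronze = (
    spark.read.format("json").load("/mnt/raw/orders/")
    .withColumn("_ingested_at", F.current_timestamp())
)
bronze.write.format("delta").mode("append").saveAsTable("bronze.orders")

# Silver: typed, deduplicated, validated records derived from bronze.
silver = (
    spark.read.table("bronze.orders")
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
    .dropDuplicates(["order_id"])
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")
```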

  3. Development & Implementation:

Oversee and contribute hands-on to the development, testing, and deployment of data pipelines using Spark (PySpark/Scala/SQL), Databricks Workflows, Delta Live Tables, and other relevant tools.

Optimize data processing jobs for performance, cost-efficiency, and reliability on Databricks clusters.

Implement robust data quality checks, monitoring, and alerting mechanisms; see the expectations sketch after this list.
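As one sketch of such quality checks, a Delta Live Tables pipeline can gate records with expectations. The dataset name (bronze_orders) and the rules below are illustrative assumptions, not requirements from this posting:

```python
# Illustrative Delta Live Tables pipeline; source name and rules are assumptions.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Orders with basic quality gates applied")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop failing rows
@dlt.expect("positive_amount", "amount > 0")  # record violations, keep rows
def silver_orders():
    return (
        dlt.read_stream("bronze_orders")  # hypothetical bronze dataset
        .withColumn("order_ts", F.to_timestamp("order_ts"))
    )
```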

  4. Collaboration & Strategy:

Collaborate closely with data scientists, analysts, BI developers, product managers, and other stakeholders to understand data requirements and deliver effective solutions.

Translate business needs into technical specifications and data architecture designs.

Contribute to the overall data strategy and roadmap for the organization.

Act as a subject matter expert for Databricks within the company.

  5. Operations & Support:

Ensure the operational health and performance of the data platform and pipelines.

Lead troubleshooting efforts for complex data-related issues.

Implement and refine CI/CD processes for data engineering workflows; the unit-test sketch below shows one piece such a pipeline might run.
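As one example of CI/CD for data pipelines, a CI job might run pytest-style unit tests against transformation logic on every commit. The function, table columns, and test data below are hypothetical:

```python
# Hypothetical transformation plus a unit test a CI job could run with pytest.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F

def clean_orders(df: DataFrame) -> DataFrame:
    # Drop records without an order_id, then deduplicate on it.
    return df.filter(F.col("order_id").isNotNull()).dropDuplicates(["order_id"])

def test_clean_orders():
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    raw = spark.createDataFrame(
        [("o1", 10.0), ("o1", 10.0), (None, 5.0)],
        ["order_id", "amount"],
    )
    # One valid, deduplicated row should survive.
    assert clean_orders(raw).count() == 1
```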


Required Qualifications:
  • Experience:
      • Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
      • 7+ years of overall experience in data engineering, data warehousing, or software engineering with a data focus.
      • 2+ years of experience in a technical leadership or team lead role, mentoring junior engineers.
      • 3+ years of hands-on experience designing, building, and optimizing data solutions using Databricks.
  • Technical Skills:
      • Deep understanding and extensive practical experience with the Databricks Platform (Delta Lake, Spark SQL, Structured Streaming, Workflows, Unity Catalog, DBFS).
      • Proficiency in Spark programming (PySpark or Scala).
      • Strong SQL skills and experience with data modeling techniques.
      • Proven experience building and managing ETL/ELT pipelines.
      • Experience with at least one major cloud platform (AWS, Azure, or GCP) and its core data services (e.g., S3/ADLS Gen2, Glue/Data Factory, IAM).
      • Experience with version control systems (e.g., Git) and CI/CD practices.
  • Leadership & Soft Skills:
      • Excellent leadership, communication, and interpersonal skills.
      • Ability to articulate complex technical concepts to both technical and non-technical audiences.
      • Strong problem-solving and analytical skills.
      • Proven ability to manage multiple priorities and deliver projects on time.
Preferred Qualifications:
  • Databricks Certified Data Engineer Professional or Associate.
  • Experience with Delta Live Tables (DLT).
  • Experience implementing data governance solutions, particularly using Databricks Unity Catalog.
  • Experience with streaming data technologies (e.g., Kafka, Kinesis, Databricks Structured Streaming).
  • Familiarity with MLOps concepts and tools (e.g., MLflow).
  • Experience with infrastructure-as-code tools (e.g., Terraform, ARM Templates, CloudFormation).
  • Experience working in an Agile/Scrum environment.
  • Knowledge of containerization technologies (Docker, Kubernetes).
