Databricks Data Engineer
Hyderabad, Telangana, India - Remote
Xenon7
Cutting-edge AI solutions backed by elite researchers and industry experts. Transform your business with our data-driven, industry-specific AI applications.

About us:
At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting-edge projects that leverage the latest technologies across various domains of IT including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and on-demand resources allows us to partner with clients on transformative initiatives, driving innovation and business growth. Whether it's empowering global organizations or collaborating with trailblazing startups, we are committed to delivering advanced, impactful solutions that meet today’s most complex challenges.
About the Client:
Join a Fortune 500 leader in the pharmaceutical industry that is looking to innovate and expand its technological capabilities. You would join a product team working on the client's self-service, company-wide platform, which enables all teams and business units to deploy their AI solutions and make them accessible across the entire company.
Role Overview:
We are looking for a skilled Data Engineer to support a high-impact Lakehouse migration initiative. The ideal candidate will have strong experience in building scalable data pipelines using PySpark and Databricks, with a solid understanding of lakehouse architecture and collaborative workflows using Git-based tools. You will be part of a cross-functional team responsible for modernizing data infrastructure and enabling analytical capabilities at scale.
Responsibilities:
- Design, build, and maintain scalable ETL/ELT pipelines using PySpark and Databricks Notebooks
- Work with Unity Catalog to manage data governance, lineage, and access control
- Collaborate with data scientists, architects, and analysts to ensure pipeline quality and data availability
- Develop and maintain workflow orchestration using Databricks Workflows or equivalent
- Optimize SQL queries and data transformations for performance and cost-efficiency
- Use Bitbucket or GitHub to manage source control and CI/CD practices
- Participate in code reviews, testing, and performance tuning of data processing jobs
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- At least 5 years of proven experience as a Data Engineer, with specific experience in Databricks
- Experience in data engineering with Python, PySpark, and SQL
- Hands-on experience with Databricks, including Unity Catalog, workflows, and notebooks
- Strong understanding of lakehouse architecture principles
- Experience using Bitbucket or GitHub in a collaborative team environment
- Familiarity with versioning, branching, and CI/CD practices
- Excellent problem-solving skills and attention to detail
Nice to Have:
- Experience with AWS or Azure cloud platforms
- Exposure to Snowflake
- Familiarity with Jenkins or other CI/CD automation tools
Perks/benefits: Startup environment