Databricks Data Engineer
Hyderabad, Telangana, India - Remote
Xenon7
Cutting-edge AI solutions backed by elite researchers and industry experts. Transform your business with our data-driven, industry-specific AI applications.

About us:
At Xenon7, we work with leading enterprises and innovative startups on exciting, cutting-edge projects that leverage the latest technologies across various domains of IT including Data, Web, Infrastructure, AI, and many others. Our expertise in IT solutions development and on-demand resources allows us to partner with clients on transformative initiatives, driving innovation and business growth. Whether it's empowering global organizations or collaborating with trailblazing startups, we are committed to delivering advanced, impactful solutions that meet today’s most complex challenges.
About the Client:
Join a Fortune 500 leader in the pharmaceutical industry that is looking to innovate and expand its technological capabilities. You would join a product team working on the client's self-service, company-wide platform, which enables all teams and business units to deploy their AI solutions and make them accessible across the entire company.
Role Overview:
We are looking for a skilled Data Engineer to support a high-impact Lakehouse migration initiative. The ideal candidate will have strong experience in building scalable data pipelines using PySpark and Databricks, with a solid understanding of lakehouse architecture and collaborative workflows using Git-based tools. You will be part of a cross-functional team responsible for modernizing data infrastructure and enabling analytical capabilities at scale.
Responsibilities:
- Design, build, and maintain scalable ETL/ELT pipelines using PySpark and Databricks Notebooks
- Work with Unity Catalog to manage data governance, lineage, and access control
- Collaborate with data scientists, architects, and analysts to ensure pipeline quality and data availability
- Develop and maintain workflow orchestration using Databricks Workflows or equivalent
- Optimize SQL queries and data transformations for performance and cost-efficiency
- Use Bitbucket or GitHub to manage source control and CI/CD practices
- Participate in code reviews, testing, and performance tuning of data processing jobs
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- At least 5 years of proven experience as a Data Engineer, with specific experience in Databricks
- Experience in data engineering with Python, PySpark, and SQL
- Hands-on experience with Databricks, including Unity Catalog, workflows, and notebooks
- Strong understanding of lakehouse architecture principles
- Experience using Bitbucket or GitHub in a collaborative team environment
- Familiarity with versioning, branching, and CI/CD practices
- Excellent problem-solving skills and attention to detail
Nice to Have:
- Experience with AWS or Azure cloud platforms
- Exposure to Snowflake
- Familiarity with Jenkins or other CI/CD automation tools
Perks/benefits: Startup environment