Senior Analyst - Data Engineering Developer
Mumbai - Puma Vikhroli Office, India
Puma Energy
A reliable global energy company that delivers top-tier fuels and lubricants to retail and commercial clients worldwide, with a strong presence in Africa.
Main Purpose:
▪Collaborate with data scientists and business stakeholders to design, develop, and maintain efficient data pipelines feeding into the organization's data lake.
▪Maintain the integrity and quality of the data lake, enabling accurate and actionable insights for data scientists and informed decision-making for business stakeholders.
▪Utilize extensive knowledge of data engineering and cloud technologies to enhance the organization’s data infrastructure, promoting a culture of data-driven decision-making.
▪Apply data engineering expertise to define and optimize data pipelines using advanced concepts to improve the efficiency and accessibility of data storage.
▪Own the development of an extensive data catalog, ensuring robust data governance and facilitating effective data access and utilization across the organization.
Knowledge, Skills and Abilities, Key Responsibilities:
Key Responsibilities
•Contribute to the development of scalable and performant data pipelines on Databricks, leveraging Delta Lake, Delta Live Tables (DLT), and other core Databricks components.
•Develop data lakes/warehouses designed for optimized storage, querying, and real-time updates using Delta Lake.
•Implement effective data ingestion strategies from various sources (streaming, batch, API-based), ensuring seamless integration with Databricks.
•Ensure the integrity, security, quality, and governance of data across our Databricks-centric platforms.
•Collaborate with stakeholders (data scientists, analysts, product teams) to translate business requirements into Databricks-native data solutions.
•Build and maintain ETL/ELT processes, heavily utilizing Databricks, Spark (Scala or Python), SQL, and Delta Lake for transformations.
•Apply CI/CD and DevOps practices specifically tailored for the Databricks environment.
•Monitor and optimize the cost-efficiency of data operations on Databricks, ensuring optimal resource utilization.
•Utilize a range of Databricks tools, including the Databricks CLI and REST API, alongside Apache Spark™, to develop, manage, and optimize data engineering solutions.
Key Relationships and Department Overview:
Key Relationships
•Internal – Data Engineering Manager
•Developers across various departments, Managers of Departments in other regional hubs of Puma Energy
•External – Platform providers
Tags: APIs CI/CD Databricks Data governance DataOps Data pipelines DevOps ELT Engineering ETL Pipelines Python REST API Scala Security Spark SQL Streaming