GMS-Senior-Databricks Developer

Bengaluru, KA, IN, 560016

EY

With our four integrated service lines (Assurance and audit-related services, Tax, Consulting, and Strategy and Transactions) and our industry knowledge, we support our clients in...



At EY, you’ll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we’re counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all. 

Databricks ETL Developer role for SAP Sources

 

A Databricks ETL developer with basic knowledge of SAP HANA tables should combine data engineering and ETL skills with familiarity with both the Databricks and SAP HANA environments. This mix of skills enables the developer to work effectively with SAP HANA tables and to build robust ETL pipelines within Databricks. Continuous learning and staying current with new technologies in the data engineering field are also important in this role.

 

Key Responsibilities:

  • Design and implement scalable ETL pipelines using Databricks and Apache Spark (a minimal sketch follows this list).
  • Extract data from various sources, including SAP HANA tables, and transform and load it into the desired format for analysis.
  • Write optimized Spark SQL queries for data manipulation and aggregation.
  • Develop and maintain Databricks notebooks for data processing and workflow orchestration.
  • Monitor ETL jobs to ensure performance, reliability, and data quality.
  • Collaborate with data architects to model data and design data warehouse schemas.
  • Work closely with data scientists and analysts to understand data requirements and deliver the necessary data structures.
  • Ensure compliance with data security and privacy standards.
  • Troubleshoot and resolve issues related to data pipelines and performance.
  • Document ETL processes, including data lineage and transformations.
  • Stay up-to-date with the latest advancements in Databricks, Apache Spark, and data engineering best practices.
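
For illustration, a minimal PySpark sketch of the kind of pipeline these responsibilities describe: extract, transform with Spark SQL, a basic data-quality gate, and a Delta load. Paths and table names are hypothetical.

```python
# Minimal ETL sketch; paths and table names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Extract: read raw data from a (hypothetical) landing zone.
raw = spark.read.parquet("/mnt/landing/orders")

# Transform: aggregate with Spark SQL.
raw.createOrReplaceTempView("orders")
daily = spark.sql("""
    SELECT order_date,
           COUNT(*)    AS order_count,
           SUM(amount) AS total_amount
    FROM orders
    WHERE amount IS NOT NULL
    GROUP BY order_date
""")

# Basic data-quality gate: fail fast if the result is empty.
if daily.count() == 0:
    raise ValueError("No rows produced; check the upstream extract")

# Load: write the curated result as a Delta table.
daily.write.format("delta").mode("overwrite").saveAsTable("analytics.daily_orders")
```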

 

Applicants are expected to have the qualifications and skills listed below:

Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field.

 

Databricks Platform Knowledge:

  • Understanding of Databricks Unified Analytics Platform.
  • Experience with Databricks notebooks, jobs, and clusters.
  • Knowledge of Databricks utilities like DBUtils (see the sketch after this list).
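
For illustration, a few common DBUtils calls; these run only inside a Databricks notebook or job, where dbutils is predefined, and the path, widget, and secret names are hypothetical.

```python
# dbutils is predefined in Databricks notebooks; it is not a plain-Python import.

# List files in a (hypothetical) mounted storage location.
for f in dbutils.fs.ls("/mnt/landing"):
    print(f.path, f.size)

# Read a parameter passed to the notebook (hypothetical widget name).
run_date = dbutils.widgets.get("run_date")

# Fetch a credential from a secret scope instead of hardcoding it
# (hypothetical scope and key names).
hana_password = dbutils.secrets.get(scope="etl", key="hana-password")
```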

 

Apache Spark:

  • Proficiency in Apache Spark for large-scale data processing.
  • Ability to write Spark SQL queries for data transformation.
  • Experience with Spark DataFrames and Datasets for ETL operations.
  • Familiarity with Spark architecture and optimization techniques.
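
As a small illustration of these skills, the same aggregation expressed both as a Spark SQL query and through the DataFrame API; both compile to the same optimized plan, and explain() is a common starting point for tuning.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [("A", 10.0), ("A", 5.0), ("B", 7.5)], ["category", "amount"]
)

# The aggregation as a Spark SQL query...
df.createOrReplaceTempView("sales")
by_sql = spark.sql(
    "SELECT category, SUM(amount) AS total FROM sales GROUP BY category"
)

# ...and as the equivalent DataFrame transformation.
by_df = df.groupBy("category").agg(F.sum("amount").alias("total"))

# explain() prints the physical plan, a starting point for optimization.
by_df.explain()
```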

 

Programming Languages:

  • Proficiency in Scala, Python, or Java for writing Spark jobs.
  • Ability to write UDFs (User Defined Functions) in Spark (see the sketch below).
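
A minimal Python UDF sketch with a hypothetical normalization rule; plain UDFs run row by row outside Spark's optimizer, so built-in functions are preferred where they can express the same logic.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(" mat-001 ",), ("MAT-002",)], ["material"])

# Hypothetical normalization rule implemented as a UDF.
@F.udf(returnType=StringType())
def normalize_material(code):
    return code.strip().upper() if code else None

df.withColumn("material_norm", normalize_material("material")).show()
```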

 

ETL Process:

  • Experience in designing and implementing ETL pipelines.
  • Understanding of data modeling, data warehousing, and data architecture.
  • Knowledge of ETL best practices and performance tuning.
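
One common pattern behind these points is an incremental upsert into a Delta table with MERGE; the table names below are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Upsert staged changes into the target Delta table (hypothetical names).
spark.sql("""
    MERGE INTO analytics.customers AS t
    USING staging.customers_updates AS s
    ON t.customer_id = s.customer_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```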

 

SAP HANA:

  • Basic understanding of SAP HANA tables and data structures.
  • Familiarity with SAP HANA SQL and SQLScript for querying and data manipulation.
  • Experience with connecting to SAP HANA databases from external systems (see the JDBC sketch after this list).
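
A sketch of reading from SAP HANA over JDBC, assuming the SAP HANA JDBC driver (ngdbc) is installed on the cluster and the code runs in a Databricks notebook where dbutils is available; host, port, user, and the pushed-down query are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

hana_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sap://hana-host:30015")  # hypothetical host and port
    .option("driver", "com.sap.db.jdbc.Driver")
    .option("user", "ETL_USER")  # hypothetical user
    .option("password", dbutils.secrets.get(scope="etl", key="hana-password"))
    # Push the projection and filter down to HANA rather than pulling the whole table.
    .option("query", "SELECT MATNR, WERKS, LGORT FROM SAPABAP1.MARD WHERE LGORT <> ''")
    .load()
)
```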

 

Data Sources and Integration:

  • Experience with various data sources such as JSON, CSV, Parquet, etc.
  • Knowledge of data ingestion methods from different databases and APIs.
  • Ability to integrate with various data storage systems like S3, Azure Blob Storage, HDFS, etc.
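
For illustration, ingesting these formats from different storage systems; all paths are hypothetical and assume storage access is already configured.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = spark.read.json("s3://raw-bucket/events/2024/")  # JSON from S3
rates = (
    spark.read.option("header", "true")  # CSV from Azure Data Lake Storage
    .option("inferSchema", "true")
    .csv("abfss://raw@myaccount.dfs.core.windows.net/rates/")
)
orders = spark.read.parquet("/mnt/landing/orders")  # Parquet from a mount
```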

 

Cloud Platforms:

  • Understanding of cloud services related to data storage and computation (AWS, Azure, GCP).
  • Experience with cloud-based Databricks deployment.

 

Version Control and CI/CD:

  • Familiarity with version control systems like Git.
  • Understanding of CI/CD pipelines for deploying ETL jobs (see the sketch below).
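
As one hypothetical example, a CI/CD step can trigger a deployed Databricks job through the Jobs REST API (version 2.1); the workspace URL and job ID below are placeholders, and the token is read from the environment.

```python
import os
import requests

resp = requests.post(
    "https://adb-1234567890123456.7.azuredatabricks.net/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json={"job_id": 123},  # hypothetical job ID
    timeout=60,
)
resp.raise_for_status()
print("Triggered run:", resp.json()["run_id"])
```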

 

Monitoring and Logging:

  • Experience with monitoring ETL jobs and pipelines.
  • Knowledge of logging and debugging techniques in Databricks and Spark.
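
A minimal logging sketch for an ETL step; on Databricks, output like this lands in the cluster's driver logs, one common starting point for debugging.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl.daily_orders")

try:
    log.info("Starting daily_orders load")
    # ... run the pipeline step here ...
    log.info("Finished daily_orders load")
except Exception:
    log.exception("daily_orders load failed")
    raise
```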

 

Security and Compliance:

  • Understanding of data security, encryption, and compliance standards.
  • Familiarity with role-based access control within Databricks.
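
For example, table-level access in Databricks can be granted with SQL; the table and group names here are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Grant read access on a curated table to an analyst group (hypothetical names).
spark.sql("GRANT SELECT ON TABLE analytics.daily_orders TO `analysts`")
```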

 

Collaboration and Communication:

  • Ability to work in a team and collaborate with data scientists, analysts, and business stakeholders.
  • Strong problem-solving skills and effective communication.

 

EY | Building a better working world 


 
EY exists to build a better working world, helping to create long-term value for clients, people and society and build trust in the capital markets.  


 
Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform and operate.  


 
Working across assurance, consulting, law, strategy, tax and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.  
