Data Engineering Senior Associate

Bangalore (SDC) - Bagmane Tech Park, India

PwC

We are a community of solvers combining human ingenuity, experience and technology innovation to help organisations build trust and deliver sustained outcomes.


Line of Service

Advisory

Industry/Sector

Not Applicable

Specialism

Advisory - Other

Management Level

Senior Associate

Job Description & Summary

At PwC, our people in data and analytics engineering focus on leveraging advanced technologies and techniques to design and develop robust data solutions for clients. They play a crucial role in transforming raw data into actionable insights, enabling informed decision-making and driving business growth.

In data engineering at PwC, you will focus on designing and building data infrastructure and systems to enable efficient data processing and analysis. You will be responsible for developing and implementing data pipelines, data integration, and data transformation solutions.

Job Description and Key Responsibilities

  • Design, develop, and maintain robust, scalable ETL pipelines using tools like Apache Spark, Kafka, and other big data technologies.
  • Data architecture design: design scalable and reliable data architectures, including Lakehouse, hybrid batch/streaming systems, and Lambda and Kappa architectures.
  • Demonstrate proficiency in Python, PySpark, Spark, and a solid understanding of design patterns (e.g., SOLID).
  • Ingest, process, and store structured, semi-structured, and unstructured data from various sources.
  • Cloud experience: hands-on experience setting up data pipelines using cloud offerings (AWS, Azure, GCP).
  • Optimize ETL processes to ensure scalability and efficiency.
  • Work with various file formats, such as JSON, CSV, Parquet, and Avro.
  • Possess deep knowledge of RDBMS, NoSQL databases, and CAP theorem principles.
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and optimize data models for performance and scalability.
  • Document data processes, architectures, and models comprehensively to facilitate cross-team understanding and maintenance.
  • Implement and maintain CI/CD pipelines using tools like Docker, Kubernetes, and GitHub.
  • Ensure data quality, integrity, and security across all systems and processes.
  • Implement and monitor data governance best practices.
  • Stay up-to-date with emerging data technologies and trends, and identify opportunities for innovation and improvement.
  • Knowledge of other cloud data/integration/orchestration platforms (Snowflake, Databricks, Azure Data Factory, etc.) is good to have.

GenAI Skills

  • Leverage Large Language Models (LLMs) to generate and manage synthetic datasets for training AI models.
  • Integrate Generative AI tools into data pipelines while critically analyzing and validating Gen AI-generated solutions to ensure reliability and adherence to best practices.

Minimum experience required: 4–7 years with a programming language (any of Python, Scala, or Java; Python preferred), Apache Spark, ADF, Azure Databricks, and Postgres; know-how of NoSQL is desirable; ETL (batch/streaming), Git, and familiarity with Agile.

Required qualification: BE / Master's in Design / B-Design / B.Tech / HCI certification (preferred)

Education (if blank, degree and/or field of study not specified)

Degrees/Field of Study required:

Degrees/Field of Study preferred:

Certifications (if blank, certifications not specified)

Required Skills

Optional Skills

Accepting Feedback, Active Listening, Agile Scalability, Amazon Web Services (AWS), Analytical Thinking, Apache Hadoop, Azure Data Factory, Communication, Creativity, Data Anonymization, Database Administration, Database Management System (DBMS), Database Optimization, Database Security Best Practices, Data Engineering, Data Engineering Platforms, Data Infrastructure, Data Integration, Data Lake, Data Modeling, Data Pipeline, Data Quality, Data Transformation, Data Validation {+ 18 more}

Desired Languages (If blank, desired languages not specified)

Travel Requirements

Available for Work Visa Sponsorship?

Government Clearance Required?

Job Posting End Date

March 27, 2025


Category: Engineering Jobs


Region: Asia/Pacific
Country: India
