PySpark Developer - Chennai

IT BUILDING, RAMANUJAN IT SEZ

Citi

Citi is a leading global bank for institutions with cross-border needs, a global provider in wealth management and a U.S. personal bank.


Job Title: Data Engineer – C10/Officer (India)

The Role

The Data Engineer is accountable for developing high-quality data products to support the Bank’s regulatory requirements and data-driven decision making. A Data Engineer will serve as an example to other team members, work closely with customers, and remove or escalate roadblocks. By applying their knowledge of data architecture standards, data warehousing, data structures, and business intelligence, they will contribute to business outcomes on an agile team.

Responsibilities

  • Develop and support scalable, extensible, and highly available data solutions
  • Deliver on critical business priorities while ensuring alignment with the wider architectural vision
  • Identify and help address potential risks in the data supply chain
  • Follow and contribute to technical standards
  • Design and develop analytical data models

Required Qualifications & Work Experience

  • First Class Degree in Engineering/Technology (4-year graduate course)
  • 3 to 4 years’ experience implementing data-intensive solutions using agile methodologies
  • Experience of relational databases and using SQL for data querying, transformation and manipulation
  • Experience of modelling data for analytical consumers
  • Ability to automate and streamline the build, test and deployment of data pipelines
  • Experience in cloud native technologies and patterns
  • A passion for learning new technologies, and a desire for personal growth, through self-study, formal classes, or on-the-job training
  • Excellent communication and problem-solving skills

Technical Skills (Must Have)

  • ETL: Hands-on experience building data pipelines. Proficiency in at least one data integration platform such as Apache Spark, Talend, or Informatica
  • Big Data: Exposure to ‘big data’ platforms such as Hadoop, Hive or Snowflake for data storage and processing
  • Data Warehousing & Database Management: Understanding of Data Warehousing concepts, Relational (Oracle, MSSQL, MySQL) and NoSQL (MongoDB, DynamoDB) database design
  • Data Modeling & Design: Good exposure to data modeling techniques; design, optimization and maintenance of data models and data structures
  • Languages: Proficient in one or more programming languages commonly used in data engineering such as Python
  • DevOps: Exposure to concepts and enablers: CI/CD platforms, version control, automated quality control management

Technical Skills (Valuable)

  • Apache Spark/PySpark: 3-5 years of experience using PySpark to develop scalable and efficient data processing applications. Experience writing PySpark code to handle large data sets and perform data transformations; familiarity with PySpark integration with other Apache Spark components such as Spark SQL; understanding of PySpark optimization techniques such as caching, partitioning, and broadcasting
  • Data Quality & Controls: Exposure to data validation, cleansing, enrichment and data controls
  • Containerization: Fair understanding of containerization platforms like Docker, Kubernetes
  • File Formats: Exposure to working with event/file/table formats such as Avro, Parquet, Protobuf, Iceberg, Delta
  • Others: Basics of job schedulers such as Autosys and of entitlement management

Certification on any of the above topics would be an advantage.

------------------------------------------------------

Job Family Group: Technology

Job Family: Applications Development

Time Type: Full time

------------------------------------------------------

Citi is an equal opportunity and affirmative action employer.

Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Citigroup Inc. and its subsidiaries ("Citi") invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View the "EEO is the Law" poster. View the EEO is the Law Supplement.

View the EEO Policy Statement.

View the Pay Transparency Posting



Perks/benefits: Career development

Region: Asia/Pacific
Country: India
