Big Data PySpark Lead - C13 – VP - Pune
PLOT NO-1, S.NO. 77, India
Citi
Citi is a leading global bank for institutions with cross-border needs, a global provider in wealth management and a U.S. personal bank.
The Data Analytics Lead Analyst is a strategic professional who stays abreast of developments within their own field and contributes to directional strategy by considering their application in their own job and the business. The role is a recognized technical authority for an area within the business and requires basic commercial awareness. There are typically multiple people within the business who provide the same level of subject matter expertise. Developed communication and diplomacy skills are required in order to guide, influence and convince others, in particular colleagues in other areas and occasional external customers. The role has a significant impact on the area through complex deliverables and provides advice and counsel related to the technology or operations of the business. Work impacts an entire area, which eventually affects the overall performance and effectiveness of the sub-function/job family.
Responsibilities:
- Deep hands-on experience with PySpark for data processing, ETL (Extract, Transform, Load) operations, data manipulation, and building distributed computing solutions on large datasets.
- Proficiency in designing and building robust data pipelines, data ingestion, transformation, and processing workflows.
- Solid understanding of data modeling principles, database design, and strong SQL skills for data querying and analysis.
- Ability to analyze data, identify patterns, uncover insights, and translate business needs into actionable data solutions.
- Leading and mentoring a team of data engineers or analysts, fostering best practices, and ensuring the delivery of high-quality data products.
- Working closely with product partners and business analysts to understand requirements and deliver impactful analytical solutions.
Qualifications:
To be successful in this role, you should meet the following requirements:
- 8+ years of experience in handling distributed / big data projects.
- Proficiency in PySpark, Linux scripting, SQL and big data tools.
- Technology stack – PySpark, ETL, Unix Shell Scripting, Python, Spark, SQL, Impala, Hive.
- Strong exposure to interpreting business requirements from a technical perspective. Ability to design, develop and implement IT solutions that fulfill business users' requirements and conform to a high quality standard.
- Sound problem-solving skills and attention to detail.
- Strong communication, presentation and team collaboration skills.
- Knowledge of Automation and DevOps practices.
- Familiarity with agile development methodologies using Jira.
Education:
- Bachelor’s/University degree or equivalent experience, potentially Master’s degree.
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Data Analytics
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.
Tags: Agile, Big Data, Data Analytics, Data pipelines, DevOps, ETL, Jira, Linux, Pipelines, PySpark, Python, Shell scripting, Spark, SQL
Perks/benefits: Career development