Big Data Engineer - C13 - IRVING

6400 LAS COLINAS BLVD IRVING, United States

Citi

Citi is a leading global bank for institutions with cross-border needs, a global provider in wealth management and a U.S. personal bank.

The Big Data Engineer is a strategic professional who combines deep domain knowledge with modern data engineering and AI capabilities to drive high-impact solutions. The role demands strong commercial awareness, technical leadership, and the ability to influence key stakeholders. The ideal candidate will have a strong foundation in traditional ML techniques, deep learning, and Generative AI (GenAI), along with hands-on experience in big data platforms and data engineering stacks (e.g., Hadoop, PySpark, Snowflake). This individual contributes directly to evolving data strategies by incorporating GenAI models and LLM pipelines into analytical processes and enterprise applications.

The work has broad business implications across lines of business (LOBs) and often influences the operational efficiency and risk strategies of multiple functions.

Responsibilities:

  • Integrates GenAI and LLM-based strategies (RAG, embeddings, prompt engineering) within analytics workflows.
  • Collaborates across teams to define scalable and secure pipelines leveraging Hadoop, PySpark, and Snowflake.
  • Applies deep understanding of data ingestion, transformation, and feature engineering on large datasets for real-time and batch processing.
  • Leverages vector databases (e.g., FAISS, pgvector) and embedding stores to enable semantic search and similarity-based analytics.
  • Builds, optimizes, and governs model pipelines supporting both rule-based analytics and ML-powered decision engines.
  • Advises business units on emerging GenAI use cases, helping to translate domain challenges into ML opportunities.
  • Ensures production-grade deployments by working closely with MLOps, DataOps, and Platform Teams.
  • Contributes to the data governance and policy frameworks ensuring compliance and ethical use of GenAI.


Qualifications:

  • 6–10 years of experience in advanced analytics, data science, or engineering roles.
  • 4–5 years of hands-on machine learning experience using Python and PySpark.
  • Strong knowledge of the Hadoop ecosystem, PySpark, and distributed data processing.
  • Experience designing or supporting Snowflake architectures and workflows.
  • Familiarity with LLMs, LangChain, prompt engineering, embedding techniques, and vector databases.
  • Proven experience building and scaling analytics solutions in production (preferably in financial services or fraud detection).
  • Strong understanding of data lineage, governance, and secure data access for regulated environments.


Education:

  • Bachelor’s/University degree or equivalent experience, potentially Master’s degree


This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Data Analytics

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Primary Location:

Irving, Texas, United States

------------------------------------------------------

Primary Location Full Time Salary Range:

$125,760.00 - $188,640.00


In addition to salary, Citi’s offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards. Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs. Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays. For additional information regarding Citi employee benefits, please visit citibenefits.com. Available offerings may vary by jurisdiction, job level, and date of hire.

------------------------------------------------------

Anticipated Posting Close Date:

Apr 10, 2025

------------------------------------------------------

Citi is an equal opportunity and affirmative action employer.

Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Citigroup Inc. and its subsidiaries ("Citi") invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View the "EEO is the Law" poster. View the EEO is the Law Supplement.

View the EEO Policy Statement.

View the Pay Transparency Posting.

Tags: Architecture Big Data Data Analytics Data governance DataOps Deep Learning Engineering FAISS Feature engineering Generative AI Hadoop LangChain LLMs Machine Learning MLOps Pipelines Prompt engineering PySpark Python RAG Snowflake

Perks/benefits: Career development Competitive pay Health care Insurance Medical leave Wellness

Region: North America
Country: United States
