Data Engineer - Risk & Analytics
Bengaluru, India
Credit Saison India
Discover how Credit Saison India empowers India’s credit growth with innovative financial solutions, fostering business success and financial inclusion. Established in 2019, CS India is one of the country’s fastest-growing Non-Banking Financial Company (NBFC) lenders, with verticals in wholesale, direct lending, and tech-enabled partnerships with NBFCs and fintechs. Its tech-enabled model, coupled with its underwriting capability, facilitates lending at scale, addressing India’s huge credit gap, especially among underserved and under-penetrated segments of the population.
Credit Saison India is committed to growing as a lender and evolving its offerings in India for the long term for MSMEs, households, individuals and more. CS India is registered with the Reserve Bank of India (RBI) and has an AAA rating from CRISIL (a subsidiary of S&P Global) and CARE Ratings.
Currently, CS India has a branch network of 45 physical offices, 1.2 million active loans, an AUM of over US$1.5B and an employee base of about 1,000 people.
Credit Saison India (CS India) is part of Saison International, a global financial company with a mission to bring people, partners and technology together, creating resilient and innovative financial solutions for positive impact.
Across its business arms of lending and corporate venture capital, Saison International is committed to being a transformative partner in creating opportunities and enabling the dreams of people.
Headquartered in Singapore, Saison International employs over 1,000 people across its global operations spanning Singapore, India, Indonesia, Thailand, Vietnam, Mexico, and Brazil.
Saison International is the international headquarters (IHQ) of Credit Saison Company Limited, a Tokyo Stock Exchange-listed company founded in 1951 and one of Japan’s largest lending conglomerates, with over 70 years of history. The Company has evolved from a credit-card issuer into a diversified financial services provider across payments, leasing, finance, real estate and entertainment.
Roles & Responsibilities
Define and lead the data architecture vision and strategy, ensuring it supports analytics, ML, and business operations at scale.
Architect and manage cloud-native data platforms using Databricks and AWS, leveraging the lakehouse architecture to unify data engineering and ML workflows.
Build and optimize large-scale batch and streaming pipelines using Apache Spark, Airflow, and AWS Glue, ensuring high availability and fault tolerance.
Design and develop data marts, warehouses, and analytics-ready datasets tailored for BI, product, and data science teams.
Implement robust ETL/ELT pipelines with a focus on reusability, modularity, and automated testing.
Enforce and scale data governance practices, including data lineage, cataloging, access management, and compliance with security and privacy standards.
Partner with ML Engineers and Data Scientists to build and deploy ML pipelines, leveraging Databricks MLflow, Feature Store, and MLOps practices.
Provide architectural leadership across data modeling, data observability, pipeline monitoring, and CI/CD for data workflows.
Evaluate emerging tools and frameworks, recommending technologies that align with platform scalability and cost-efficiency.
Mentor data engineers and foster a culture of technical excellence, innovation, and ownership across data teams.
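The emphasis above on reusable, modular ETL with automated testing can be sketched in plain Python. Everything here is illustrative: the `LoanRecord` schema, field names, and cleaning rules are assumptions for the sketch, not part of CS India's actual stack.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical record shape -- not a real CS India schema.
@dataclass
class LoanRecord:
    loan_id: str
    amount: float
    disbursed_on: str  # ISO date string from the source system

def clean_record(raw: dict) -> Optional[LoanRecord]:
    """Validate and normalize one raw row; return None for bad rows."""
    try:
        amount = float(raw["amount"])
        if amount <= 0:
            return None
        date.fromisoformat(raw["disbursed_on"])  # raises ValueError if malformed
        return LoanRecord(raw["loan_id"].strip(), amount, raw["disbursed_on"])
    except (KeyError, ValueError, AttributeError):
        return None

def transform(rows: list) -> list:
    """Pure, side-effect-free transform step: easy to unit-test and reuse."""
    return [r for r in (clean_record(row) for row in rows) if r is not None]

# The kind of automated check a CI pipeline would run against the step:
good = {"loan_id": " L-1 ", "amount": "1000.5", "disbursed_on": "2024-01-15"}
bad = {"loan_id": "L-2", "amount": "-5", "disbursed_on": "2024-01-15"}
out = transform([good, bad])
assert len(out) == 1 and out[0].loan_id == "L-1" and out[0].amount == 1000.5
```

Keeping the transform pure (no I/O, no hidden state) is what makes it both reusable across pipelines and trivially testable, whether the runner is Airflow, Glue, or a Databricks job.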
Required Skills & Qualifications
8+ years of hands-on experience in data engineering, with at least 4 years in a lead or architect-level role.
Deep expertise in Apache Spark, with proven experience developing large-scale distributed data processing pipelines.
Strong experience with Databricks platform and its internal ecosystem (e.g., Delta Lake, Unity Catalog, MLflow, Job orchestration, Workspaces, Clusters, Lakehouse architecture).
Extensive experience with workflow orchestration using Apache Airflow.
Proficiency in both SQL and NoSQL databases (e.g., Postgres, DynamoDB, MongoDB, Cassandra) with a deep understanding of schema design, query tuning, and data partitioning.
Proven background in building data warehouse/data mart architectures using AWS services like Redshift, Athena, Glue, Lambda, DMS, and S3.
Strong programming and scripting ability in Python (preferred) or other AWS-compatible languages.
Solid understanding of data modeling techniques, versioned datasets, and performance tuning strategies.
Hands-on experience implementing data governance, lineage tracking, data cataloging, and compliance frameworks (GDPR, HIPAA, etc.).
Experience with real-time data streaming using tools like Kafka, Kinesis, or Flink.
Working knowledge of MLOps tooling and workflows, including automated model deployment, monitoring, and ML pipeline orchestration.
Familiarity with MLflow, Feature Store, and Databricks-native ML tooling is a plus.
Strong grasp of CI/CD for data and ML pipelines, automated testing, and infrastructure-as-code (Terraform, CDK, etc.).
Excellent communication, leadership, and mentoring skills with a collaborative mindset and the ability to influence across functions.