Data Engineer (Databricks and AWS)

Pampanga, Manila, Philippines

Citco

At Citco, we don't just provide bespoke solutions and better results. We’re a true partner dedicated to developing rich, long-term relationships through gold standard services.

Position: Data Engineer (Databricks & AWS)

Company Overview

Citco is a global leader in financial services, delivering innovative solutions to some of the world's largest institutional clients. We harness the power of data to drive operational efficiency and informed decision-making. We are looking for a Data Engineer with strong Databricks expertise and AWS experience to contribute to mission-critical data initiatives.

Role Summary

As a Data Engineer, you will be responsible for developing and maintaining end-to-end data solutions on Databricks (Spark, Delta Lake, MLflow, etc.) while working with core AWS services (S3, Glue, Lambda, etc.). You will work within a technical team, implementing best practices in performance, security, and scalability. This role requires a solid understanding of Databricks and experience with cloud-based data platforms.

Key Responsibilities

1. Databricks Platform & Development

  • Implement Databricks Lakehouse solutions using Delta Lake for ACID transactions and data versioning (see the sketch after this list)
  • Utilize Databricks SQL Analytics for querying and report generation
  • Support cluster management and Spark job optimization
  • Develop structured streaming pipelines for data ingestion and processing
  • Use Databricks Repos, notebooks, and job scheduling for development workflows
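
A minimal, illustrative sketch of the kind of Delta Lake and Structured Streaming work described above, assuming a Databricks cluster where `spark` is already provided; the paths and table names are hypothetical placeholders:

```python
from pyspark.sql import functions as F

# Batch: persist a DataFrame as a Delta table (ACID transactions, versioned data)
trades = spark.read.parquet("/mnt/raw/trades")            # hypothetical source path
(trades
    .withColumn("ingest_date", F.current_date())
    .write.format("delta")
    .mode("overwrite")
    .saveAsTable("bronze.trades"))                        # hypothetical table name

# Streaming: continuously ingest newly arriving files into the Lakehouse
(spark.readStream
    .format("cloudFiles")                                 # Databricks Auto Loader
    .option("cloudFiles.format", "json")
    .load("/mnt/landing/trades")                          # hypothetical landing path
    .writeStream
    .option("checkpointLocation", "/mnt/checkpoints/trades")
    .trigger(availableNow=True)
    .toTable("bronze.trades_stream"))
```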

2. AWS Cloud Integration

  • Work with Databricks and AWS S3 integration for data lake storage
  • Build ETL/ELT pipelines using AWS Glue catalog, AWS Lambda, and AWS Step Functions (see the sketch after this list)
  • Configure networking settings for secure data access
  • Support infrastructure deployment using AWS CloudFormation or Terraform
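
A brief, illustrative sketch of Databricks-to-S3 reads and a Lambda handler that triggers a Step Functions state machine; the bucket, ARN, and table names are hypothetical placeholders:

```python
import json
import boto3

# On a Databricks cluster whose instance profile grants S3 access, data lake
# paths can be read directly into Spark DataFrames, for example:
#   df = spark.read.format("delta").load("s3://example-data-lake/bronze/trades/")

def lambda_handler(event, context):
    """Start an ETL state machine when a new object lands in S3."""
    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:etl-pipeline",
        input=json.dumps({"s3_key": event["Records"][0]["s3"]["object"]["key"]}),
    )
    return {"executionArn": response["executionArn"]}
```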

3. Data Pipeline & Workflow Development

  • Create scalable ETL frameworks using Spark (Python/Scala)
  • Participate in workflow orchestration and CI/CD implementation
  • Develop Delta Live Tables for data ingestion and transformations (see the sketch after this list)
  • Support MLflow integration for data lineage and reproducibility
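
A minimal, illustrative Delta Live Tables sketch, assuming it runs inside a DLT pipeline; the dataset names, landing path, and expectation rule are hypothetical:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw trades ingested from the landing zone")
def bronze_trades():
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/mnt/landing/trades"))                 # hypothetical path

@dlt.table(comment="Cleaned trades with a basic data quality expectation")
@dlt.expect_or_drop("valid_amount", "amount > 0")
def silver_trades():
    return (dlt.read_stream("bronze_trades")
            .withColumn("processed_at", F.current_timestamp()))
```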

4. Performance & Optimization

  • Implement Spark job optimizations (caching, partitioning, joins); see the sketch after this list
  • Support cluster configuration for optimal performance
  • Optimize data processing for large-scale datasets
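
An illustrative sketch of the tuning patterns listed above (broadcast joins, caching, partition-aware writes); the table and column names are hypothetical:

```python
from pyspark.sql import functions as F

trades = spark.table("silver.trades")
accounts = spark.table("silver.accounts")        # small dimension table

# Broadcast the small side of the join to avoid a large shuffle
enriched = trades.join(F.broadcast(accounts), "account_id")

# Cache a DataFrame that several downstream aggregations reuse
enriched.cache()
daily = enriched.groupBy("trade_date").agg(F.sum("amount").alias("gross_amount"))
by_desk = enriched.groupBy("desk").agg(F.count("*").alias("trade_count"))

# Partition on the write key when persisting to Delta
(enriched.repartition("trade_date")
    .write.format("delta")
    .partitionBy("trade_date")
    .mode("overwrite")
    .saveAsTable("gold.trades_enriched"))
```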

5. Security & Governance

  • Apply Unity Catalog features for governance and access control (see the sketch after this list)
  • Follow compliance requirements and security policies
  • Implement IAM best practices
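
An illustrative sketch of Unity Catalog access control expressed as SQL, assuming a workspace with Unity Catalog enabled; the catalog, schema, table, and group names are hypothetical:

```python
spark.sql("GRANT USE CATALOG ON CATALOG finance TO `data-engineers`")
spark.sql("GRANT SELECT ON SCHEMA finance.silver TO `analysts`")
spark.sql("REVOKE MODIFY ON TABLE finance.silver.trades FROM `analysts`")
```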

6. Team Collaboration

  • Participate in code reviews and knowledge-sharing sessions
  • Work within Agile/Scrum development framework
  • Collaborate with team members and stakeholders

7. Monitoring & Maintenance

  • Help implement monitoring solutions for pipeline performance
  • Support alert system setup and maintenance
  • Ensure data quality and reliability standards (see the sketch after this list)
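
A lightweight, illustrative data quality check of the sort that could feed a pipeline alert; the table name, column, and threshold are hypothetical:

```python
from pyspark.sql import functions as F

df = spark.table("silver.trades")
null_ratio = df.filter(F.col("account_id").isNull()).count() / max(df.count(), 1)

if null_ratio > 0.01:
    # In practice this would raise an alert (fail the job or notify on-call)
    raise ValueError(f"account_id null ratio {null_ratio:.2%} exceeds threshold")
```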

Qualifications

1. Educational Background

  • Bachelor's degree in Computer Science, Data Science, Engineering, or equivalent experience

2. Technical Experience

  • Databricks Experience: 2+ years of hands-on Databricks (Spark) experience
  • AWS Knowledge: Experience with AWS S3, Glue, Lambda, and basic security practices
  • Programming Skills: Strong proficiency in Python (PySpark) and SQL
  • Data Warehousing: Understanding of RDBMS and data modeling concepts
  • Infrastructure: Familiarity with infrastructure as code concepts
