Data Engineer (Databricks and AWS)
Pampanga, Manila, Philippines
Citco
At Citco, we don't just provide bespoke solutions and better results. We're a true partner dedicated to developing rich, long-term relationships through gold standard services.
Position: Data Engineer (Databricks & AWS)
Company Overview
Citco is a global leader in financial services, delivering innovative solutions to some of the world's largest institutional clients. We harness the power of data to drive operational efficiency and informed decision-making. We are looking for a Data Engineer with strong Databricks expertise and AWS experience to contribute to mission-critical data initiatives.
Role Summary
As a Data Engineer, you will be responsible for developing and maintaining end-to-end data solutions on Databricks (Spark, Delta Lake, MLflow, etc.) while working with core AWS services (S3, Glue, Lambda, etc.). You will work within a technical team, implementing best practices in performance, security, and scalability. This role requires a solid understanding of Databricks and experience with cloud-based data platforms.
Key Responsibilities
1. Databricks Platform & Development
- Implement Databricks Lakehouse solutions using Delta Lake for ACID transactions and data versioning (see the sketch after this list)
- Utilize Databricks SQL Analytics for querying and report generation
- Support cluster management and Spark job optimization
- Develop structured streaming pipelines for data ingestion and processing
- Use Databricks Repos, notebooks, and job scheduling for development workflows
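For illustration, a minimal PySpark sketch of the Delta Lake work described above: writing a versioned table, then reading an earlier snapshot via time travel. The path, data, and column names are hypothetical, and it assumes a Databricks runtime (or a local Spark session configured with the delta-spark package) where Delta is available.

    from pyspark.sql import SparkSession

    # On Databricks, `spark` is provided; built here so the sketch is self-contained.
    spark = SparkSession.builder.appName("delta-sketch").getOrCreate()

    # Hypothetical trade data.
    df = spark.createDataFrame(
        [(1, "AAPL", 191.50), (2, "MSFT", 417.20)],
        ["trade_id", "symbol", "price"],
    )

    # Delta Lake provides ACID writes and keeps a version history of the table.
    df.write.format("delta").mode("overwrite").save("/tmp/trades_delta")

    # Time travel: read the table as of an earlier version for audits or rollback.
    v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/trades_delta")
    v0.show()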
2. AWS Cloud Integration
- Work with Databricks and AWS S3 integration for data lake storage
- Build ETL/ELT pipelines using the AWS Glue catalog, AWS Lambda, and AWS Step Functions (see the sketch after this list)
- Configure networking settings for secure data access
- Support infrastructure deployment using AWS CloudFormation or Terraform
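As a sketch of the Lambda-driven orchestration above, the handler below reacts to a new object landing in S3 and starts a Glue job via boto3. The job name "trades-etl" and the argument key are hypothetical placeholders, not part of this posting.

    import boto3

    glue = boto3.client("glue")

    def handler(event, context):
        # Triggered by an S3 event notification; each record describes one new object.
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            # Hand the new object's location to a Glue ETL job as a job argument.
            glue.start_job_run(
                JobName="trades-etl",
                Arguments={"--source_path": f"s3://{bucket}/{key}"},
            )
        return {"status": "ok"}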
3. Data Pipeline & Workflow Development
- Create scalable ETL frameworks using Spark (Python/Scala)
- Participate in workflow orchestration and CI/CD implementation
- Develop Delta Live Tables pipelines for data ingestion and transformations (see the sketch after this list)
- Support MLflow integration for data lineage and reproducibility
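A minimal Delta Live Tables sketch of the ingestion-plus-transformation flow above; it runs only inside a Databricks DLT pipeline, where `spark` and the `dlt` module are provided. The S3 path, table names, and expectation are hypothetical.

    import dlt
    from pyspark.sql.functions import col

    @dlt.table(comment="Raw trade events ingested with Auto Loader")
    def trades_raw():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("s3://example-bucket/landing/trades/")
        )

    @dlt.table(comment="Cleaned trades with a basic quality expectation")
    @dlt.expect_or_drop("valid_price", "price > 0")
    def trades_clean():
        return dlt.read_stream("trades_raw").select(
            "trade_id", "symbol", col("price").cast("double").alias("price")
        )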
4. Performance & Optimization
- Implement Spark job optimizations (caching, partitioning, joins; see the sketch after this list)
- Support cluster configuration for optimal performance
- Optimize data processing for large-scale datasets
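To illustrate the optimization items above, a short PySpark sketch that broadcasts a small dimension table to avoid a shuffle-heavy join, caches a reused result, and partitions the output on a commonly filtered column. All paths and column names are hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("opt-sketch").getOrCreate()

    trades = spark.read.format("delta").load("/tmp/trades_delta")     # large fact table
    symbols = spark.read.format("delta").load("/tmp/symbols_delta")   # small dimension

    # Broadcasting the small side avoids a shuffle-heavy sort-merge join.
    enriched = trades.join(broadcast(symbols), "symbol")

    # Cache a result that several downstream steps reuse.
    enriched.cache()

    # Partition output on a column that queries commonly filter on.
    (
        enriched.write.format("delta")
        .mode("overwrite")
        .partitionBy("symbol")
        .save("/tmp/trades_enriched")
    )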
5. Security & Governance
- Apply Unity Catalog features for governance and access control (see the sketch after this list)
- Follow compliance requirements and security policies
- Implement IAM best practices
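A sketch of Unity Catalog access control as it might be issued from a notebook; the catalog, schema, table, and group names are hypothetical, and a UC-enabled workspace (with `spark` provided) is assumed.

    # Grant a group the right to see and query one table, nothing more.
    spark.sql("GRANT USE CATALOG ON CATALOG finance TO `analysts`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA finance.trading TO `analysts`")
    spark.sql("GRANT SELECT ON TABLE finance.trading.trades_clean TO `analysts`")

Unity Catalog privileges are hierarchical, so USE CATALOG and USE SCHEMA are needed before the table-level SELECT takes effect.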
6. Team Collaboration
- Participate in code reviews and knowledge-sharing sessions
- Work within Agile/Scrum development framework
- Collaborate with team members and stakeholders
7. Monitoring & Maintenance
- Help implement monitoring solutions for pipeline performance
- Support alert system setup and maintenance
- Ensure data meets quality and reliability standards (see the sketch below)
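As a sketch of the data quality item above: a small check that could run as a scheduled Databricks job, failing the run (and thereby tripping a job-level failure alert) when a gate breaks. The table path and the specific gates are hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("dq-sketch").getOrCreate()

    df = spark.read.format("delta").load("/tmp/trades_clean")  # hypothetical table

    total = df.count()
    null_prices = df.filter(col("price").isNull()).count()
    duplicates = total - df.dropDuplicates(["trade_id"]).count()

    # Raise to fail the job run; the job's failure alert becomes the notification.
    if null_prices > 0 or duplicates > 0:
        raise ValueError(
            f"Quality gate failed: {null_prices} null prices, "
            f"{duplicates} duplicate trade_ids out of {total} rows"
        )
    print(f"Quality gates passed for {total} rows")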
Qualifications
1. Educational Background
- Bachelor's degree in Computer Science, Data Science, Engineering, or equivalent experience
2. Technical Experience
- Databricks Experience: 2+ years of hands-on work with Databricks (Spark)
- AWS Knowledge: Experience with AWS S3, Glue, Lambda, and basic security practices
- Programming Skills: Strong proficiency in Python (PySpark) and SQL
- Data Warehousing: Understanding of RDBMS and data modeling concepts
- Infrastructure: Familiarity with infrastructure as code concepts