Contract Data Engineer (2-year contract)

STB - TOURISM COURT BUILDING, Singapore


[What the role is]

The Data Engineer collaborates with the existing team of data scientists, data engineers and analysts to create data tools, develop data ingestion and processing pipelines, optimise data processing, and ensure that data systems meet STB's business requirements. The role involves working closely with the data science team to set up and deploy data pipelines that support machine learning models and analytics scripts, developing data integrations, assembling complex datasets and implementing process improvements. The Data Engineer plays a key role in enhancing data reliability and quality while ensuring scalable business processes and supporting the team's data-related initiatives.

[What you will be working on]

1. Project Management

a) Project manage and work closely with vendors and internal stakeholders to deliver data engineering implementations, ensuring that deliverables and objectives are met within the agreed scope and timelines.

b) Collaborate with cross-functional teams, including data scientists, data engineers, DevOps engineers, product managers, business analysts and business stakeholders, to integrate and deploy models into current analytics platforms and production systems.

c) Plan, execute and monitor project milestones, and provide timely updates to management on project progress and issues.
 

2. Application of Engineering Disciplines in Support of Strategic Business Objectives

a) Prepare, process, cleanse and verify the integrity of data collected for analysis.

b) Design, develop and implement self-managed data processing and compilation pipelines for key enterprise data domains, so that data compilation business logic can be maintained in-house and STB retains the agility to respond to changing operational needs.

c) Review the design and implementation of vendor-developed data pipelines to ensure that they meet the operational requirements of STB's business and integrate back into the self-managed data compilation pipelines for seamless data processing and compilation.

d) Work closely with vendors and internal stakeholders to project manage and coordinate Data Science & Analytics' (DS&A) data ingestion and data processing pipelines across platforms, which may include mobile apps, SaaS platforms, on-premise systems and partner systems.

e) Help architect DS&A's data integrations and data processing flows between external / third-party data sources, AWS Cloud data warehouses (e.g. Redshift, RDS) and internal on-premise systems for workloads at scale.

f) Provide guidance to internal teams on best practices for Cloud data integrations.

g) Identify, design and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.

h) Develop monitoring toolkits to verify that integrations execute successfully and to raise alerts when they fail.

i) Implement best-practice DataOps processes to ensure continuous integration, deployment and governance of our data pipelines across the entire data lifecycle, from data preparation to reporting.
 

3. Data Integration and Data Management

a) Collaborate with the current team to review existing data integration processes and improve the current data processing pipelines.

b) Work with data and agency partners to assemble large, complex datasets that meet functional and non-functional business requirements.

c) Provide inputs to the design and development of an integrated data model to allow analysis across multiple structured and unstructured datasets.

d) Recommend ways to continually improve data reliability and quality, including reviewing and enhancing existing data collection procedures to capture data for building analytics models relevant to industry transformation.

e) Analyse and assess the effectiveness and accuracy of data sources (e.g., datasets received from stakeholders) and ensure that they meet STB's Data Quality standards.

[What we are looking for]

  • Strong project management, planning, time management and organisational skills.

  • Experience supporting and working with cross-functional teams in a dynamic environment.

  • Experienced data pipeline builder and data wrangler who enjoys optimising data systems and building them from the ground up.

  • Experience in using Qlik Sense and AWS services (e.g., SageMaker, Athena, RDS, ECR, ECS, EMR, Lambda, Redis) will be advantageous.

  • The following certifications would be advantageous:

    • Certified AWS Cloud Architect / Data Engineer / DevOps Engineer

    • Certified Qlik Sense Data Architect

  • Degree from a recognised university in a quantitative or engineering discipline: Computer Science, Computer Engineering, Informatics / Information Systems, Applied Mathematics or Statistics.

  • At least 5 years of work experience in a related field with demonstrable skills in developing, deploying and maintaining data workflows.

  • Proven track record of managing internal and external stakeholders, delivering on objectives within project timelines, and successfully deploying at least one medium- to large-scale analytics system.

  • Good command of written and spoken English, with strong presentation and communication skills and the ability to express complex ideas, data, concepts and analysis outcomes clearly to business audiences.

  • Strong analytical skills, a good eye for detail, and an aptitude for solving engineering problems to produce quality deliverables.

  • Ability to integrate and synthesise research and data across multiple sources to derive meaningful conclusions.

  • Experience working with structured and unstructured datasets is essential.

  • Proficient in statistical programming tools (e.g., R, Python) and database scripting languages (e.g., SQL: DQL, DML, DDL).

  • Experience with DataOps and deploying models and data workflows through a DevOps process will be advantageous.

