Senior Data Engineer

Singapore, Singapore

Singtel

The Singtel Group, Asia's leading communications group, provides a diverse range of services including fixed, mobile, data, internet, TV, infocomms technology (ICT) and digital solutions.



Be a Part of Something BIG! 

 

We are seeking a Senior Data Engineer who will be responsible for designing and delivering data engineering products and solutions, and for running operations. You will ensure the efficient and sustainable operation of the Singtel Unified Data Platform and Event Streaming Platform, and build and maintain large-scale, highly available, high-performance distributed systems. You will be part of the Group IT - Data & Platform Management (DPM) team. In this position you will work with business, IT, and data professionals to drive Singtel's data and insights transformation for our business and customers.

 

  1. Develop new data solutions and accelerators that help to deploy our data platform and engineering services at scale
  2. Manage the full life cycle, from requirements gathering and analysis through architecture design and deployment
  3. Design and build scalable and reliable data pipelines for both batch and real-time streaming data
  4. Drive optimization, testing and tooling to improve data quality and availability
  5. Maintain and troubleshoot data pipelines and services per SLA
  6. Review and approve solution designs for data pipelines

 

Make An Impact By

  • Design, develop and automate large-scale, high-performance distributed data processing systems (batch and/or real-time streaming)
  • Practice high-quality data engineering and software engineering to build data platform infrastructure and data pipelines at scale, delivering Big Data Analytics and Data Science initiatives
  • Lead end-to-end data engineering projects and data pipeline delivery with reliable, efficient, testable and maintainable artifacts, ingesting and processing data from a large number and variety of data sources
  • Drive cloud data engineering practices and the cloud lakehouse re-platforming effort to build and scale the Modern Data Platform and its infrastructure
  • Design data models for optimal storage across data layers, workloads and presentation/retrieval, meeting critical business requirements and platform operational efficiency
  • Develop data pipelines for both real-time stream processing and batch processing on the data platform (cloud and on-premises) that meet functional and non-functional business requirements
  • Build, optimize and contribute to shared data engineering frameworks, tools, data products and standards to improve the productivity and quality of output for data engineers across the data platform group
  • Design and build scalable data APIs to host operational data and Unified Data Platform assets in a Data Mesh / Data Fabric architecture
  • Partner with business domain experts, data scientists and solution designers to identify relevant data assets, domain data models and data solutions, and communicate with product development engineers to coordinate backlog feature development of data pipeline patterns
  • Identify, design and implement internal process improvements: automating manual processes, optimizing data delivery and re-designing data products for greater scalability
  • Deliver high-level and detailed designs that meet business needs and align with data architecture principles and technology stacks, and participate in the peer design review and approval process
  • Understand information and data security standards, and use data security guidelines and tools to apply and adhere to the required data controls across the data platform, pipelines, data applications and access endpoints
  • Support and contribute to data engineering product and data pipeline documentation, as well as development guidelines and standards for data pipeline, data model and layer design
  • Drive and deliver industry-standard DevOps (CI/CD) best practices, automating development and release management
  • Drive Modern Data Platform operations using DataOps, ensuring data quality and monitoring of the data systems, and support the data science MLOps platform

 

Skills for Success:

  • Minimum of 8 years of experience in data engineering, Unified Data Platform (Databricks) infrastructure, data warehousing, data analytics tools or related areas, designing and developing end-to-end scalable data pipelines and data products
  • Experience designing, building and operating robust distributed data systems, and deploying high-performance, reliable systems with monitoring and logging practices
  • Experience building and managing data products and pipelines using scalable and resilient open-source big data technologies: Spark, Delta Lake, Kafka, Flink, Airflow, Presto and related distributed data processing frameworks
  • Experience building scalable and reliable data pipelines/ETL for ingesting, processing and integrating a large number and variety of data sources
  • Ability to build and deploy performant modern data engineering and automation frameworks using programming languages such as Scala or Python, and to automate big data workflows such as ingestion, aggregation and ETL processing
  • Proficiency in programming languages such as Scala, Python, Java, Go or Rust, and scripting languages such as Bash
  • Experience handling large Unified Data Platform (Databricks) deployments (multiple PBs)
  • Experience with cloud systems such as AWS, Azure, or Google Cloud Platform.
    • Cloud data engineering experience in at least one cloud (Azure, AWS, GCP).
    • Fluent with Azure cloud services. Azure Certification is a plus.
    • Experience with Azure Databricks (cloud data lakehouse) is a big plus
  • Experience with the open-source Hadoop stack (Hadoop/HDFS, YARN, Hive, HBase) and the Cloudera CDP/Hortonworks platform
  • Experience with event streaming platforms and message queue systems such as Kafka, Pulsar, RabbitMQ and Redis
    • Event processing systems: Kafka Streams, KSQL, Spark Streaming, Apache Flink, Apache Beam, etc.
  • Exposure to CDC tools (Striim, Attunity, GoldenGate) for near-real-time data ingestion and low-latency processing
  • Experience with NoSQL and graph databases (key-value/document/graph) such as Cassandra, HBase, TigerGraph and cloud-native NoSQL databases
  • Ability to build, deploy and manage big data solutions with solid DevOps practices, including managing CI/CD pipelines with Ansible and Terraform as well as cloud infrastructure automation
  • Experience building data solutions on different data formats in distributed data processing and storage: data formats such as Parquet, Avro, Protobuf, ORC and Arrow; open table formats such as Delta Lake, Iceberg, Hudi and Hive
  • Expertise in SQL interfaces to tabular, relational datasets on distributed analytics engines such as Databricks SQL, Synapse, Presto, Snowflake and BigQuery
  • Excellent experience handling relational and transactional databases such as Postgres, MySQL and Oracle, and knowledge of advanced SQL in distributed OLAP environments
  • Experience with data modelling, data warehousing, and building high-volume ETL/ELT pipelines.
  • Experience with data profiling and data quality tools like Apache Griffin, Deequ, and Great Expectations
  • Good understanding of data engineering/software engineering best practices, including error handling and logging, system monitoring, fault-tolerant pipelines, scaling, data quality and ensuring deterministic pipelines with DataOps
  • Experience with data APIs and domain modelling for RESTful and GraphQL APIs, built with Spring Boot
  • Able to drive DevOps and DataOps best practices such as CI/CD, containerization, blue-green deployments and secrets management in data ecosystems, as well as running and scaling applications on cloud infrastructure and container platforms such as Kubernetes and OpenShift
  • Good to have: exposure to telco data models, data warehouses and/or data lakes
  • Good to have: exposure to reporting/visualization stacks such as Power BI or Tableau
  • Good to have: exposure to machine learning and MLOps

 

Rewards that Go Beyond

 

  • Full suite of health and wellness benefits 
  • Ongoing training and development programs 
  • Internal mobility opportunities

 

 

Are you ready to say hello to BIG Possibilities?

Take the leap with Singtel to unlock new opportunities and accelerate your growth. Apply now and start your empowering career!

 

 



Category: Engineering Jobs

Tags: Airflow Ansible APIs Architecture Arrow Avro AWS Azure Big Data BigQuery Cassandra CI/CD Data Analytics DataOps Data pipelines Data quality Data Warehousing DevOps Distributed Systems ELT Engineering ETL Flink GCP Google Cloud GraphQL Hadoop HBase HDFS Java Kafka Kubernetes Machine Learning MLOps MySQL NoSQL OLAP Open Source Oracle Parquet Pipelines PostgreSQL Power BI Pulsar Python Rust Scala Security Snowflake Spark SQL Streaming Tableau Terraform Testing

Perks/benefits: Career development Health care Wellness

Region: Asia/Pacific
Country: Singapore
