Quantexa developer
Singapore
Unison Consulting Pte Ltd
- Implement data transformation, aggregation, and enrichment processes to support various data analytics and machine learning initiatives
- Collaborate with cross-functional teams to understand data requirements and translate them into effective data engineering solutions
- Design, develop, and implement Spark Scala applications and data processing pipelines to process large volumes of structured and unstructured data
- Integrate Elasticsearch with Spark to enable efficient indexing, querying, and retrieval of data
- Optimize and tune Spark jobs for performance and scalability, ensuring efficient data processing and indexing in Elasticsearch
- Implement data transformations, aggregations, and computations using Spark RDDs, DataFrames, and Datasets, and integrate them with Elasticsearch
- Develop and maintain scalable and fault-tolerant Spark applications, adhering to industry best practices and coding standards
- Troubleshoot and resolve issues related to data processing, performance, and data quality in the Spark-Elasticsearch integration
- Monitor and analyze job performance metrics, identify bottlenecks, and propose optimizations in both Spark and Elasticsearch components
- Ensure data quality and integrity throughout the data processing lifecycle
- Design and deploy data engineering solutions on OpenShift Container Platform (OCP) using containerization and orchestration techniques
- Optimize data engineering workflows for containerized deployment and efficient resource utilization
- Collaborate with DevOps teams to streamline deployment processes, implement CI/CD pipelines, and ensure platform stability
- Implement data governance practices, data lineage, and metadata management to ensure data accuracy, traceability, and compliance
- Monitor and optimize data pipeline performance, troubleshoot issues, and implement necessary enhancements
Requirements
- Must be a Quantexa-certified data engineer or data architect and proficient with the tool
- Proven experience as a Data Engineer, working with Hadoop, Spark, and data processing technologies in large-scale environments
- Proficiency in Scala programming language and familiarity with functional programming concepts
- Hands-on experience with the Quantexa tool is highly preferred
- In-depth understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
- Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes
- Proficiency in programming languages commonly used in data engineering, such as Python, Scala, or Java, and in frameworks such as Spark
- Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket)
- Experience with Grafana, Prometheus, or Splunk is an added benefit
Category: Engineering Jobs
Tags: Ansible, Architecture, Bitbucket, CI/CD, Data Analytics, Data governance, Data quality, DevOps, Docker, Elasticsearch, Engineering, Hadoop, Java, Jenkins, Kubernetes, Machine Learning, Pipelines, Python, Scala, Spark, Splunk, SQL, Unstructured data
Region: Asia/Pacific
Country: Singapore