Quantexa developer

Singapore

Applications have closed

Responsibilities

  • Implement data transformation, aggregation, and enrichment processes to support various data analytics and machine learning initiatives
  • Collaborate with cross-functional teams to understand data requirements and translate them into effective data engineering solutions
  • Design, develop, and implement Spark Scala applications and data processing pipelines to process large volumes of structured and unstructured data
  • Integrate Elasticsearch with Spark to enable efficient indexing, querying, and retrieval of data
  • Optimize and tune Spark jobs for performance and scalability, ensuring efficient data processing and indexing in Elasticsearch
  • Implement data transformations, aggregations, and computations using Spark RDDs, DataFrames, and Datasets, and integrate them with Elasticsearch
  • Develop and maintain scalable and fault-tolerant Spark applications, adhering to industry best practices and coding standards
  • Troubleshoot and resolve issues related to data processing, performance, and data quality in the Spark-Elasticsearch integration
  • Monitor and analyze job performance metrics, identify bottlenecks, and propose optimizations in both Spark and Elasticsearch components
  • Ensure data quality and integrity throughout the data processing lifecycle
  • Design and deploy data engineering solutions on OpenShift Container Platform (OCP) using containerization and orchestration techniques
  • Optimize data engineering workflows for containerized deployment and efficient resource utilization
  • Collaborate with DevOps teams to streamline deployment processes, implement CI/CD pipelines, and ensure platform stability
  • Implement data governance practices, data lineage, and metadata management to ensure data accuracy, traceability, and compliance
  • Monitor and optimize data pipeline performance, troubleshoot issues, and implement necessary enhancements

Requirements

  • Must be a Quantexa-certified data engineer or data architect and proficient with the Quantexa tool
  • Proven experience as a Data Engineer, working with Hadoop, Spark, and data processing technologies in large-scale environments
  • Proficiency in Scala programming language and familiarity with functional programming concepts
  • Hands-on experience with the Quantexa tool is highly preferred
  • In-depth understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
  • Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes
  • Proficiency in programming languages and frameworks commonly used in data engineering, such as Python, Scala, Java, and Spark
  • Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket)
  • Experience with Grafana, Prometheus, and Splunk is an added benefit

Category: Engineering Jobs

Tags: Ansible Architecture Bitbucket CI/CD Data Analytics Data governance Data quality DevOps Docker Elasticsearch Engineering Hadoop Java Jenkins Kubernetes Machine Learning Pipelines Python Scala Spark Splunk SQL Unstructured data

Region: Asia/Pacific
Country: Singapore
