Principal Data Engineer - Azure
Islamabad, Pakistan
Clustox
Clustox is a leading software development company specializing in end-to-end software development services. From startups to enterprises, we empower businesses with innovative technology solutions.
About the Project
We are a mission-driven team of developers, architects, ML engineers, and data specialists building an innovative cloud-based platform to combat coral reef degradation caused by global warming. By leveraging real-time data pipelines, AI/ML models, and scalable cloud architecture, we aim to deliver actionable insights for marine conservation.
What You'll Do
In this role, you'll design and optimize the data systems that power our conservation efforts. Your work will directly impact our ability to monitor, analyze, and restore coral reefs at scale.
Core Responsibilities
- Build scalable ETL/ELT pipelines using Azure Data Factory, Databricks, and Synapse Analytics.
- Integrate real-time & batch data for AI/ML models (Azure ML, MLOps).
- Implement storage solutions (Azure Data Lake, Cosmos DB, SQL DB).
- Optimize pipelines for speed, cost, and reliability (caching, partitioning).
- Monitor, troubleshoot, and fine-tune data workflows.
- Prepare datasets for feature engineering and model training (PySpark, Pandas).
- Collaborate with data scientists to deploy and monitor ML models.
- Enforce data encryption, access controls, and GDPR/HIPAA compliance.
- Work with frontend/backend engineers, DevOps, and conservation scientists.
- Enable data visualization (Power BI, Tableau) for stakeholders.
Who You Are
- 5+ years in data engineering, preferably with Azure cloud services.
- Expert in Python, PySpark, SQL, and big data frameworks.
- Experience with real-time data processing and ML pipeline integration.
- Passionate about sustainability, AI for good, or environmental tech.
- Strong problem-solver who thrives in collaborative, innovative teams.
Tags: Architecture Azure Big Data Cosmos DB Databricks Data pipelines Data visualization DevOps ELT Engineering ETL Feature engineering Machine Learning ML models MLOps Model training Pandas Pipelines Power BI PySpark Python SQL Tableau