GCP Data Engineer
Johannesburg, South Africa
- Remote-first
Nagarro
A digital product engineering leader, Nagarro drives technology-led business breakthroughs for industry leaders and challengers through agility and innovation.
Company Description
We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale — across all devices and digital mediums, and our people exist everywhere in the world (18000+ experts across 37 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!
Job Description
- Design, develop, and maintain scalable data pipelines and ETL processes on Google Cloud Platform (GCP).
- Implement and optimize data storage solutions using BigQuery, Cloud Storage, and other GCP services.
- Collaborate with data scientists, machine learning engineers, data engineers, and other stakeholders to integrate and deploy machine learning models into production environments.
- Develop and maintain custom deployment solutions for machine learning models using tools such as Kubeflow, AI Platform, and Docker.
- Write clean, efficient, and maintainable code in Python and PySpark for data processing and transformation tasks.
- Ensure data quality, integrity, and consistency through data validation and monitoring processes.
- Apply a deep understanding of the Medallion architecture.
- Develop metadata-driven pipelines and ensure optimal processing of data.
- Use Terraform to manage and provision cloud infrastructure resources on GCP.
- Troubleshoot and resolve production issues related to data pipelines and machine learning models.
- Stay up to date with the latest industry trends and best practices in data engineering, machine learning, and cloud technologies.
- Understand data lifecycle management, data pruning, model drift, and model optimisation.
Qualifications
Must-have skills: Machine Learning (general experience), Visualization, Google Cloud Platform, PySpark.
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer with a focus on GCP.
- Strong proficiency in Python and PySpark for data processing and transformation.
- Hands-on experience with machine learning model deployment and integration on GCP.
- Familiarity with GCP services such as BigQuery, Cloud Storage, Dataflow, and AI Platform.
- Experience with Terraform for infrastructure as code.
- Experience with containerization and orchestration tools like Docker and Kubernetes.
- Strong problem-solving skills and the ability to troubleshoot complex issues.
- Excellent communication and collaboration skills.
Additional Information
**Preferred Qualifications:**
- Experience with custom deployment solutions and MLOps.
- Knowledge of other cloud platforms (AWS, Azure) is a plus.
- Familiarity with CI/CD pipelines and tools like Jenkins or GitLab CI.
- Visualisation experience is nice to have but not mandatory.
- Certification in GCP Data Engineering or related fields.