Data Engineer

Bengaluru, India

The Customization Group

A world market leader in mass customization, serving millions of customers with our tailored B2B and API solutions. Sustainably.

We are Make IT Real Tech!

An information technology firm that provides consulting services in IT, software, and advanced areas such as Artificial Intelligence. Our global team of experts is committed to creating AI-driven workflows and solutions that enhance efficiency across the manufacturing, supply chain management, and e-commerce space. By leveraging the latest technologies, we offer customized IT solutions that place us at the forefront of innovation, driving advancements in both customer engagement and operational efficiency.

We have a brand-new opportunity for a Data Engineer.
In this role, you will help build and maintain the data infrastructure that powers our analytics and machine learning systems. Sounds good? Then keep reading!

Why you will love working with us:

  • Global Collaboration: Gain international experience by working with globally distributed teams
  • Flexible Work Options: Enjoy remote or hybrid work arrangements that suit your lifestyle
  • Work-Life Balance: Flexible working hours help you balance your professional and personal life
  • Private Health Insurance: Comprehensive coverage for your peace of mind
  • Extra Leave: Additional paid leave for special occasions
  • Growth Opportunities: Access to valuable knowledge and experience to support your career development
  • Team Building: Connect with colleagues through team-building activities and company events
  • Innovation and AI: Be part of an AI-first workplace that enables everyone to drive unique business solutions through state-of-the-art technology

Key Responsibilities:

  • Data Pipeline Development
    • Design, build, and maintain efficient data pipelines using Apache Airflow for orchestration and scheduling (a brief illustrative sketch follows this list)
    • Implement real-time data streaming solutions using Kafka for event-driven architectures
    • Develop ETL/ELT processes to move and transform data between various sources and destinations, with monitoring in place to ensure reliability
  • Data Infrastructure & Architecture
    • Design and optimize data warehouse schemas in Amazon Redshift for analytical workloads
    • Integrate and manage S3 data lakes and RDS sources covering diverse data formats
    • Drive cost-efficient, high-performance data storage designs
  • Platform & DevOps
    • Deploy and manage data applications on Kubernetes for scalability and reliability
    • Implement infrastructure as code practices for reproducible environments
    • Build and maintain CI/CD pipelines for data engineering workflows
  • Data Quality & Governance
    • Embed automated tests, data validation, and lineage tracking in pipelines
    • Ensure compliance with data privacy and security requirements
  • Cross-functional Collaboration
    • Partner with BI analysts and ML engineers to understand reporting requirements and optimize data models
    • Collaborate with software engineers to integrate data systems with applications
    • Support product teams with data infrastructure needs and technical guidance
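
To give a concrete flavor of the pipeline work described above, here is a minimal, illustrative sketch of an Airflow DAG with extract, validate, and load steps. Everything in it (DAG name, tasks, sample data) is a placeholder assumption for this posting, not a piece of our production codebase.

```python
# Illustrative sketch only (Airflow 2.x TaskFlow API). DAG, task, and table
# names are placeholder assumptions for this posting, not our production code.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["example"])
def daily_orders_pipeline():
    @task
    def extract() -> list[dict]:
        # In practice this step would pull from an RDS source or an S3 landing zone.
        return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": None}]

    @task
    def validate(rows: list[dict]) -> list[dict]:
        # Simple data-quality gate: drop rows that fail a basic check.
        return [r for r in rows if r.get("amount") is not None]

    @task
    def load(rows: list[dict]) -> None:
        # In practice this step would COPY into a Redshift staging table, then merge.
        print(f"loaded {len(rows)} validated rows")

    load(validate(extract()))


daily_orders_pipeline()
```

In the actual role, the extract and load steps talk to RDS, S3, and Redshift, and the validation step plugs into a fuller data quality and lineage framework, but the shape of the work is the same.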


Required Qualifications:

  • Education & Experience
    • Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
    • 3+ years of experience building and maintaining data pipelines in production environments
    • Proven track record of designing scalable data architectures

  • Technical Skills
    • Strong programming skills in Python and SQL
    • Hands-on experience with Apache Airflow and Apache Kafka 
    • Proficiency with AWS data services including Redshift, RDS, and S3
    • Experience with containerization and Kubernetes for deploying data applications
    • Strong understanding of data modeling concepts for both transactional and analytical systems
    • Experience with version control (Git) and collaborative development practices

  • Data Engineering Expertise
    • Deep understanding of ETL/ELT patterns and best practices
    • Experience with both batch and real-time data processing paradigms
    • Knowledge of data warehouse design principles, including dimensional modeling
    • Understanding of distributed computing concepts and big data technologies
    • Experience with data quality frameworks and testing methodologies

Embrace the opportunities that await you here!
Your journey may lead to new skills, relationships, and success.
                           

        One team. Millions of happy customers worldwide. Join us! https://www.thecustomizationgroup.com/ 
