Data Engineer

Lahore, Punjab


NorthBay Solutions

NorthBay Solutions, an AWS Premier Partner, specializes in Generative AI, AI/ML, Managed Cloud Services, and Cloud Migration. Transform your business today.


Job Title: Data Engineer 

Location: Karachi, Lahore, Islamabad (Hybrid)
Experience: 5+ Years
Job Type: Full-Time
 

Job Overview:

We are looking for a highly skilled and experienced Data Engineer with a strong foundation in Big Data, distributed computing, and cloud-based data solutions. This role demands a thorough understanding of end-to-end data pipelines, data modeling, and advanced data engineering practices across diverse data sources and environments. You will play a pivotal role in building, deploying, and optimizing data infrastructure and pipelines in a scalable cloud-based architecture.


Key Responsibilities:

  • Design, develop, and maintain large-scale data pipelines using modern big data technologies and cloud-native tools.

  • Build scalable and efficient distributed data processing systems using Hadoop, Spark, Hive, and Kafka.

  • Work extensively with cloud platforms (preferably AWS) and services like EMR, Glue, Lambda, Athena, and S3.

  • Design and implement data integration solutions pulling from multiple sources into a centralized data warehouse or data lake.

  • Develop pipelines using dbt (data build tool) and manage workflows with Apache Airflow or AWS Step Functions.

  • Write clean, maintainable, and efficient code using Python, PySpark, or Scala for data transformation and processing.

  • Build and manage relational and columnar data stores such as PostgreSQL, MySQL, Redshift, Snowflake, HBase, and ClickHouse.

  • Implement CI/CD pipelines using Docker, Jenkins, and other DevOps tools.

  • Collaborate with data scientists, analysts, and other engineering teams to deploy data models into production.

  • Drive data quality, integrity, and consistency across systems.

  • Participate in Agile/Scrum ceremonies and use Jira for task management.

  • Provide mentorship and technical guidance to junior team members.

  • Contribute to continuous improvement by making recommendations to enhance data engineering processes and architecture.


Required Skills & Experience:

  • 5+ years of hands-on experience as a Data Engineer.

  • Deep knowledge of Big Data technologies – Hadoop, Spark, Hive, Kafka.

  • Expertise in Python, PySpark, and/or Scala.

  • Proficient with data modeling, SQL scripting, and working with large-scale datasets.

  • Experience with distributed storage like HDFS and cloud storage (e.g., AWS S3).

  • Hands-on experience with data orchestration tools like Apache Airflow or AWS Step Functions.

  • Experience working in AWS environments with services such as EMR, Glue, Lambda, and Athena.

  • Familiarity with data warehousing concepts and experience with tools like Redshift or Snowflake (preferred).

  • Exposure to tools like Informatica, Ab Initio, or Apache Iceberg is a plus.

  • Knowledge of Docker, Jenkins, and other CI/CD tools.

  • Strong problem-solving skills, initiative, and a continuous learning mindset.


Preferred Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.

  • Experience with open table formats such as Apache Iceberg.

  • Hands-on experience with Ab Initio (GDE, Collect > IT) or Informatica tools.

  • Knowledge of Agile methodology and working experience with Jira.


Soft Skills:

  • Self-driven, proactive, and a strong team player.

  • Excellent communication and interpersonal skills.

  • Passion for data and technology innovation.

  • Ability to work independently and manage multiple priorities in a fast-paced environment.

Category: Engineering Jobs

Tags: Agile Airflow Architecture Athena AWS Big Data CI/CD Computer Science Data pipelines Data quality Data warehouse Data Warehousing dbt DevOps Docker Engineering Hadoop HBase HDFS Informatica Jenkins Jira Kafka Lambda MySQL Pipelines PostgreSQL PySpark Python Redshift Scala Scrum Snowflake Spark SQL Step Functions

Region: Asia/Pacific
Country: Pakistan
