Senior Data Engineer

Bangalore Office

Josys

Stop letting runaway SaaS costs & unsecured SaaS access ruin your day. Josys is the only SaaS Management Platform that provides true 360-degree control.

Location: Bengaluru, Karnataka, India

About the Role:

We're looking for an experienced Senior Data Engineer (6-8 years) to join our data team. You'll play a key role in building and maintaining our data systems on AWS, applying strong skills in big data tools and cloud technologies to help our analytics team extract valuable insights from our data. You'll own our data pipelines end to end, ensuring the data is accurate, reliable, and fast.

What You'll Do:

  • Design and build efficient data pipelines using Spark / PySpark / Scala.

  • Orchestrate complex data processes with Airflow, building workflows (DAGs) and troubleshooting any issues with them.

  • Clean, transform, and prepare data for analysis.

  • Use Python for data tasks, automation, and building tools.

  • Work with AWS services like S3, Redshift, EMR, Glue, and Athena to manage our data infrastructure.

  • Collaborate closely with the Analytics team to understand their data needs and deliver solutions.

  • Help develop and maintain our Node.js backend for data services, using TypeScript.

  • Use YAML to manage the settings for our data tools.

  • Set up and manage automated deployment processes (CI/CD) using GitHub Actions.

  • Monitor and fix problems in our data pipelines to keep them running smoothly.

  • Implement checks to ensure our data is accurate and consistent.

  • Help design and build data warehouses and data lakes.

  • Use SQL extensively to query and work with data in different systems.

  • Work with streaming data using technologies like Kafka for real-time data processing.

  • Stay updated on the latest data engineering technologies.

  • Guide and mentor junior data engineers.

  • Help define data management standards and procedures.

What You'll Need:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

  • 6-8 years of experience as a Data Engineer.

  • Strong skills in Spark and Scala for handling large amounts of data.

  • Solid experience with Airflow for orchestrating data workflows, including a good understanding of DAGs.

  • Solid understanding of how to transform and prepare data.

  • Strong programming skills in Python for data tasks and automation.

  • Proven experience working with AWS cloud services (S3, Redshift, EMR, Glue, IAM, EC2, and Athena).

  • Experience building data solutions for Analytics teams.

  • Familiarity with Node.js for backend development.

  • Experience with TypeScript for backend development is a plus.

  • Experience using YAML for configuration management.

  • Hands-on experience with GitHub Actions for automated deployment (CI/CD).

  • Good understanding of data warehousing concepts.

  • Strong database skills across OLAP and OLTP systems.

  • Excellent command of SQL for data querying and manipulation.

  • Experience with stream processing using Kafka or similar technologies.

  • Excellent problem-solving, analytical, and communication skills.

  • Ability to work well independently and as part of a team.

Bonus Points:

  • Familiarity with data lake technologies (e.g., Delta Lake, Apache Iceberg).

  • Experience with other stream processing technologies (e.g., Flink, Kinesis).

  • Knowledge of data management, data quality, statistics, and data governance frameworks.

  • Experience with tools for managing infrastructure as code (e.g., Terraform).

  • Familiarity with container technologies (e.g., Docker, Kubernetes).

  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana).
