Data Engineer (SQL, PySpark, Databricks)
Bangalore, KA, IN, 560100
Gainwell Technologies
Gainwell is a strategic partner and solution provider enabling public health programs to improve patient outcomes, reduce costs, and enhance provider experiences.
Summary
We’re looking for a dynamic Data Engineer with Apache Spark and AWS experience to join the data analytics team at Gainwell Technologies. You will work as part of a cross-functional team to define, design, and deploy frameworks for data collection, normalization, transformation, storage, and reporting on AWS in support of the analytic missions of Gainwell and its clients.
Your role in our mission
- Design, develop, and deploy data pipelines, including ETL processes for ingesting, processing, and delivering data with the Apache Spark framework (a minimal sketch follows this list).
- Monitor, manage, validate and test data extraction, movement, transformation, loading, normalization, cleansing and updating processes. Build complex databases that are useful, accessible, safe and secure.
- Coordinate with users to understand data needs and deliver data with a focus on data quality, data reuse, consistency, security, and regulatory compliance.
- Collaborate with team members on data models and schemas in our data warehouse.
- Collaborate with team members on documenting source-to-target mappings.
- Conceptualize and visualize data frameworks.
- Communicate effectively with various internal and external stakeholders.
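To make the pipeline responsibilities above concrete, here is a minimal PySpark ETL sketch of the extract-transform-validate-load shape this role involves. It is illustrative only: the claims dataset, S3 paths, and column names are hypothetical and not part of this posting.

```python
# Minimal PySpark ETL sketch: ingest, normalize, validate, and deliver data.
# All paths, column names, and the "claims" dataset are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("claims_etl").getOrCreate()

# Extract: read raw source data (hypothetical S3 location).
raw = spark.read.option("header", True).csv("s3://raw-bucket/claims/")

# Transform: normalize column names, drop duplicates, cleanse nulls,
# and standardize date formats.
cleaned = (
    raw.select([F.col(c).alias(c.strip().lower()) for c in raw.columns])
       .dropDuplicates(["claim_id"])
       .filter(F.col("claim_id").isNotNull())
       .withColumn("service_date", F.to_date("service_date", "yyyy-MM-dd"))
)

# Validate: a simple row-count check before loading.
if cleaned.count() == 0:
    raise ValueError("No valid rows after cleansing; aborting load")

# Load: write to the warehouse layer as partitioned Parquet.
(cleaned.write.mode("overwrite")
        .partitionBy("service_date")
        .parquet("s3://warehouse-bucket/claims/"))
```

A production pipeline would add schema enforcement, logging, and incremental loads, but this extract-transform-validate-load pattern is the core of the work described above.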
What we're looking for
- Bachelor's degree in computer sciences or related field
- 3 years of experience working with big data technologies on AWS/Azure/GCP
- 2 years of experience with the Apache Spark/Databricks framework (Python/Scala)
- Experience working with different database structures (e.g., transactional vs. data warehouse; see the sketch after this list)
- Databricks and AWS developer/architect certifications are a big plus
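For the database-structure point above, here is a brief illustration of the distinction, again using PySpark. The tables and values are made up: a transactional layout keeps narrow, normalized tables suited to fast writes, while a warehouse layout denormalizes them into a wide fact table suited to analytics.

```python
# Illustrative contrast between a transactional (normalized) layout and a
# warehouse (denormalized, star-schema-style) layout. All data is made up.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("structures_demo").getOrCreate()

# Transactional style: narrow, normalized tables keyed for fast writes.
customers = spark.createDataFrame(
    [(1, "Acme Health"), (2, "Beta Clinic")], ["customer_id", "name"])
orders = spark.createDataFrame(
    [(100, 1, 250.0), (101, 2, 75.5)], ["order_id", "customer_id", "amount"])

# Warehouse style: join the normalized tables into a wide fact table
# that analysts can query without repeated joins.
fact_orders = orders.join(customers, "customer_id").select(
    "order_id", "customer_id", "name", "amount")

fact_orders.show()
```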