002ATM - Senior Data Engineer
Tamil Nadu, Coimbatore, India
Augusta Hitech
Name of the position : Data Engineer
Location : India
Time Zone : UK
Remote : Yes
No. of resources needed for this position : 01
Type of contract : Contract
Years of experience : 6+ Years
Tentative Start Date : ASAP
Overview
We are looking for a talented Data Engineer to design, build, and optimize scalable data pipelines using AWS Glue and other AWS services. You will work with large datasets, implement ETL processes, and integrate external systems to ensure efficient data flow and high data quality. Strong experience with Python, SQL, and cloud technologies is essential. This role offers an opportunity to work on innovative data solutions while collaborating with cross-functional teams.
RESPONSIBILITIES:
Develop, maintain, and optimize AWS Glue jobs using Python to process, transform, and load large volumes of data.
Design and implement ETL (Extract, Transform, Load) processes to ensure seamless data integration from various sources.
Write efficient and scalable SQL queries for handling large datasets, ensuring optimal performance and data integrity.
Integrate external systems with data pipelines through API-based integration, including connecting and extracting data from Salesforce APIs & ERP systems.
Leverage AWS Glue-related services, such as the AWS Glue Data Catalog, Glue Crawlers, and Glue Triggers, to build and maintain scalable data pipelines.
Optimize data pipelines for performance and cost-efficiency, leveraging AWS services like S3, Redshift, and Lambda.
Ensure data governance, data quality, and security practices are followed while designing data workflows.
Collaborate with the team to ensure smooth deployment, monitoring, and scaling of data pipelines.
Perform root-cause analysis of data issues and proactively identify improvements in the data engineering process.
QUALIFICATIONS:
Strong experience in Python & PySpark programming, with a focus on writing AWS Glue jobs for large-scale data processing.
Advanced knowledge of SQL, including the ability to write complex, optimized queries for large datasets.
Hands-on experience with AWS Glue and related services, such as Glue Catalog, Glue Crawlers, Glue Triggers, and Glue ETL.
Familiarity with other AWS services like S3, Redshift, Lambda, and IAM for managing data pipelines.
Deep understanding of ETL processes, data integration, and data transformation methodologies.
Experience in handling large volumes of structured and unstructured data, ensuring scalability and data integrity.
Familiarity with performance tuning of SQL queries, data partitioning, and data compression techniques.
Strong analytical and problem-solving skills with a focus on optimizing data workflows for performance and cost.
Category: Engineering Jobs
Tags: APIs AWS AWS Glue Data governance Data pipelines Data quality Engineering ETL Lambda Pipelines PySpark Python Redshift Salesforce Security SQL Unstructured data
Region: Asia/Pacific
Country: India