Data Engineer
GCC, India
Advance Auto Parts
Job Description
WHO WE ARE
Come join our Technology Team and start reimagining the future of the automotive aftermarket. We are a highly motivated tech-focused organization, excited to be amid dynamic innovation and transformational change. Driven by Advance’s top-down commitment to empowering our team members, we are focused on delighting our Customers with Care and Speed, through delivery of world class technology solutions and products.
We value and cultivate our culture by seeking to always be collaborative, intellectually curious, fun, open, and diverse. You will be a key member of a growing and passionate group focused on collaborating across business and technology resources to drive forward key programs and projects building enterprise capabilities across Advance Auto Parts.
THE OPPORTUNITY
Advance Auto Parts, a leader in the automotive aftermarket, currently has an opening for a Data Engineer.
- Building the Foundation: Data engineers create the infrastructure to collect, store, and process this vast digital data. They design pipelines to move data between systems, ensuring accessibility, security, and quality.
- Data Warehousing & Modeling: They develop data warehouses and data models tailored for analysis within the digital space (e.g., understanding customer journeys across multiple platforms).
- Working with Analytics Teams: Data Engineers partner closely with data analysts and data scientists to provide clean, well-structured data sets that enable meaningful insights.
- Enabling Business Impact: Insights derived from digital data help optimize marketing campaigns, personalize customer experiences, and improve product development; data engineers are the architects who make this possible.
Job Summary:
We seek a highly skilled Data Engineer to shape and lead our data collection, modeling, and analytics infrastructure. With expertise in tag management solutions, workflow orchestration, and a strong foundation in cloud technologies, you will design a robust data ecosystem that empowers business insights and decision-making.
Responsibilities & Duties:
- Architect and implement scalable, highly available, and secure data pipelines leveraging core AWS services (Lambda, IAM, Glue, S3, SNS, SQS, CloudWatch, EC2); a minimal event-driven sketch follows this list.
- Implement AWS infrastructure following best practices, and maintain architecture diagrams and system process documentation.
- Manage AWS services and optimize them for performance, cost, and security.
- Implement security measures and ensure compliance with industry standards such as GDPR and HIPAA.
- Automate tasks with tools such as AWS CloudFormation, Ansible, or Terraform, and create scripts for infrastructure management.
- Set up monitoring systems and troubleshoot AWS infrastructure, service, and application issues.
- Optimize AWS for performance, scalability, and cost efficiency through tuning and capacity planning.
- Work with teams, stakeholders, and management to deliver effective technical solutions.
- Stay current with AWS services, identify enhancements, and implement improvements.
- Test and implement disaster recovery plans, ensuring high availability for critical AWS systems.
- Collaborate with stakeholders across the organization to understand evolving data needs and design solutions that enable strategic business decision-making.
- Develop and maintain a robust data warehouse on Snowflake, optimizing data modeling and performance.
- Drive the development of complex data processing applications using Python/PySpark.
- Foster a collaborative work environment that promotes innovation and continuous improvement.
- Champion data quality, governance, and security, instituting processes that ensure data integrity across the organization.
- Proactively identify and implement opportunities to enhance the data platform, drive performance improvements, and introduce best-in-class solutions.
- Proven track record of building and mentoring high-performing data engineering teams.
- Proficiency with Git or other version control systems.
- Experience with CI/CD tools and pipelines for automated testing and deployment.
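As a rough illustration of the pipeline responsibility noted at the top of this list (not a prescribed implementation), the sketch below shows a minimal Python Lambda handler that reacts to S3 object-created notifications delivered via SQS, starts a Glue job, and publishes a status message to SNS. The bucket, job, and topic names are placeholder assumptions, not actual Advance Auto Parts resources.

```python
# Minimal sketch of an event-driven pipeline step (assumed resource names; adapt to real infrastructure).
import json
import os

import boto3

glue = boto3.client("glue")
sns = boto3.client("sns")

# Hypothetical configuration; in practice these would come from IAM-scoped environment variables.
GLUE_JOB_NAME = os.environ.get("GLUE_JOB_NAME", "example-curation-job")
STATUS_TOPIC_ARN = os.environ.get("STATUS_TOPIC_ARN", "arn:aws:sns:us-east-1:123456789012:example-status")


def handler(event, context):
    """Triggered by SQS messages that wrap S3 object-created notifications."""
    for record in event.get("Records", []):
        s3_event = json.loads(record["body"])  # SQS body carries the S3 notification JSON
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]

            # Kick off a Glue job run for the newly landed object.
            run = glue.start_job_run(
                JobName=GLUE_JOB_NAME,
                Arguments={"--source_path": f"s3://{bucket}/{key}"},
            )

            # Publish a lightweight status event so downstream consumers can track progress.
            sns.publish(
                TopicArn=STATUS_TOPIC_ARN,
                Message=json.dumps({"object": f"s3://{bucket}/{key}", "job_run_id": run["JobRunId"]}),
            )

    return {"status": "ok"}
```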
Qualifications:
- Required:
- A minimum of 6 years in a data engineering role.
- Deep expertise in AWS cloud architecture and core services (Lambda, Glue, S3, SNS, SQS, CloudWatch, EC2).
- Experienced in designing and implementing AWS infrastructure following best practices.
- Extensive experience in managing and optimizing various AWS services such as Lambda, EC2, S3, Glue, CloudWatch, and RDS for performance and cost-efficiency.
- Skilled in implementing and maintaining security measures to safeguard AWS resources.
- Proficient in automating tasks using tools like AWS CloudFormation, Ansible, and Terraform.
- Proven track record in optimizing AWS infrastructure for performance, scalability, and cost-effectiveness.
- Proven ability to leverage Python (boto3, requests, json) and PySpark (Parquet files, partitioning, data transformations, optimizations) for efficient data processing and insightful data analysis; a short PySpark sketch follows the Required list.
- Proven success in designing and implementing ETL solutions using Talend or similar tools.
- Experience in data warehouse design and optimization with Snowflake.
- Proficient in SQL for complex data manipulation, including joins, CTEs, de-duplication, aggregations, and data modeling principles.
- Skilled in utilizing advanced Snowflake concepts (cloning, clustering, transient tables, external stages, stored procedures, views, variant data processing) to solve complex data challenges and optimize data architecture design.
- Exposure to NoSQL databases (DynamoDB) and an eagerness to learn more.
- Proficient in infrastructure provisioning and management of various AWS services, including Lambda functions, layers, S3 buckets, IAM policies, SNS, and SQS using Terraform (minimum 1 year of experience). Experience upgrading Python runtimes and Terraform versions is highly valued.
- Skilled in using Git for collaborative development. Adept at resolving merge conflicts, cherry-picking specific commits, and reverting changes as needed.
- Possess exceptional data pipeline troubleshooting skills for rapid issue resolution, proactive risk mitigation during upgrades, and ensuring data quality.
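As a non-authoritative sketch of the Python/PySpark skills listed above, the snippet below reads Parquet data, keeps the latest record per key with a window function (the PySpark equivalent of a CTE plus ROW_NUMBER() de-duplication in SQL), and writes a partitioned Parquet output. The paths and column names (order_id, updated_at, order_date) are illustrative assumptions.

```python
# Illustrative PySpark job: de-duplicate and repartition Parquet data (assumed paths and columns).
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dedup-example").getOrCreate()

# Read raw Parquet data (placeholder path).
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Keep only the latest record per order_id, mirroring a typical CTE + ROW_NUMBER() pattern in SQL.
latest_first = Window.partitionBy("order_id").orderBy(F.col("updated_at").desc())
deduped = (
    orders
    .withColumn("rn", F.row_number().over(latest_first))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Write the curated data partitioned by order_date for efficient downstream reads.
(
    deduped
    .write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/orders/")
)
```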
- Preferred:
- Experience with Google Cloud Platform (GCP) and relevant services.
- Knowledge of Kafka for real-time data streaming; a brief consumer sketch follows this list.
- Understanding of data privacy regulations like GDPR or CCPA.
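For the preferred Kafka exposure noted above, here is a minimal consumer sketch using the kafka-python library; the topic name, broker address, and consumer group id are assumptions for illustration only.

```python
# Minimal Kafka consumer sketch (kafka-python); broker, topic, and group id are placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "example-clickstream",                     # assumed topic name
    bootstrap_servers=["localhost:9092"],      # assumed broker address
    group_id="example-data-eng",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # In a real pipeline this is where events would be validated and landed (e.g., to S3).
    print(message.topic, message.partition, message.offset, event)
```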
California Residents click below for Privacy Notice:
https://jobs.advanceautoparts.com/us/en/disclosures