AWS Data Engineer - Fully Remote - US Only
Plano, Texas, United States - Remote
❋ Why Scalepex?
Scalepex is a dynamic services firm specializing in providing solutions for premium brands like Nike, Pepsi, Toyota, Virgin and Walgreens. Our mission is to connect prominent market leaders with top-tier professionals from around the world, fostering collaboration, efficiency, and growth.
❋ Take your portfolio to the next level by working with one of our fastest-growing clients.
Join the Innovation Frontier at Scalepex!
About the Role
We are seeking an experienced AWS Data Engineer with a strong background in building scalable data solutions and expertise in utilities-related datasets. The ideal candidate will have at least 5 years of experience in data engineering, a deep understanding of distributed systems, and proficiency with AWS services and tools like Step Functions, Lambda, Glue, and Redshift. This role will focus on designing, developing, and optimizing data pipelines to support analytics and decision-making in the utilities industry.
Key Responsibilities
- Design and Build Data Pipelines: Develop scalable, reliable data pipelines using AWS services (e.g., Glue, S3, Redshift) to process and transform large datasets from utility systems like smart meters or energy grids.
- Workflow Orchestration: Use AWS Step Functions to orchestrate workflows across data pipelines; experience with Airflow is acceptable but Step Functions is preferred.
- Data Integration and Transformation: Implement ETL/ELT processes using PySpark, Python, and Pandas to clean, transform, and integrate data from multiple sources into unified datasets.
- Distributed Systems Expertise: Leverage experience with complex distributed systems to ensure reliability, scalability, and performance in handling large-scale utility data.
- Serverless Application Development: Use AWS Lambda functions to build serverless solutions for automating data processing tasks.
- Data Modeling for Analytics: Design data models tailored for utilities use cases (e.g., energy consumption forecasting) to enable advanced analytics.
- Optimize Data Pipelines: Continuously monitor and improve the performance of data pipelines to reduce latency, enhance throughput, and ensure high availability.
- Ensure Data Security and Compliance: Implement robust security measures to protect sensitive utility data and ensure compliance with industry regulations.
Requirements
Required Qualifications
- Minimum of 5 years of experience in data engineering.
- Proficiency in AWS services such as Step Functions, Lambda, Glue, S3, DynamoDB, and Redshift.
- Strong programming skills in Python with experience using PySpark and Pandas for large-scale data processing.
- Hands-on experience with distributed systems and scalable architectures.
- Knowledge of ETL/ELT processes for integrating diverse datasets into centralized systems.
- Familiarity with utilities-specific datasets (e.g., smart meters, energy grids) is highly desirable.
- Strong analytical skills with the ability to work on unstructured datasets.
- Knowledge of data governance practices to ensure accuracy, consistency, and security of data.
- Strong experience in AWS data engineering.
- Ability to work independently.
- Ability to work with cross-functional teams, including interfacing and communicating with business stakeholders.
- Professional oral and written communication skills.
- Strong problem-solving and troubleshooting skills, with experience exercising mature judgement.
- Excellent teamwork and interpersonal skills.
- Ability to obtain and maintain the required clearance for this role.
Tags: Airflow Architecture AWS Data governance Data pipelines Distributed Systems DynamoDB ELT Engineering ETL Lambda Pandas Pipelines PySpark Python Redshift Security Step Functions