Data Engineer
Houston, TX
PDI Technologies
PDI Technologies helps fuel and convenience businesses increase productivity, profitability, loyalty, and security by Connecting Convenience.
At PDI Technologies, we empower some of the world's leading convenience retail and petroleum brands with cutting-edge technology solutions that drive growth and operational efficiency. By “Connecting Convenience” across the globe, we help businesses increase productivity, make more informed decisions, and engage faster with customers through loyalty programs, shopper insights, and unmatched real-time market intelligence via mobile applications such as GasBuddy. We’re a global team committed to excellence, collaboration, and driving real impact. Explore our opportunities and become part of a company that values diversity, integrity, and growth.
Role Overview
We are seeking a talented and adaptable Data Engineer to join our growing team. The ideal candidate has deep experience in Databricks, Apache Spark, and the Azure data ecosystem, along with a forward-looking mindset for cloud migration to AWS. This role focuses on building robust, scalable data pipelines while also preparing for a strategic shift toward a cloud-agnostic or AWS-based architecture. You'll work within a modern Lakehouse environment, integrating with a custom-built Databricks accelerator, and supporting both current Azure infrastructure and future AWS transformation.
Key Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines using Databricks, Apache Spark, and cloud-native services.
- Leverage Azure data services (Data Lake, SQL DB, Service Bus, Cosmos DB) and OLTP systems for ingestion, transformation, and pipeline orchestration.
- Implement and manage Change Data Capture (CDC) processes, including change tracking and change data feed, to support incremental updates.
- Plug into and extend a custom metadata-driven Databricks accelerator framework used to manage Lakehouse operations.
- Optimize the downstream data environment by proactively identifying and resolving upstream data issues.
- Utilize Delta Lake, Unity Catalog, Delta Live Tables, and Databricks Workflows to enforce governance and streamline operations.
- Participate in planning and preparing for a future migration from Azure to AWS, including cross-training, architecture evaluation, and knowledge transfer.
Required Experience and Skills
- Strong expertise in Databricks and Apache Spark, especially within large-scale data environments.
- Proficiency in SQL and Python for data transformation and pipeline automation.
- Hands-on experience with key Azure services, including Storage, SQL DB, Cosmos DB, Service Bus, and OLTP systems.
- Experience implementing CDC mechanisms and managing real-time or batch data syncs.
- Familiarity with Delta Lake architecture, Unity Catalog, Delta Live Tables, and job orchestration in Databricks.
- Ability to diagnose and resolve upstream data quality issues to improve downstream consumption and performance.
- Willingness and ability to contribute to an upcoming cloud migration to AWS, with foundational knowledge or strong interest in AWS-native data tooling.
Preferred Qualifications
- Experience supporting data modernization or cloud migration initiatives (especially Azure to AWS).
- Familiarity with AWS data tools such as S3, Glue, Redshift, Kinesis, or DynamoDB is a plus.
- Background in metadata-driven architectures or accelerator-style data platforms.
- Strong collaboration skills and experience working with data architects, product owners, and analytics stakeholders.
Category: Engineering Jobs
Tags: Architecture, AWS, Azure, Cosmos DB, Databricks, Data pipelines, Data quality, DynamoDB, ELT, ETL, Kinesis, Pipelines, Python, Redshift, Spark, SQL
Perks/benefits: Career development, Competitive pay
Region: North America
Country: United States