Data Engineer (Python | Kafka | Postgres)

Remote


This is a remote position.

About the Role

As a Data Engineer you will focus on developing, optimizing, and maintaining robust ETL pipelines using Python and Apache Spark, while also supporting a migration to Snowflake-based solutions. You'll be responsible for writing efficient SQL, designing scalable data workflows, and ensuring performance and reliability through monitoring and alerting tools such as Airflow. Collaboration with cross-functional teams is key, as is contributing to code reviews and adhering to best practices across Python, Spark, and Snowflake. Ideal candidates will have strong experience with distributed data systems and cloud environments, and will be available to work hours that overlap both U.S. Eastern and Central European time zones.

What You'll Do
  • Develop & Maintain ETL Pipelines: Build, optimize, and troubleshoot data ingestion, transformation, and loading processes primarily in Python.
  • Spark Workloads: Design and tune Spark jobs (batch or streaming) for existing production workflows.
  • Snowflake Migration: Lead development of new ETL/ELT processes in Snowflake (e.g., using Snowpipe, Snowflake Tasks, stored procedures) and refactor existing Spark pipelines into Snowflake-based solutions.
  • SQL Development: Write complex SQL queries for data modeling, transformation, and performance tuning in Snowflake (and occasionally on other RDBMSs).
  • Collaboration & Documentation: Work closely with data analysts, data scientists, and DevOps to gather requirements, document pipelines, and define data quality checks.
  • Monitoring & Alerting: Implement monitoring (e.g., using Airflow, dbt, or custom scripts) and alerting for data workflows to ensure SLAs are met.
  • Code Reviews & Best Practices: Contribute to code reviews, establish coding standards, and share knowledge about Python, Spark, and Snowflake best practices.
  • Performance Optimization: Profile and optimize Python code, Spark jobs, and Snowflake queries to ensure cost efficiency and acceptable SLAs.
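
For illustration only, a minimal sketch of the kind of PySpark batch ETL described above. The bucket paths, column names, and unique key are hypothetical placeholders, not part of this role's actual pipelines:

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical example: read raw events from cloud storage, clean them,
    # and write date-partitioned Parquet for downstream loading into a warehouse.
    spark = SparkSession.builder.appName("daily_events_etl").getOrCreate()

    raw = spark.read.parquet("s3://example-bucket/raw/events/")  # placeholder path

    cleaned = (
        raw.dropDuplicates(["event_id"])                  # assumed unique key
           .filter(F.col("event_ts").isNotNull())
           .withColumn("event_date", F.to_date("event_ts"))
    )

    # Partitioning by date lets downstream jobs load incrementally instead of
    # rescanning the full dataset.
    (cleaned.write.mode("overwrite")
            .partitionBy("event_date")
            .parquet("s3://example-bucket/curated/events/"))  # placeholder path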
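
Similarly, a rough sketch of how monitoring and alerting might be wired up in Airflow; the schedule, alert address, SLA, and spark-submit command are assumptions for illustration, not this team's actual configuration:

    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Hypothetical DAG: run the Spark ETL daily and e-mail the team on failure
    # or on an SLA miss. All names and values below are placeholders.
    default_args = {
        "owner": "data-eng",
        "email": ["data-alerts@example.com"],
        "email_on_failure": True,
        "retries": 1,
        "retry_delay": timedelta(minutes=10),
    }

    with DAG(
        dag_id="daily_events_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        default_args=default_args,
        catchup=False,
    ) as dag:
        run_etl = BashOperator(
            task_id="run_spark_etl",
            bash_command="spark-submit etl/daily_events_etl.py",  # placeholder command
            sla=timedelta(hours=2),  # record an SLA miss if the run exceeds 2 hours
        )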

What You'll Need
  • 3+ years of professional experience in data engineering or a closely related field.
  • Proficiency in Python (including libraries such as pandas, PySpark, or equivalent).
  • Hands-on experience building and maintaining Apache Spark–based ETL pipelines.
  • Strong SQL skills (writing complex queries, window functions, CTEs, performance tuning).
  • Familiarity with Snowflake (e.g., designing schemas, writing Snowflake-specific SQL, understanding cost optimization in a cloud data warehouse); a small example query follows this list.
  • Experience with workflow schedulers/orchestrators such as Apache Airflow, Prefect, or similar.
  • Knowledge of data serialization formats (Parquet, Avro, ORC) and cloud storage (e.g., S3, Azure Blob, GCS).
  • Comfort working with distributed compute frameworks and cloud environments (AWS, GCP, or Azure).
  • Understanding of distributed system design patterns (e.g., partitioning, shuffling, scaling out).
  • Excellent verbal and written communication skills.
  • Ability to collaborate asynchronously across time zones.
  • Immediate or very near-term availability is preferred.
  • Willingness to work hours that overlap U.S. Eastern Time (ET) and Central European Time (CET/CEST), as our core team operates across these regions.
  • Awareness of IT security best practices as defined by ISO/SOC or similar frameworks.
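
As an indication of the level of SQL expected, a small sketch using the snowflake-connector-python package; the connection details and the orders table are placeholders, and real credentials would come from a secrets manager rather than source code:

    import snowflake.connector

    # Hypothetical connection; all identifiers below are illustrative only.
    conn = snowflake.connector.connect(
        account="example_account",
        user="example_user",
        password="example_password",
        warehouse="ANALYTICS_WH",
        database="ANALYTICS",
        schema="PUBLIC",
    )

    # A CTE plus a window function: pick each customer's most recent order
    # from a placeholder orders table.
    sql = """
        WITH ranked AS (
            SELECT customer_id,
                   order_id,
                   order_ts,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id ORDER BY order_ts DESC
                   ) AS rn
            FROM orders
        )
        SELECT customer_id, order_id, order_ts
        FROM ranked
        WHERE rn = 1
    """

    cur = conn.cursor()
    try:
        cur.execute(sql)
        for row in cur.fetchall():
            print(row)
    finally:
        cur.close()
        conn.close()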

Why Aspire
In addition to competitive long-term total compensation (salary plus a performance-based bonus), our reward philosophy goes beyond financials:

  • Be part of a “Remote is here-to-stay” organization.
  • Work and learn from experienced global tech leaders.
  • Continuous growth via technical and soft skills training programs.
  • Access to international conferences (virtual and onsite).
  • Nursery reimbursement benefit.
  • Exposure to work in an IT environment that adheres to rigorous security and compliance standards defined by ISO/SOC.



