Data Engineer
Remote
Sparkland
Sparkland is the HFT firm helping to create the future of crypto in real time. Join our remote team today. We are a team of highly driven individuals who are passionate about technology, algorithmic trading, and solving intellectually challenging problems. Being part of Sparkland means working with some of the brightest people in one of the world’s fastest-growing and most exciting industries. We are fully remote and have a flat corporate structure that values open-mindedness, entrepreneurial spirit, commitment to excellence, and continuous learning.
The Role
We are looking for a Data Engineer to help us build and maintain the data backbone of our trading platform. You will work on high-volume data pipelines, ensure the reliability and observability of our infrastructure, and prepare the system for upcoming ML initiatives. If you have worked with modern data stacks, enjoy building efficient pipelines, and thrive in environments where data precision and scalability matter, this role might be for you.
Responsibilities
- Design and maintain robust data pipelines to support real-time and batch processing.
- Manage and optimize our ClickHouse data warehouse, including cluster performance and schema tuning.
- Ensure data quality, observability, and governance across critical pipelines.
- Collaborate with backend engineers, trading teams, and data stakeholders to align on data requirements.
- Support internal initiatives by building tooling and monitoring for business and technical metrics.
- Take ownership of scheduling and workflow orchestration (Argo, Airflow, etc.) and contribute to CI/CD automation.
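As a rough illustration of the orchestration work above, here is a minimal Airflow DAG sketch for a small batch pipeline. This is not Sparkland's actual setup: the DAG name, tasks, and schedule are hypothetical, and the team may equally use Argo Workflows.

```python
# Illustrative sketch only: a minimal Airflow DAG of the kind this role would own.
# All names (hourly_trade_batch, extract_trades, load_to_warehouse) are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_trades(**context):
    # Placeholder: pull the previous hour's trade records from an upstream source.
    print("extracting trades for", context["ds"])


def load_to_warehouse(**context):
    # Placeholder: write the extracted batch into the analytics warehouse.
    print("loading batch into warehouse")


with DAG(
    dag_id="hourly_trade_batch",      # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",               # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_trades", python_callable=extract_trades)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> load                   # simple linear dependency: extract, then load
```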
Required Skills & Experience
- At least 5 years of professional experience in data engineering or backend infrastructure.
- Proficiency in Python, including object-oriented programming and testing.
- Solid experience with SQL: complex joins, window functions, and performance optimization.
- Hands-on experience with ClickHouse (especially the MergeTree engine family) or similar columnar databases; a brief sketch follows this list.
- Familiarity with workflow schedulers (e.g., Argo Workflows, Airflow, or Kubeflow).
- Understanding of Kafka architecture (topics, partitions, producers, consumers); see the consumer sketch after this list.
- Comfortable with CI/CD pipelines (GitLab CI, ArgoCD, GitHub Actions).
- Experience with monitoring and BI tools such as Grafana for technical/business dashboards.
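To ground the ClickHouse and SQL points, here is a minimal sketch that creates a MergeTree table and runs a window-function query. It assumes the clickhouse-connect Python client and a locally reachable server; the trades table, its columns, and the moving-average query are hypothetical examples, not part of our schema.

```python
# Illustrative sketch only: a MergeTree table plus a window-function query
# via the clickhouse-connect client. Table and column names are hypothetical.
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost", port=8123)  # assumed local instance

# MergeTree family: rows are sorted by ORDER BY and partitioned for pruning and TTL.
client.command("""
    CREATE TABLE IF NOT EXISTS trades (
        symbol   LowCardinality(String),
        ts       DateTime64(3),
        price    Float64,
        quantity Float64
    )
    ENGINE = MergeTree
    PARTITION BY toDate(ts)
    ORDER BY (symbol, ts)
""")

# Window function: a 10-row moving average of price per symbol, ordered by time.
result = client.query("""
    SELECT
        symbol,
        ts,
        price,
        avg(price) OVER (PARTITION BY symbol ORDER BY ts
                         ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS sma_10
    FROM trades
    ORDER BY symbol, ts
    LIMIT 10
""")
print(result.result_rows)
```

Putting symbol first in ORDER BY keeps each instrument's history contiguous on disk, which is the usual trade-off when most queries filter by symbol.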
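For the Kafka requirement, here is a minimal consumer sketch using the kafka-python client; the broker address, topic name, and consumer group are hypothetical, and the real ingest path may differ.

```python
# Illustrative sketch only: consuming a partitioned topic with kafka-python.
# Broker address, topic name, and group id are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "market-data.trades",                 # topic; partitions are assigned within the group
    bootstrap_servers=["localhost:9092"],
    group_id="pipeline-ingest",           # consumers in one group split the partitions between them
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    # message.partition and message.offset show where in the topic this record sits
    record = message.value
    print(message.partition, message.offset, record)
```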
Bonus Points
- Experience with AWS services (S3, EKS, RDS).
- Familiarity with Kubernetes and Helm for deployment and scaling.
- Exposure to data quality/observability frameworks.
- Experience supporting ML infrastructure (e.g., feature pipelines, training data workflows).