Data Engineer
Remote
Sparkland
Sparkland is the HFT firm helping to create the future of crypto in real time. Join our remote team today. We are a team of highly driven individuals who are passionate about technology, algorithmic trading, and solving intellectually challenging problems. Being part of Sparkland means working with some of the brightest people in one of the world’s fastest-growing and most exciting industries. We are fully remote and have a flat corporate structure that values open-mindedness, entrepreneurial spirit, commitment to excellence, and continuous learning.
The Role
We are looking for a Data Engineer to help us build and maintain the data backbone of our trading platform. You will work on high-volume data pipelines, ensure the reliability and observability of our infrastructure, and prepare the system for upcoming ML initiatives. If you have worked with modern data stacks, enjoy building efficient pipelines, and thrive in environments where data precision and scalability matter, this role might be for you.
Responsibilities
- Design and maintain robust data pipelines to support real-time and batch processing.
- Manage and optimize our ClickHouse data warehouse, including cluster performance and schema tuning.
- Ensure data quality, observability, and governance across critical pipelines.
- Collaborate with backend engineers, trading teams, and data stakeholders to align on data requirements.
- Support internal initiatives by building tooling and monitoring for business and technical metrics.
- Take ownership of scheduling and workflow orchestration (Argo, Airflow, etc.) and contribute to CI/CD automation.
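As a rough illustration of the orchestration work above, here is a minimal Airflow DAG sketch for a small batch pipeline. This is not Sparkland's actual setup: the DAG name, tasks, and schedule are hypothetical, and the team may equally use Argo Workflows.

```python
# Illustrative sketch only: a minimal Airflow DAG of the kind this role would own.
# All names (hourly_trade_batch, extract_trades, load_to_warehouse) are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_trades(**context):
    # Placeholder: pull the previous hour's trade records from an upstream source.
    print("extracting trades for", context["ds"])


def load_to_warehouse(**context):
    # Placeholder: write the extracted batch into the analytics warehouse.
    print("loading batch into warehouse")


with DAG(
    dag_id="hourly_trade_batch",      # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",               # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_trades", python_callable=extract_trades)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> load                   # simple linear dependency: extract, then load
```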
Required Skills & Experience
- At least 5 years of professional experience in data engineering or backend infrastructure.
- Proficiency in Python, including object-oriented programming and testing.
- Solid experience with SQL: complex joins, window functions, and performance optimization.
- Hands-on experience with ClickHouse (especially the MergeTree engine family) or similar columnar databases; a brief sketch follows this list.
- Familiarity with workflow schedulers (e.g., Argo Workflows, Airflow, or Kubeflow).
- Understanding of Kafka architecture (topics, partitions, producers, consumers); see the consumer sketch after this list.
- Comfortable with CI/CD pipelines (GitLab CI, ArgoCD, GitHub Actions).
- Experience with monitoring and BI tools such as Grafana for technical/business dashboards.
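To ground the ClickHouse and SQL points, here is a minimal sketch that creates a MergeTree table and runs a window-function query. It assumes the clickhouse-connect Python client and a locally reachable server; the trades table, its columns, and the moving-average query are hypothetical examples, not part of our schema.

```python
# Illustrative sketch only: a MergeTree table plus a window-function query
# via the clickhouse-connect client. Table and column names are hypothetical.
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost", port=8123)  # assumed local instance

# MergeTree family: rows are sorted by ORDER BY and partitioned for pruning and TTL.
client.command("""
    CREATE TABLE IF NOT EXISTS trades (
        symbol   LowCardinality(String),
        ts       DateTime64(3),
        price    Float64,
        quantity Float64
    )
    ENGINE = MergeTree
    PARTITION BY toDate(ts)
    ORDER BY (symbol, ts)
""")

# Window function: a 10-row moving average of price per symbol, ordered by time.
result = client.query("""
    SELECT
        symbol,
        ts,
        price,
        avg(price) OVER (PARTITION BY symbol ORDER BY ts
                         ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS sma_10
    FROM trades
    ORDER BY symbol, ts
    LIMIT 10
""")
print(result.result_rows)
```

Putting symbol first in ORDER BY keeps each instrument's history contiguous on disk, which is the usual trade-off when most queries filter by symbol.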
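For the Kafka requirement, here is a minimal consumer sketch using the kafka-python client; the broker address, topic name, and consumer group are hypothetical, and the real ingest path may differ.

```python
# Illustrative sketch only: consuming a partitioned topic with kafka-python.
# Broker address, topic name, and group id are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "market-data.trades",                 # topic; partitions are assigned within the group
    bootstrap_servers=["localhost:9092"],
    group_id="pipeline-ingest",           # consumers in one group split the partitions between them
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    # message.partition and message.offset show where in the topic this record sits
    record = message.value
    print(message.partition, message.offset, record)
```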
Bonus Points
- Experience with AWS services (S3, EKS, RDS).
- Familiarity with Kubernetes and Helm for deployment and scaling.
- Exposure to data quality/observability frameworks.
- Experience supporting ML infrastructure (e.g., feature pipelines, training data workflows).