Software Engineer - Data Orchestration
NY Office
Parable
Discover how Parable helps businesses manage their most valuable resource: time. Our AI solutions provide insights into how your organization spends its time, allowing teams to focus on what matters most.
We are opening the search for a critical role at Parable and hiring a Software Engineer focused on Data Orchestration.
This person will play an essential role in building the data infrastructure that transforms how companies understand and optimize their most precious resource - time.
As a key member of our data platform team, you'll design and implement the scalable data orchestration systems that power our AI-driven insights, working directly with our ML and AI Engineering teams to ensure data flows seamlessly throughout our platform.
If you're excited about building sophisticated data systems while working with seasoned entrepreneurs on a mission to make time matter in a world that hijacks our attention, we'd love to talk.
This role is for someone who:
Is passionate about building robust, scalable data systems. You're not just a developer - you're an architect who thinks deeply about data flows, pipeline efficiency, and system reliability. You've spent years building data infrastructure, and you're constantly exploring new approaches and technologies.
Combines technical excellence with business impact. You can architect complex data orchestration systems and write efficient code, but you never lose sight of what truly matters - enabling Research teams to deliver insights to customers. You're as comfortable diving deep into technical specifications as you are collaborating with ML engineers to understand their data processing needs.
Has deep expertise in data engineering. You understand the intricacies of building reliable data pipelines at scale, with experience in modern data processing frameworks like PySpark and Polars. You have a knack for solving complex data integration challenges and a passion for data quality and integrity.
Is a lean experimenter at heart. You believe in shipping to learn, but you also know how to build for scale. You have a track record of delivering results in one-third the time that most competent engineers think possible, not by cutting corners, but through smart architectural decisions and iterative development.
Exercises extreme ownership. You take full responsibility for your work, cast no blame, and make no excuses. When issues arise, you're the first to identify solutions rather than point fingers. You see it as your obligation to challenge decisions when you disagree, and you actively seek scrutiny of your own ideas.
Your responsibilities will include:
Working closely with ML and AI Engineering teams to design, build, and maintain orchestration solutions and pipelines that enable ML/AI teams to self-serve the development and deployment of data flows at scale
Ensuring data integrity, quality, privacy, security, and accessibility for internal and external clients
Participating in the development of robust systems for data ingestion, transformation, and delivery across our platform
Creating efficient data workflows that balance performance, resource utilization, and ease of use for AI/ML teams
Implementing monitoring and observability solutions for data pipelines to ensure reliability
Researching and experimenting with new data platform technologies and solutions
Establishing best practices for data orchestration and pipeline development
Collaborating with cross-functional teams to understand data requirements and deliver solutions
Contributing to our infrastructure-as-code practices on Google Cloud Platform
In this role, you will:
Work with our Data Platform and ML teams to build highly scalable data pipelines, data lakes, and orchestration services
Enable the ML and AI Engineering teams to deploy their solutions with reliable and efficient data processing workflows
Help lay the groundwork for a scalable and secure data practice
Write production-grade code in Python, Rust, and SQL
Contribute to our Google Cloud Platform infrastructure using Infrastructure as Code
Implement monitoring and alerting for critical data pipelines
Experiment rapidly to deliver learnings and results in the first month
Help foster a community of technical and professional development
What you'll bring:
5+ years of experience building enterprise-grade data products and systems
Strong expertise in data orchestration frameworks and technologies
Demonstrated experience with PySpark, Polars, data lakes, and distributed data processing concepts
Proficiency in Python and/or Rust for production pipeline code
Experience connecting and integrating external data sources, specifically SaaS APIs
Familiarity with cloud platforms, particularly Google Cloud Platform
Knowledge of data modeling, schema design, and data governance principles
Experience with containerization and infrastructure-as-code
Bachelor's degree in Computer Science, Machine Learning, Information Science, or related field preferred