Data Engineering, Content Understanding

New York, NY

Spotify

We grow and develop and make wonderful things happen together every day. It doesn't matter who you are, where you come from, what you look like, or what music you love. Join the band!


Delivering the best Spotify experience possible. To as many people as possible. In as many moments as possible. That’s what the Experience team is all about. We use our deep understanding of consumer expectations to enrich the lives of millions of our users all over the world, bringing the music and audio they love to the devices, apps and platforms they use every day. Know what our users want? Join us and help Spotify give it to them.
As a Software Engineer on our Content Understanding teams, you will help define and build ML systems deployed at scale, supporting a broad range of use cases that drive value in media and catalog understanding.
We are looking for engineers who are enthusiastic about data to focus on building structured, high-quality data solutions. These solutions will be used to evolve our products, bringing better experiences to our users and the global artist community alike. We process petabytes of data using tools such as BigQuery, Dataflow and Pub/Sub. When needed, we also develop our own data tooling, such as Scio, a Scala API for Apache Beam, and Luigi, a Python framework for workflow scheduling.
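
To give a flavour of this stack, here is a minimal sketch of a Scio batch pipeline (the Scala API for Apache Beam mentioned above); the bucket paths and the word-count logic are illustrative assumptions rather than an actual Spotify job:

  import com.spotify.scio._

  // Minimal Scio batch pipeline: read text files, count words, write results.
  // Input/output paths are placeholders for illustration only.
  object WordCountSketch {
    def main(cmdlineArgs: Array[String]): Unit = {
      val (sc, args) = ContextAndArgs(cmdlineArgs)

      sc.textFile(args.getOrElse("input", "gs://example-bucket/input/*.txt"))
        .flatMap(_.split("""\s+""").filter(_.nonEmpty))
        .countByValue
        .map { case (word, count) => s"$word\t$count" }
        .saveAsTextFile(args.getOrElse("output", "gs://example-bucket/output"))

      sc.run().waitUntilFinish()
    }
  }

The same code runs locally on Beam's default DirectRunner and, when launched with --runner=DataflowRunner plus project and region options, on Dataflow at scale.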

What You'll Do

  • Build large-scale batch and real-time data pipelines with data processing frameworks such as Scio and Spark on Google Cloud Platform.
  • Leverage best practices in continuous integration and delivery.
  • Help drive optimisation, testing and tooling to improve data quality.
  • Collaborate with other Software Engineers, ML Engineers, Data Scientists and other stakeholders, taking on learning and leadership opportunities that will arise every single day.
  • Create and maintain metrics datasets and dashboards that power data-driven decisions.
  • Work in an agile team to continuously experiment, iterate and deliver on new product objectives.
  • Work on machine learning projects powering experiences tailored to each individual user.

Who You Are

  • You have professional data engineering experience and know how to work with high-volume, heterogeneous data, preferably on distributed systems such as Hadoop, Bigtable or Cassandra, or cloud platforms such as GCP, AWS or Azure.
  • You know the Scala language well and are interested in spreading that knowledge within the team.
  • You have experience with one or more higher-level JVM-based data processing frameworks such as Beam, Dataflow, Crunch, Scalding, Storm, Spark or Flink.
  • You might have worked with Docker as well as Luigi, Airflow, or similar tools.
  • You are passionate about crafting clean code and have experience in coding and building data pipelines.
  • You care about agile software processes, data-driven development, reliability, and responsible experimentation.
  • You understand the value of collaboration and partnership within teams.

Where You'll Be

  • For this role, you will be based in New York City, USA.

The United States base pay range for this position is $122,716 - $175,308, plus equity. The benefits available for this position include health insurance, six-month paid parental leave, a 401(k) retirement plan, a monthly meal allowance, 23 paid days off, 13 paid flexible holidays, and paid sick leave. These ranges may be modified in the future.
