Staff Software Engineer, Data Platform
Remote - United States
The Data Warehouse Platform team is looking to hire a Staff software Engineer who is excited to solve large scale data platform and efficiency challenges.
Reddit’s mission is to bring community and belonging to everyone in the world. Reddit is a community of communities where people can dive into anything through experiences built around their interests, hobbies, and passions. With more than 50 million people visiting 100,000+ communities daily, it is home to the most open and authentic conversations on the internet. From pets to parenting, skincare to stocks, there’s a community for everybody on Reddit. For more information, visit redditinc.com.
Our community of users generates over 100B analytics events per day, each of which is ingested into a data warehouse that sees 55,000+ daily queries. We utilize this data to enable both batch and streaming based data analytics at the company. Critical teams such as ads, feed generation, and ML experimentation rely on the Data Warehouse to generate revenue for Reddit.
As a staff engineer, you will work within and across teams to create and improve scalable, fault tolerant, self-serve systems. You will also create and execute a strategy to ensure a high level of end-to-end data quality and help shape the data culture across all of Reddit!
Key projects associated with this role include:
- Building a user-friendly Data Warehouse platform at Reddit, which seeks to abstract technical details away from users of the warehouse and enable non-technical individuals to perform data analytics in a self-service manner. This project will involve gaining a deep understanding of user intent, developing clear interfaces and contracts for the platform, and building business-level abstractions on top of Kubernetes, BigQuery, and Airflow technologies.
- Driving improved Data Warehouse efficiency to flatten the rate of growth for analytics expenditure at Reddit. This may include, among other tasks: developing clear frameworks for priority tiering and cost attribution, introducing retention policies and data warehousing best practices, and analyzing alternative data storage solutions for some of the company’s most intensive batch workloads.
In your day-to-day, you can expect to:
- Contribute to developing the team and organization’s long term strategy
- Own the data warehouse that stores 100B+ daily events and exposes them to teams across Reddit for analytics and machine learning purposes.
- Refine and maintain our offline data technologies and associated tooling to support the analysis of hundreds of millions of users.
- Mentor other engineers on how to design, build, and evangelize services used by hundreds of engineers across Reddit
Who you might be:
- 7+ years of coding experience in a production setting writing clean, maintainable, and well-tested code.
- Excellent communication skills to collaborate with stakeholders in engineering, data science, machine learning, and product.
- Experience with object-oriented programming languages such as Go, Python, Scala, or Java.
- Degree in Computer Science or equivalent technical field.
- Experience working with Kubernetes, BigQuery, Airflow, Terraform, Helm, Prometheus, Debezium, and CI/CD.
- Experience developing user friendly platform technologies.
- Experience solving large scale efficiency problems.
Benefits:
- Comprehensive Healthcare Benefits
- 401k Matching
- Workspace benefits for your home office
- Personal & Professional development funds
- Family Planning Support
- Flexible Vacation (please use them!) & Reddit Global Wellness Days
- 4+ months paid Parental Leave
- Paid Volunteer time off
#LI-remote, #LI-JS5
Pay Transparency:
This job posting may span more than one career level.
In addition to base salary, this job is eligible to receive equity in the form of restricted stock units, and depending on the position offered, it may also be eligible to receive a commission. Additionally, Reddit offers a wide range of benefits to U.S.-based employees, including medical, dental, and vision insurance, 401(k) program with employer match, generous time off for vacation, and parental leave. To learn more, please visit https://www.redditinc.com/careers/.
To provide greater transparency to candidates, we share base pay ranges for all US-based job postings regardless of state. We set standard base pay ranges for all roles based on function, level, and country location, benchmarked against similar stage growth companies. Final offer amounts are determined by multiple factors including, skills, depth of work experience and relevant licenses/credentials, and may vary from the amounts listed below.
The base pay range for this position is:$217,000—$303,900 USDReddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve. Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If, due to a disability, you need an accommodation during the interview process, please let your recruiter know.
Tags: Airflow BigQuery CI/CD Computer Science Data Analytics Data quality Data warehouse Data Warehousing Engineering Helm Java Kubernetes Machine Learning OOP Python Scala Streaming Terraform
Perks/benefits: 401(k) matching Career development Equity / stock options Flex hours Flex vacation Health care Insurance Medical leave Parental leave Startup environment Team events
More jobs like this
Explore more career opportunities
Find even more open roles below ordered by popularity of job title or skills/products/technologies used.