Data Engineer - Data Engineering
San Francisco
Plaid Inc.
Plaid helps companies build fintech solutions by making it easy, safe and reliable for people to connect their financial data to apps and services.
The main goal of the Data Engineering team in 2024-25 is to build robust golden datasets to power our business goal of creating more insights-based products. Making data-driven decisions is key to Plaid's culture. To support that, we need to scale our data systems while keeping our data correct and complete. We provide tooling and guidance to teams across engineering, product, and business, and help them explore our data quickly and safely to get the insights they need, which ultimately helps Plaid serve our customers more effectively.

Data Engineers heavily leverage SQL and Python to build data workflows. We use tools like dbt, Airflow, Redshift, Elasticsearch, Athena, and Retool to orchestrate data pipelines and define workflows. We work with engineers, product managers, business intelligence, data analysts, and many other teams to build Plaid's data strategy and a data-first mindset.

Our engineering culture is IC-driven: we favor bottom-up ideation and empowerment of our incredibly talented team. We are looking for engineers who are motivated by creating impact for our consumers and customers, growing together as a team, shipping the MVP, and leaving things better than we found them.
You will be in a high-impact role that directly enables business leaders to make faster, better-informed decisions based on the datasets you build. You will have the opportunity to define the ownership and scope of internal datasets and visualizations across Plaid, a currently unowned area that we intend to take over and build SLAs for. You will learn best practices and level up your technical skills with our strong Data Engineering team and the broader Data Platform team. You will collaborate closely and build strong cross-functional partnerships with teams across Plaid, from Engineering and Product to Marketing and Finance.
Responsibilities
- Understanding different aspects of the Plaid product and strategy to inform golden dataset choices, design, and data usage principles.
- Keeping data quality and performance top of mind while designing datasets.
- Advocating for adopting industry tools and practices at the right time.
- Owning core SQL and Python data pipelines that power our data lake and data warehouse.
- Delivering well-documented datasets with defined quality, uptime, and usefulness.
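As an illustration of the SQL-plus-Python work these responsibilities describe, here is a minimal, hypothetical data-quality check of the kind a pipeline might run before publishing a golden dataset. It is only a sketch: sqlite3 stands in for a warehouse such as Redshift, and the `events` table, its columns, and the check names are all invented for the example.

```python
import sqlite3

def run_quality_checks(conn):
    """Run basic completeness and uniqueness checks on a hypothetical events table.

    Returns a dict mapping check name -> number of offending rows (0 means pass).
    """
    checks = {
        # completeness: every event must carry a user_id
        "null_user_id": "SELECT COUNT(*) FROM events WHERE user_id IS NULL",
        # uniqueness: event_id must not repeat; count the surplus copies
        "duplicate_event_id": """
            SELECT COALESCE(SUM(cnt - 1), 0)
            FROM (SELECT COUNT(*) AS cnt FROM events GROUP BY event_id)
        """,
    }
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT, user_id TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("e1", "u1"), ("e2", None), ("e2", "u3")],  # one NULL user_id, one duplicate event_id
)
failures = run_quality_checks(conn)
print(failures)  # {'null_user_id': 1, 'duplicate_event_id': 1}
```

In a real deployment the same SQL assertions would typically live as dbt tests or Airflow tasks rather than inline Python, so failures block downstream models instead of merely being reported.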
Qualifications
- 2+ years of dedicated data engineering experience, solving complex data pipeline issues at scale.
- You have experience building data models and data pipelines on top of large datasets (on the order of 500 TB to petabytes).
- You value SQL as a flexible and extensible tool and are comfortable with modern SQL-centric data tooling like dbt, Mode, and Airflow.
- [Nice to have] You have experience working with different performant warehouses and data lakes, such as Redshift, Snowflake, and Databricks.
- [Nice to have] You have experience building and maintaining batch and real-time pipelines using technologies like Spark and Kafka.