Staff Data Engineer

Remote - USA

Base

Base is a secure, low-cost, builder-friendly Ethereum L2 built to bring the next billion users onchain.



Base is planning to bring a million developers and a billion users onchain. We need your help to make that happen.

The data engineering team develops and maintains robust data pipelines, builds trusted data sources, and creates analytics/data products that inject automation into data science processes, with the goal of empowering users through self-serve analytics.

What we do:

  1. Trusted data sources: Build and maintain a foundational data layer (data marts) that serves as the single source of truth across Coinbase.
  2. Reliable data pipelines: Design and implement robust data pipelines, guaranteeing data quality and timely data delivery across our organization.
  3. Data science developer tools: Build developer tools that inject automation into data science processes, improving efficiency and productivity; for example, tooling for data transformation, data modeling, and data quality (a minimal sketch follows this list).
  4. Self-serve analytics products: Deliver tailored data products, designed to empower users with self-serve capabilities and ensure accurate answers to their data inquiries.
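To give a concrete flavor of the data quality tooling mentioned in item 3, here is a minimal Python sketch of a completeness and freshness check over a batch of blocks. The column names and metrics are hypothetical illustrations, not a description of Coinbase's actual tooling.

```python
# Minimal sketch of a data quality check over block-level data.
# Assumes a pandas DataFrame with hypothetical columns `block_number`
# and `block_timestamp`; not an actual Coinbase/Base pipeline.
import pandas as pd

def check_block_batch(df: pd.DataFrame) -> dict:
    """Return simple completeness and freshness metrics for a batch of blocks."""
    blocks = df["block_number"].sort_values()
    expected = int(blocks.iloc[-1] - blocks.iloc[0] + 1)   # size of the contiguous range
    missing = expected - blocks.nunique()                  # gaps in the range
    duplicates = len(blocks) - blocks.nunique()            # repeated block numbers
    lag = (pd.Timestamp.now(tz="UTC") - df["block_timestamp"].max()).total_seconds()
    return {
        "missing_blocks": missing,
        "duplicate_blocks": duplicates,
        "freshness_lag_seconds": lag,
    }

if __name__ == "__main__":
    sample = pd.DataFrame({
        "block_number": [100, 101, 103],                   # block 102 is missing
        "block_timestamp": pd.to_datetime(
            ["2024-01-01T00:00:00Z", "2024-01-01T00:00:02Z", "2024-01-01T00:00:06Z"]
        ),
    })
    print(check_block_batch(sample))
```

Checks like these are typically wired into pipeline runs so that failures block downstream data mart refreshes rather than surfacing later as stale dashboards.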

What you’ll be doing (i.e., job duties):

Your primary responsibilities will include building tools and products that enhance data science productivity, enable self-serve analytics, and ensure data reliability and quality. This role deals specifically with blockchain data. 

More specifically: 

  • Data modeling: Design data models that support our data scientists in leveraging blockchain data in their analysis.
  • Data transformation and tooling: Build tooling and pipelines for getting blockchain data from APIs, files, and databases into a clean and useful format (see the sketch after this list).
  • Data quality tooling: Build mechanisms of data quality measurement and enforcement. Ensure reliability and completeness of data.
  • Scrappy & creative problem solving: Get creative in applying blockchain knowledge to write heuristics and models that answer questions about what is happening onchain.
  • Cross-functional collaboration: Work alongside fellow data engineers and cross-functional partners from Data Science, Data Platform, Machine Learning, and various analytics teams to ensure alignment on priorities and deliverables.
  • Self-serve analytics: Enable stakeholders to self-serve their own analytics by making blockchain data available and approachable.
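As a loose illustration of the data transformation work described above, the sketch below pulls one block over the standard Ethereum JSON-RPC interface (eth_getBlockByNumber) and flattens its transactions into analysis-ready rows. The RPC URL is a placeholder and the selected columns are illustrative, not the team's actual schema.

```python
# Minimal sketch: fetch one block via standard Ethereum JSON-RPC and flatten
# its transactions into tabular rows. RPC_URL is a placeholder, and the
# column selection is illustrative rather than an actual Base schema.
import requests

RPC_URL = "https://example-rpc-endpoint"  # placeholder; point at a real node or provider

def fetch_block(block_number: int) -> dict:
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getBlockByNumber",
        "params": [hex(block_number), True],   # True => include full transaction objects
    }
    resp = requests.post(RPC_URL, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["result"]

def flatten_transactions(block: dict) -> list[dict]:
    """Convert hex-encoded RPC fields into typed, analysis-friendly rows."""
    return [
        {
            "block_number": int(block["number"], 16),
            "block_timestamp": int(block["timestamp"], 16),
            "tx_hash": tx["hash"],
            "from_address": tx["from"],
            "to_address": tx.get("to"),          # None for contract creations
            "value_wei": int(tx["value"], 16),
        }
        for tx in block["transactions"]
    ]

if __name__ == "__main__":
    rows = flatten_transactions(fetch_block(19_000_000))
    print(rows[:2])
```

A production pipeline would add batching, retries, and schema validation on top of this, but the core job of turning raw hex-encoded RPC responses into clean tables is the same.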

What we look for in you (i.e., job requirements):

  • Blockchain data experience: You should know about blocks, transactions, token transfers, contract calls, account and UTXO models, labeling, etc. A deep understanding of blockchains is a must. You will likely have used tools like Dune and Arkham before.
  • Python: Must be adept at coding in Python; Go is a plus. Significant experience with one or more modern languages is essential, particularly for data-oriented tasks.
  • SQL: Must have expert SQL experience for querying, transformation, and performance optimization. Experience with other database types, such as graph databases and key-value stores, is a plus.
  • Data Compute Frameworks: Experience with data compute frameworks such as Spark, Flink, and Beam is a plus.
  • ETL/ELT Processes: Experience in designing, building, and optimizing ETL/ELT data pipelines to process large datasets. Experience with both batch and streaming is a plus.
  • Apache Airflow: Experience in building, deploying, and optimizing DAGs in Airflow or a similar pipeline orchestration tool (a minimal DAG sketch follows this list).
  • Data Visualization: Experience with tools like Superset, Hex, Looker, or Python visualization libraries (Matplotlib, Seaborn, Plotly, etc.).
  • Data Modeling: Understanding of best practices for data modeling, including star schemas, snowflake schemas, and data normalization techniques. Experience with modeling for blockchain data or financial data is a plus.
  • Collaboration and Communication: Ability to work closely with data scientists, analysts, and other stakeholders to translate business requirements into technical solutions. Strong documentation skills for pipeline design and data flow diagrams.
  • Fundamental DevOps Practices: Knowledge of unit testing, CI/CD, Git repository management, Docker, Kubernetes, etc.
  • Prompt Engineering for LLMs: Expertise in leveraging LLMs and AI tools in your workflow, knowing where and how to apply them. Experience with embeddings and vector search is a plus.
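For the Airflow requirement above, here is a minimal TaskFlow-style DAG sketch showing the kind of extract/transform/load chain the role involves. The task bodies, names, and schedule are placeholders, not a description of Base's actual orchestration.

```python
# Minimal Airflow TaskFlow sketch: a daily extract -> transform -> load chain.
# Task bodies, names, and the schedule are placeholders, not an actual Base pipeline.
from datetime import datetime

from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["blockchain"])
def onchain_daily():
    @task
    def extract() -> list[dict]:
        # e.g. pull raw blocks/transactions from an RPC endpoint or object storage
        return [{"block_number": 1, "tx_count": 150}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # e.g. cast hex fields, dedupe, and conform rows to the data mart schema
        return [{**r, "is_empty_block": r["tx_count"] == 0} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # e.g. write to the warehouse table backing self-serve dashboards
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))

onchain_daily()
```

The same structure extends to the batch and streaming ETL/ELT work listed above: each task stays small and idempotent so that data quality checks can gate downstream loads.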


Pay Transparency Notice: Depending on your work location, the target annual salary for this position can range as detailed below. Full time offers from Coinbase also include target bonus + target equity + benefits (including medical, dental, vision and 401(k)). Pay Range: $207,000–$244,000 USD

Please be advised that each candidate may submit a maximum of four applications within any 30-day period. We encourage you to carefully evaluate how your skills and interests align with Coinbase's roles before applying.

Commitment to Equal Opportunity

Coinbase is committed to diversity in its workforce and is proud to be an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, sex, gender expression or identity, sexual orientation or any other basis protected by applicable law. Coinbase will also consider for employment qualified applicants with criminal histories in a manner consistent with applicable federal, state and local law. For US applicants, you may view the Pay Transparency, Employee Rights, and Know Your Rights notices by clicking on their corresponding links. Additionally, Coinbase participates in the E-Verify program in certain locations, as required by law.

Coinbase is also committed to providing reasonable accommodations to individuals with disabilities. If you need a reasonable accommodation because of a disability for any part of the employment process, please send an e-mail to accommodations[at]coinbase.com and let us know the nature of your request and your contact information.  For quick access to screen reading technology compatible with this site click here to download a free compatible screen reader (free step by step tutorial can be found here).

Global Data Privacy Notice for Job Candidates and Applicants

Depending on your location, the General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) may regulate the way we manage the data of job applicants. Our full notice outlining how data will be processed as part of the application procedure for applicable locations is available here. By submitting your application, you are agreeing to our use and processing of your data as required. For US applicants only, by submitting your application you are agreeing to arbitration of disputes as outlined here.    

 
