Senior Data Engineer (Remote, US)

United States

Sayari

Get instant access to public records, financial intelligence, and structured business information on over 455 million companies worldwide.

About Sayari

Sayari is a venture-backed and founder-led global corporate data provider and commercial intelligence platform, serving financial institutions, legal and advisory service providers, multinationals, journalists, and governments. Thousands of analysts and investigators in over 30 countries rely on our products to safely conduct cross-border trade, research front-page news stories, confidently enter new markets, and prevent financial crimes such as corruption and money laundering.

Our company culture is defined by a dedication to our mission of using open data to prevent illicit commercial and financial activity, a passion for finding novel approaches to complex problems, and an understanding that diverse perspectives create optimal outcomes. We embrace cross-team collaboration, encourage training and learning opportunities, and reward initiative and innovation. If you like working with supportive, high-performing, and curious teams, Sayari is the place for you.

Position Description

Sayari provides instant access to structured business information from hundreds of millions of corporate, legal, and trade records for a variety of use cases. As a member of Sayari's data team, you will work with our Product and Software Engineering teams to build the graph that underlies Sayari's products.

Job Responsibilities

  • Build and maintain ETL pipelines to process and export record data to the Sayari Graph application
  • Develop and improve entity resolution processes
  • Implement logic to calculate and export risk information
  • Work with the product team and other development teams to collect and refine requirements
  • Run and maintain regular data releases

Required Skills & Experience

  • Expertise with Python and a JVM programming language (e.g., Scala)
  • Expertise with SQL (e.g., Postgres) and NoSQL (e.g., Cassandra, Elasticsearch, Memgraph, etc.) databases
  • 7+ years of experience designing, maintaining, and orchestrating ETL pipelines (e.g., Apache Spark, Apache Airflow) in cloud-based environments (e.g., GCP, AWS, or Azure)

Desired Skills & Experience

  • Experience with entity resolution, graph theory, and/or distributed computing
  • Experience with Kubernetes
  • Experience working as part of an agile development team using Scrum, Kanban, or similar

Benefits

  • A collaborative and positive culture: your team will be as smart and driven as you
  • Limitless growth and learning opportunities
  • A strong commitment to diversity, equity, and inclusion
  • Performance and incentive bonuses
  • Outstanding competitive compensation and comprehensive family-friendly benefits, including full healthcare coverage plans, commuter benefits, 401K matching, generous vacation, and parental leave
  • Conference and continuing education coverage
  • Team-building events and opportunities