Data Engineer

São Paulo

Avra

Discover how Avra transforms credit management and risk analysis for SMEs with advanced artificial intelligence technology. Integrate data, monitor in real time, and make informed decisions to drive your business.



About Avra

Avra is a deep tech data intelligence platform powered by foundational AI that translates the complexity of SMEs into strategic decisions for large enterprises. We develop our own foundational models from the ground up—without relying on third-party solutions—to deliver innovative insights that empower some of the leading banks and fintechs across Latin America. Founded in 2024 by Bruno Alano (ex-OpenAI) and Viviane Meister, our team brings together expertise from NVIDIA, Palantir, Google, and more to drive real impact.

About the Role

At Avra, we’re building a data platform from the ground up — one that will serve as a trusted foundation for both our products and the AI models we develop. As a Senior Data Engineer, you’ll play a key role in designing and delivering this infrastructure, ensuring that critical data flows reliably, securely, and with full governance across systems and teams.

Our challenge goes far beyond ingestion and transformation: we’re focused on creating a robust data foundation that supports multiple consumers — from product teams to research — each with different requirements for experimentation, performance, and traceability. This mission involves making long-term architectural decisions while also shipping tactical solutions that deliver immediate value.

You’ll be responsible for designing, evolving, and maintaining the core components of our data platform, working closely with engineers, data scientists, and technical stakeholders to ensure our data is always ready for use — with quality, context, and full lineage.

Responsibilities

  • Data Lakehouse Architecture: Design and manage scalable Iceberg-based table structures on AWS and GCP, enabling time travel, versioning, and high-performance analytics.

  • Pipeline Engineering: Develop and maintain batch and streaming ETL pipelines using Spark, Ray, AWS Glue, Kinesis, and Airflow (MWAA).

  • Graph Management: Maintain and update our heterogeneous relationship graph.

  • Infrastructure as Code: Use Terraform and CI/CD to provision, deploy, and version control the entire data stack — from S3 buckets and IAM roles to Glue Jobs and DataSync.

  • Architecture Design: Develop robust data architectures that facilitate efficient storage, retrieval, and processing of high-volume data.

  • Best Practices: Implement and promote data quality, security, and scalability standards across all engineering practices.

  • Cross-Functional Alignment: Partner with various teams to ensure data engineering initiatives align with our strategic vision.
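To give a flavor of the Infrastructure as Code work described above, here is a minimal, illustrative Terraform sketch of provisioning an S3 bucket, an IAM role, and a Glue job together. All resource and bucket names are hypothetical placeholders, not Avra's actual infrastructure:

```hcl
# Illustrative sketch only — names and script paths are hypothetical.

resource "aws_s3_bucket" "lakehouse" {
  bucket = "example-lakehouse-raw" # hypothetical bucket name
}

resource "aws_iam_role" "glue_job" {
  name = "example-glue-job-role" # hypothetical role name
  # Allow the Glue service to assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "glue.amazonaws.com" }
    }]
  })
}

resource "aws_glue_job" "etl" {
  name     = "example-iceberg-etl" # hypothetical job name
  role_arn = aws_iam_role.glue_job.arn
  command {
    name            = "glueetl"
    script_location = "s3://example-lakehouse-raw/scripts/etl.py"
  }
}
```

In practice, a setup like this would be versioned in Git and applied through a CI/CD pipeline, so the entire data stack — buckets, roles, and jobs — is reproducible and reviewable.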

You Stand Out If

  • You have experience building lakehouse architectures with Apache Iceberg or Delta Lake.

  • You’re comfortable working with Spark and Ray for large-scale data transformation and ML processing.

  • You’ve designed and deployed workflows on Airflow (especially MWAA) and can manage pipeline infrastructure with Terraform.

  • You understand how to structure data projects for multi-cloud environments (AWS & GCP).

  • You have a product mindset and care about enabling downstream users with clean, accessible, and trustworthy data.

  • You thrive in a collaborative, fast-paced, and innovative deep tech startup environment.

  • You are proactive, detail-oriented, and have a strong analytical mindset.

Qualifications

  • Experience: 5+ years in data engineering with a proven track record of deploying scalable data architectures.

  • Tech Stack: Proficiency with Python and SQL. Familiarity with Spark, Ray, Glue, Airflow, Iceberg, Terraform, and AWS (S3, IAM, MWAA, Kinesis, etc.).

  • Data Expertise: Strong understanding of data modeling, ETL processes, and real-time data processing.

  • Methodologies: Familiarity with CI/CD pipelines and modern software development practices.

  • Collaboration: Excellent communication skills and the ability to work effectively within a remote, cross-functional team.

Why Join Avra?

  • Competitive Compensation: Attractive salary, equity participation, and full transparency in our compensation structure.

  • Impactful Work: Directly contribute to a platform that empowers strategic decisions for large enterprises.

  • Innovative Environment: Tackle challenging, cutting-edge data engineering projects in a deep tech startup used by leading banks and fintechs in LATAM.

  • Flexible Culture: Enjoy 100% remote work with the option of using our São Paulo office, unlimited vacation, and a comprehensive benefits package including a national health plan and generous parental leave.

If you are passionate about crafting robust data solutions and ready to drive innovation in a deep tech environment, we’d love to hear from you. Apply now to join Avra and help shape the future of data intelligence in Latin America.



