Senior/Staff Software Engineer (Data Platform Emulators)

Barcelona, Catalonia, Spain - Remote

We are a young, fast-growing startup building cutting-edge technology to revolutionize cloud development processes and support highly efficient dev and test feedback loops. At its core, LocalStack provides a high-fidelity emulator and local cloud development platform - imagine developing cloud applications and data pipelines entirely on your local machine within a lightweight cloud sandbox running in Docker. Our mission is to empower developers to rapidly build and test their cloud applications, making for a more enjoyable dev experience and saving valuable time and resources.

LocalStack has a large and active open source community (51k+ stars on GitHub) with several hundred thousand active users worldwide and 250M+ downloads to date. With a growing international customer base across Fortune Global 500 companies for our advanced enterprise offerings, as well as a growing, globally distributed team of top-notch engineers and GTM experts, we are on an exciting growth journey to become the world’s leading platform for local cloud software development.

LocalStack is headquartered in Zurich, Switzerland, with a development office in Vienna, Austria, and remote team members from around the world (incl. US, FR, UK, IN, IT, MX, IE).

Requirements

We are looking for a lead software engineer who can spearhead the development of our data platform emulators. LocalStack provides several localized versions of cloud-based data platforms, allowing our users to run their data pipelines directly on the local machine (or in CI pipelines), without requiring any connectivity to the real cloud. Among others, we provide support for AWS Athena, AWS RDS/Redshift, and AWS Glue, and more recently a first version of a Snowflake emulator running entirely locally, in Docker.

With our work at LocalStack we’re filling an important gap in the industry - most vendors at this point do not bother providing a fully local dev experience, yet there is huge demand from the developer community to work with local tooling, take the data infrastructure with them on their laptop, and even work offline. The Snowflake emulator is currently in private beta, and so far there’s been great excitement among the beta customers who are evaluating this early version, which we’re now building out further.

The ground-breaking solutions we’re implementing involve several exciting challenges, chief among them making the functionality of powerful data platforms available on the local machine in a lightweight and performance-optimized manner. Often this includes configuring and running data processing products like Presto/Trino, Apache Spark, or a PostgreSQL server on the local machine or in Docker, and then adding the required configuration, glue code and integrations to provide an API that closely resembles the behaviour of the real cloud system. We are developing innovative solutions to customize and extend the database systems we’re building upon, for example creating custom plugins, extension functions, or SQL transformers that transpile queries from a source dialect into a target dialect. To ensure that our implementation provides maximum parity with the real system, we employ a mechanism called snapshot testing that helps us create high-fidelity integration tests that systematically cover the entire API surface area.
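
To give a rough idea of what snapshot testing means in practice, here is a minimal Python sketch. It is not LocalStack's actual implementation: the helper names (`normalize`, `match_snapshot`), the response shape, and the list of volatile keys are all hypothetical and chosen purely for illustration. The idea is to record a response from the real service once, then check that the emulator's response matches the recorded snapshot after run-specific values have been masked.

```python
import json
import tempfile
from pathlib import Path


def normalize(response: dict, volatile_keys=("request_id", "timestamp")) -> dict:
    """Mask values that legitimately differ between runs (hypothetical key names)."""
    return {
        key: "<volatile>" if key in volatile_keys else value
        for key, value in response.items()
    }


def match_snapshot(name: str, response: dict, snapshot_dir: Path, record: bool = False) -> bool:
    """In record mode, store the response; otherwise compare it to the stored snapshot."""
    path = snapshot_dir / f"{name}.json"
    normalized = normalize(response)
    if record:
        path.write_text(json.dumps(normalized, indent=2, sort_keys=True))
        return True
    return json.loads(path.read_text()) == normalized


# Hypothetical responses: one captured from the real service, one from the emulator.
real_response = {"request_id": "abc-123", "columns": ["ID", "NAME"], "rows": [[1, "a"]]}
emulator_response = {"request_id": "xyz-789", "columns": ["ID", "NAME"], "rows": [[1, "a"]]}

with tempfile.TemporaryDirectory() as tmp:
    snapshot_dir = Path(tmp)
    # Step 1: record the snapshot against the real system.
    match_snapshot("select_basic", real_response, snapshot_dir, record=True)
    # Step 2: verify that the emulator reproduces the recorded behaviour.
    parity_ok = match_snapshot("select_basic", emulator_response, snapshot_dir)
    # A response with differently cased column names no longer matches the snapshot.
    drifted = {**emulator_response, "columns": ["id", "name"]}
    drift_ok = match_snapshot("select_basic", drifted, snapshot_dir)
```

In this sketch the request ID differs between the two systems but is masked, so the emulator response still matches, while a change in column naming is flagged as a parity gap.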

As our customer base is growing and we’re seeing an increasing influx of users who are eager to leverage additional features, we are looking for a strong lead to take our data platform emulators to the next level. You’ll be working with a top-notch team of highly motivated and exceptionally skilled individuals who are all contributing towards our shared vision of providing the best local cloud development experience out there.

Responsibilities:

  • Drive and co-own the development of our Snowflake emulator - this is a brand-new product we’re bringing to the market, with a sizeable number of interested beta users already evaluating the private beta version.
  • Reverse-engineer data platform APIs and queries to reproduce the behaviour locally based on off-the-shelf database products and tools (e.g., Postgres, Trino).
  • Write comprehensive unit and integration tests to ensure our implementation is on par with the real system.
  • Conduct technical spikes to evaluate new tools and technologies, and document the process and insights behind all key architectural decisions.
  • Integrate suitable open source tools into our solutions, and contribute back to open source projects.
  • Maintain documentation about the internal implementation details, as well as technical roadmaps for new features with milestones and basic effort estimations.
  • Conduct performance evaluations and apply optimizations to the ongoing implementation where applicable (e.g., lazy evaluation, operator pruning, parallel processing).
  • Run internal demos and knowledge sharing sessions to educate the team on the developments in the data platform emulators space.
  • Communicate directly with our customers in different support channels to understand their requirements and use cases, create reproducible samples, and help them resolve any technical issues.
  • Work with our Data team to embed analytics into the product and ensure that we gain insights into common usage patterns and edge cases, learn how we can improve the product, and proactively detect user issues early on.

Qualifications:

  • Strong hands-on experience with modern Python development (type hinting, unit/integration testing with pytest, object-oriented software design).
  • Strong background in data processing, fundamentals in operating systems, and systems programming in Unix environments.
  • Strong understanding of SQL, including the different types of queries (DDL/DML/DQL/DCL/DTL) and the syntactic differences between SQL dialects.
  • Strong understanding of sessions and transactions in relational databases, transaction isolation levels, managing remote and potentially distributed transactions.
  • Strong proficiency with PostgreSQL, including running and configuring Postgres servers and writing custom functions in different procedural languages (PL/pgSQL, PL/Python, PL/v8).
  • Experience with cloud computing APIs and platforms like AWS or Azure.
  • Ideally hands-on experience working with data platforms like Snowflake and/or AWS Athena, Glue, Redshift.
  • Experience with SQL parsing, query AST modification libraries like sqlglot or others.
  • Decent knowledge of Java and the ecosystem of open-source big data platforms, including Presto, Hive, Hadoop, Spark, etc.
  • Prior experience contributing to open source projects on GitHub is a plus.
  • Experience with compiler technologies and lexers/parsers is a plus.

Benefits

  • Competitive salary and performance-based bonuses.
  • Opportunities for professional development and training.
  • Dynamic and collaborative work environment.
  • Flexible work arrangements.

If you are an experienced software engineer seeking an exciting new challenge, with a passion for data products and dev tooling and an exceptionally strong technical background, we'd love to hear from you! Join us in shaping the future of cloud development at LocalStack.

Tags: APIs Athena AWS AWS Glue Azure Data pipelines DDL Docker GitHub Hadoop Java Open Source Pipelines PostgreSQL Python RDBMS Redshift Snowflake Spark SQL Testing Transformers

Perks/benefits: Career development Competitive pay Flex hours Flex vacation Gear Startup environment

Regions: Remote/Anywhere Europe
Country: Spain
