Senior Data Engineer
Warsaw, Masovian Voivodeship, Poland - Remote
Sunscrapers
Meet Sunscrapers - an elite development shop from Warsaw that combines custom software, data engineering, and cloud to help forward-thinking companies win their games. Advance your career with Sunscrapers, a leading force in software development, now expanding its presence in a data-centric environment. Join us in our mission to help clients grow and innovate through a comprehensive tech stack and robust data-related projects. Enjoy a competitive compensation package that reflects your skills and expertise while working at a company that values ambition, technical excellence, and trust-based partnerships, and that actively supports contributions to R&D initiatives.
The project:
We are carrying out this project for our client, an American private equity and investment management fund - listed on the Forbes 500 - based in New York.
We support them in the areas of infrastructure and the data platform, and have recently begun building and experimenting with Gen AI applications. The client operates broadly in the world of finance, loans, investments, and real estate.
As a Senior Data Engineer, you’ll design and implement core systems that enable data science and data visualization at companies using data-driven decision processes to create a competitive advantage.
You’ll build a data platform for data and business teams - including internal tooling, a data pipeline orchestrator, data warehouses, and more - using:
Technologies: Python, Terraform, SQL, Pandas, Shell scripts
Tools: git, Docker, Snowflake, Pinecone, Neo4j, Jenkins, Jupyter Notebook, OpenAI API, Apache Airflow / Astronomer, Kubernetes, Artifactory, Windows with WSL, Linux, Gitlab
AWS: EC2, ELB, IAM, RDS, Route53, S3, and more
Best Practices: Continuous Integration, Code Reviews
The ideal candidate will be well organized, eager to constantly improve and learn, driven and, most of all - a team player!
Your responsibilities will include:
- Developing PoCs using the latest technologies and experimenting with third-party integrations
- Delivering production-grade applications once PoCs are validated
- Creating solutions that enable data scientists and business analysts to be as self-sufficient as possible
- Finding new ways to leverage Gen AI applications and the underlying vector and graph data stores
- Designing datasets and schemas for consistency and easy access
- Contributing to data technology stacks, including data warehouses and ETL pipelines
- Building data flows for fetching, aggregation, and data modeling using batch and streaming pipelines
- Documenting design decisions before implementation
Requirements
What's important for us?
- 5+ years of professional experience in a data-related role
- Undergraduate or graduate degree in Computer Science, Engineering, Mathematics, or similar
- Expertise in Python and SQL
- Experience with data warehouses (Snowflake)
- Experience with different types of database technologies (RDBMS, vector, graphs, document based, etc.)
- Expertise in AWS stack and services
- Proficiency in using Docker
- Experience with infrastructure-as-code tools, like Terraform
- Great analytical skills and attention to detail - asking questions and proactively searching for answers
- Excellent command of spoken and written English (C1 or above)
- Creative problem-solving skills
- Excellent technical documentation and writing skills
- Ability to work with both Windows and Unix-like operating systems as the primary work environments
You will score extra points for:
- Experience integrating LLMs (OpenAI as well as others, including open-source models)
- Understanding of LLM fine-tuning, embeddings, and vector-based semantic search
- Experience with Pinecone or Neo4j
- Familiarity with data visualization in Python using Matplotlib, Seaborn, or Bokeh
- Proficiency in statistics and machine learning, as well as Python libraries like Pandas, NumPy, Matplotlib, Seaborn, scikit-learn, etc.
- Experience in building ETL processes and data pipelines with platforms like Airflow or Luigi
- Knowledge of any Python web framework, like Django or Flask with SQLAlchemy
- Experience operating within a secure networking environment, such as behind a corporate proxy
- Experience working with a repository manager, for example JFrog Artifactory
Benefits
What do we offer?
- Working alongside a talented team of software engineers who are changing the image of Poland abroad
- Culture of teamwork, professional development and knowledge sharing (https://www.youtube.com/user/sunscraperscom)
- Flexible working hours and remote work possibility
- Comfortable office in central Warsaw, equipped with all the necessary tools for conquering the universe (Macbook Pro/Dell, external screen, ergonomic chairs)
Sounds like a perfect place for you? Don’t hesitate to click apply and submit your application today!