Senior Data Engineer (with Python and PySpark)
Warsaw, Poland
Talan
We believe that only a humanistic practice of technology will make the new digital age an era of progress for all. Let's commit together.
Company Description
Talan is an international advisory group on innovation and transformation through technology, with 5,000 employees and a turnover of €600M.
We offer our customers a continuum of services to support them at each key stage of their organization's transformation, with 4 main activities:
- CONSULTING in management and innovation: supporting business, managerial, cultural, and technological transformations.
- DATA & TECHNOLOGY to implement major transformation projects.
- CLOUD & APPLICATION SERVICES to build or integrate software solutions.
- SERVICE CENTERS of EXCELLENCE to support the latter through technology, innovation, agility, sustainability of skills and cost optimization.
Talan accelerates its clients' transformation through innovation and technology. By understanding their challenges, with our support, innovation, technology and data, we enable them to be more efficient and resilient.
We believe that only a human-oriented practice of technology will make the new digital age an era of progress for all. Together, let's commit!
Job Description
Our client has started a greenfield project to build an in-house data lake solution. This will serve various BI services and projects as well as other Data Science tools and initiatives. It will ingest near real-time market data from multiple sources and store it centrally in the cloud (AWS). The scope and type of the data consumed, as well as the target consumers, will continue to expand.
Currently, the existing functionality includes the following:
• Ingestion of multiple data sources in batch / near real-time into shared storage.
• Storage decoupled from compute and data processing to provide increased scalability and performance while reducing storage costs.
• The data is compressed and partitioned using the Parquet data format, which improves performance while reducing the cost of data retrieval.
• Centralized metadata catalogue that maintains a single view of the data model for all its consumers.
• Integration with other AWS services (Athena, EMR, QuickSight, …) and third-party solutions (Apache Spark, Presto, Tableau, …).
The project is sponsored at high levels within the business and is a new initiative. Working in a small development team, the focus will be on delivering high-quality solutions for electronic trading and pricing workflows. The aim of the project is to build an agile team that works closely with the business sponsors to ensure a high-quality platform that delivers on the needs of a global trading business.
Data Engineer / Data Scientist
As a senior developer, you will be experienced in working with big data ingestion and large data sets in general. You will be responsible for creating and maintaining high-quality ETLs.
Job Responsibilities / Role:
• Take responsibility for the software delivery by ensuring quality and scope expectations are met.
• Contribute to and take ownership of the technical design and ensure all aspects of the system architecture are well documented.
• Work closely with partner technology teams and collaborate effectively.
Qualifications
Candidates must have the technical skills listed below, and in addition, have worked within a data team in the last 5–10 years. History of role stability is preferred.
Technical Skills Required:
• Very deep understanding of Python, PySpark, Pandas, JupyterLab (working with notebooks).
• Experience in SQL, Hive, Hadoop.
• Experience using the AWS platform.
• Solid experience with continuous integration and continuous delivery tools such as Git and Jenkins.
• Agile development/Software lifecycle.
Nice to have Skills:
• Experience with Kafka.
• Specifically, experience using EMR (Elastic MapReduce) in AWS to run Spark clusters.
• Knowledge of Terraform.
• Experience with Ansible, Bash scripting, boto3.
• Experience configuring continuous integration and continuous delivery tools.
Additional Information
What do we offer you?
- Permanent, full-time contract
- Training and career development
- Benefits and perks such as private medical insurance, lunch pass card, MultiSport Plus card
- Possibility to be part of a multicultural team and work on international projects
- Hybrid position based in Warsaw, Poland
- Possibility to manage work permits