Senior Data Lake Developer

Lima, Peru

Talan

We believe that only a humanistic practice of technology will make the new digital age an era of progress for everyone. Let's commit to it together.



Company Description

Join Us!

Why Talan?

For the 4th consecutive year, Talan Spain has been recognized as a Great Place to Work! 🎉 This year, we’re also celebrating our 2nd certification in Poland, a significant milestone since opening our office there.

Talan is an international advisory group specializing in innovation and transformation through technology, with 5,000 employees and an annual turnover of 600M€.

We offer our customers a continuum of services to support them at each key stage of their organization's transformation, with four main activities:

  • CONSULTING in Management and Innovation: Supporting business, managerial, cultural, and technological transformations.
  • DATA & TECHNOLOGY: Implementing major transformation projects.
  • CLOUD & APPLICATION SERVICES: Building or integrating software solutions.
  • SERVICE CENTERS of EXCELLENCE: Providing technology, innovation, agility, skills sustainability, and cost optimization.

Talan accelerates its clients' transformation through innovation and technology. By understanding their challenges and supporting them with technology, innovation, and data, we enable them to be more efficient and resilient.

Job Description

Our client has started a greenfield project to build an in-house data lake solution, named Origin. This will serve various BI services and projects, as well as other data science tools and initiatives within CIB. Origin will ingest near real-time market data from multiple sources and store it centrally on the cloud (AWS). The scope and type of the data consumed, and its target consumers, will continue to expand.

Currently, the existing functionality includes the following:

  • Ingestion of multiple data sources, in batch and near real-time, into shared storage.
  • Storage decoupled from compute and data processing, providing increased scalability and performance while reducing storage costs.
  • Data compressed and partitioned using the Parquet format, which improves performance while reducing the cost of data retrieval (see the sketch after this list).
  • Centralized metadata catalogue that maintains a single view of the data model for all its consumers.
  • Integration with other AWS services (Athena, EMR, QuickSight) and third-party solutions (Apache Spark, Presto, Tableau).
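
As a rough illustration of the storage pattern above, here is a minimal PySpark sketch. Everything in it (bucket paths, column names, application name) is hypothetical and not taken from the Origin project itself:

    # Minimal sketch of the pattern above: an ingested batch of market
    # data written to S3 as compressed, partitioned Parquet, with storage
    # decoupled from the compute that later queries it.
    # All bucket paths and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ingest-sketch").getOrCreate()

    # Read one hypothetical batch of raw market data.
    raw = spark.read.json("s3://example-raw/market-data/")

    # Write Snappy-compressed Parquet, partitioned so that consumers
    # (Athena, EMR/Spark, Presto) can prune partitions and scan only
    # the data they need.
    (raw.write
        .mode("append")
        .partitionBy("trade_date", "source")
        .option("compression", "snappy")
        .parquet("s3://example-lake/market-data/"))

Consumers would then query these files in place, for example via the centralized metadata catalogue and Athena, without copying the data out of the lake.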

The project is a new initiative, sponsored at high levels within the business. You will work in a small development team focused on delivering high-quality solutions for electronic trading and pricing workflows. The aim is to build an agile team that works closely with the business sponsors to ensure a high-quality platform that delivers on the needs of a global trading business.

Your Role

As a senior developer, you will have experience with big data ingestion and with large data sets in general. You will be responsible for creating and maintaining high-quality ETLs. This is an exciting opportunity to build a new greenfield solution, working as part of a small team, to help the business improve its market-making capabilities within the fixed-income space. The platform itself is housed within the AWS ecosystem.

Job Responsibilities:

  • Take responsibility for the software delivery by ensuring quality and scope expectations are met.
  • Contribute to and take ownership of the technical design, and ensure all aspects of the system architecture are well documented.
  • Work closely with partner technology teams and collaborate effectively.
  • In addition to the technical skills listed below, candidates must have worked within a data team in the last 5–10 years.

Qualifications

Education:

  • Bachelor's degree in Computer Science, Information Technology, or a related field, or substantial practical experience of software delivery at an advanced level.

Technical Skills Required:

  • Deep understanding of Tableau.
  • Experience with SQL, Hive, and Hadoop.
  • Experience with Python, PySpark, Pandas, and JupyterLab (working with notebooks).
  • Experience using the AWS platform.
  • Experience with continuous integration and continuous delivery tools such as Git and Jenkins.
  • Experience with Agile development and the software life cycle.
  • Excellent interpersonal and communication skills in English (B2+).

Nice to have Skills:

  • Experience with Kafka
  • Experience using EMR (Elastic MapReduce) in AWS, specifically to run Spark clusters.
  • Knowledge of Terraform
  • Experience with Ansible, Bash scripting, boto3
  • Experience configuring continuous integration and continuous delivery tools.

Soft Skills:

  • Energetic, motivated, and determined.
  • Pragmatic and results-oriented.
  • Adaptable to a diverse set of technical responsibilities.
  • Excellent analytical and problem-solving skills.
  • Productive and able to manage time effectively.
  • Strong written and verbal communication skills.

Additional Information

What do we offer you?

  • Full-time contract.
  • Training and career development opportunities.
  • Be part of a multicultural team working on international projects.

If you have read this far and you are looking forward to joining this challenge, do not hesitate to apply... we would be delighted to meet you!

Category: Engineering Jobs

Tags: Agile Ansible Architecture Athena AWS Big Data Computer Science Consulting ETL Git Hadoop Jenkins Kafka Map Reduce Pandas Parquet PySpark Python QuickSight Spark SQL Tableau Terraform

Perks/benefits: Career development Team events

Region: South America
Country: Peru
