Data Engineer - Technology Platforms Insights Group, Ecosystem Platform Supervisory Department (EPSD)

Rakuten Crimson House, Japan

Rakuten




Job Description:

Business Overview

Ecosystem Platform Supervisory Department (EPSD) is part of the Technology Platforms Division of Rakuten Group. We build scalable platforms that empower the Rakuten Ecosystem globally. Our mission is to deliver an innovative, accessible, and stable Ecosystem Platform by fostering a culture of ownership and data-driven decision-making.

 

Department Overview

The Technology Platform Insights Group (TPIG) is part of the Customer Support Section (CSS) of the EPSD. We find and implement business ideas that help expand Rakuten's ecosystem by analyzing platform service data.

TPIG's mission is to enable stakeholders to make well-informed business and technology decisions by providing data-backed insights.

Position:

Why We Hire

We are looking for an ambitious, experienced, and skilled Data Engineer who is willing to take on new challenges and eager to learn new technologies and technical methodologies. Key responsibilities include designing, optimizing, and advancing our data architecture to support our business objectives.

 

Position Details

Designing, building, and maintaining scalable data pipelines and architectures, including:

- Lead, and take a hands-on role in, the design, build, and implementation of advanced, scalable cloud-based data warehouse solutions; optimize data pipelines and manage large-scale cloud architectures.

- Develop and implement data governance frameworks and policies to ensure data quality, security, and compliance.

- Communication: Collaborate with cross-functional teams to identify data requirements and align them with business objectives.

- Management: Participate in project planning, estimation, and execution to ensure successful delivery of data-centric initiatives. Provide technical guidance and mentorship to the data engineering team members. Oversee data platform and drive data-driven strategies to support business goals and innovation.

- Documentation: Maintain comprehensive documentation (platform system architecture, data dictionary & infrastructure, configurations, processes, & so on).

- Stay up to date with the latest advancements in data management, cloud technologies, and project delivery methodologies.

- Data warehouse platform migration experience (for example, migrating from Hadoop to GCP) is a plus.
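To make the pipeline responsibilities above concrete, here is a minimal, purely illustrative batch ETL sketch in plain Python. It is not Rakuten's actual stack (in production the data would flow through services such as Cloud Storage and BigQuery); the dataset, field names, and transform rule are hypothetical:

```python
import csv
import io

# Hypothetical raw export: order records as CSV. In a real pipeline this
# would be read from object storage or a message queue, not a string.
RAW_CSV = """order_id,amount_jpy,country
1001,2500,JP
1002,,JP
1003,1800,US
"""

def extract(raw: str) -> list[dict]:
    """Parse the raw CSV export into row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> list[dict]:
    """Drop rows with missing amounts and cast types (a simple data-quality gate)."""
    cleaned = []
    for row in rows:
        if not row["amount_jpy"]:
            continue  # skip incomplete records instead of loading bad data
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount_jpy": int(row["amount_jpy"]),
            "country": row["country"],
        })
    return cleaned

def load(rows: list[dict]) -> dict[str, int]:
    """Aggregate into a tiny 'data mart': total sales per country."""
    mart: dict[str, int] = {}
    for row in rows:
        mart[row["country"]] = mart.get(row["country"], 0) + row["amount_jpy"]
    return mart

mart = load(transform(extract(RAW_CSV)))
print(mart)  # {'JP': 2500, 'US': 1800}
```

The same extract/transform/load shape scales up when each stage is swapped for a managed service and the stages are scheduled as DAG tasks.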

End-to-end (E2E) development of data visualization reports, including:

- Collaborate with Stakeholders: Work closely with business teams to understand requirements, gather feedback, and deliver customized reports that meet organizational goals.

- Data Analysis and Storytelling: Analyze data to identify trends, patterns, and insights that fulfill business needs.

- Solution Design: Propose solutions and present them to relevant stakeholders.

- Data Modeling: Design data models and reporting data marts.

- Data Pipeline: Develop, maintain, and optimize data pipelines, including DAG scheduling, data ingestion mechanisms, and ETL processes, to ensure data is collected and delivered efficiently and reliably.

- Ensure Data Accuracy: Validate and ensure the accuracy, consistency, and reliability of data used in visualizations and reports.

- Frontend: Design, build, and manage reports using data visualization tools.
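The "Ensure Data Accuracy" duty above often takes the form of a reconciliation check: recompute an aggregate directly from the source and diff it against what the report shows. A minimal sketch, with hypothetical dates and metrics:

```python
# Hypothetical source rows and a report aggregate supposedly derived from them.
source_rows = [
    {"date": "2024-06-01", "clicks": 120},
    {"date": "2024-06-01", "clicks": 80},
    {"date": "2024-06-02", "clicks": 95},
]
report = {"2024-06-01": 200, "2024-06-02": 95}

def reconcile(rows, report):
    """Recompute per-date totals from the source and diff them against the report.

    Returns {date: (expected, reported)} for every date where the two disagree;
    an empty dict means the report is consistent with the source.
    """
    expected = {}
    for row in rows:
        expected[row["date"]] = expected.get(row["date"], 0) + row["clicks"]
    return {
        d: (expected.get(d), report.get(d))
        for d in set(expected) | set(report)
        if expected.get(d) != report.get(d)
    }

print(reconcile(source_rows, report))  # {} -> report matches the source
```

In practice the same check runs as a scheduled pipeline task, with discrepancies raised as alerts rather than printed.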

Mandatory Qualifications:

- B.S. in Computer Science or in related fields, or equivalent education and experience.

- 5+ years of end-to-end data engineering experience with SQL, plus operational experience with cloud-based or Linux systems.

- 3+ years of experience developing and building robust data warehouse ecosystems using Google Cloud Platform (GCP), Python, and shell scripting. Candidates with less than 3 years but two or more GCP project implementations are also welcome.

- Proven experience as a Data Architect or in a similar role, with a strong data architecture design and implementation background.

- Deep understanding of data warehouse and big data concepts, data modeling, batch-processing data pipelines (ETL), and the data integration techniques and tools used to process, store, analyze, and manage large volumes of data.

- ETL experience with any of the following stacks: Pub/Sub, Cloud Storage, BigQuery, and Dataflow; Glue, Lambda, and Redshift; Data Factory, Databricks, and Synapse Analytics; Snowflake; or Airflow.

- Hands-on experience creating and leveraging data visualization platforms such as Domo, Power BI, or Tableau.

- Experience with Git and deployment pipelines built on systems such as Jenkins, Ansible, or Concourse.

Desired Qualifications:

Technical skills:

  • Experience working with data or statistical analytics.
  • Experience working on streaming data pipelines built on subsystems such as Apache NiFi, Apache Kafka, or Spark Streaming.
  • Experience managing systems with high service level objectives/agreements (SLOs/SLAs).

Soft skills:

  • Effective Communication Skills
  • Problem-Solving / Solution Creation
  • Attention to Detail
  • Customer-Centric Approach
  • Project Management
  • Can-Do Attitude (willing to take on new challenges)
  • Leadership
  • Teamwork and Collaboration
  • Adaptability & Continuous Learning

Additional bonus points:

  • Experience with "Big Data" systems (Hadoop, Hive, Spark, HDFS, Presto) and with containerization and container orchestration such as Docker and Kubernetes.
  • Experience with web application backend systems composed of load balancers, Apache Web Server, MySQL/PostgreSQL, and similar components.
  • Experience participating in open-source projects, data analysis competitions, or hackathons.

#dataengineer #googlecloudplatform #GCP #ETL #datawarehouse #datamodeling #bigdata #datainsights

