Platform Engineer - Merchandising and Advertisement Department (MAD)
Rakuten Crimson House, Japan
Rakuten
Job Description:
Business Overview
The AI & Data Division (AIDD) creates powerful, customer-focused search, recommendation, data science, advertising, marketing, price and inventory optimization solutions for a variety of businesses in commerce industries. We design, develop, and deploy high-performance, fault-tolerant distributed systems used by millions of Rakuten customers every day. We strive to deliver the most innovative solutions that are helpful to people and societies around the world.
Department Overview
The Merchandising and Advertisement Department (MAD) is a dynamic, cross-functional team made up of talented individuals from around the globe. Our mission is to enhance the Rakuten experience for millions of consumers, merchants, advertisers, and partners worldwide. We specialize in cutting-edge solutions, including personalized recommendations, search advertising, retargeting, and incentive optimization. By harnessing our scale, data, and advanced machine learning algorithms, we strive to create innovative, optimized experiences that benefit society both online and offline.
Position:
Position Details
We are looking for a talented Platform Engineer with a strong passion for DevOps, DevSecOps, and MLOps engineering. The ideal candidate will have hands-on experience with technologies such as Kubernetes (k8s), Docker, Terraform, Prometheus, CephFS, JupyterHub, and Python. In this role, you will leverage your expertise to configure and maintain our Kubernetes-based platform, manage CephFS-based distributed storage systems, and enhance in-house frameworks. If you enjoy working in a collaborative environment and are excited by the prospect of solving complex challenges, we’d love to have you join our team!
Key Responsibilities:
Contribute to Platform: Design, code, test, release, and maintain various components of the platform.
Collaboration and Delivery: Collaborate with software engineers and data scientists to address their requests and ensure timely delivery of software solutions.
Continuous Improvement: Proactively propose and implement system and process improvements, such as refactoring, adopting new technologies, and enhancing system architecture.
Cloud Infrastructure Management: Design, deploy, and manage cloud infrastructure on AWS, Azure, or Google Cloud. Optimize resource usage and control costs in the cloud environment.
Containerization & Orchestration: Implement and manage containerized applications using Docker and Kubernetes. Automate deployments and manage container orchestration to ensure scalability and availability.
Infrastructure as Code (IaC): Develop and maintain infrastructure using IaC tools like Terraform to ensure consistency in deployments and version control.
CI/CD Pipelines: Design and implement continuous integration and continuous deployment pipelines. Automate testing and deployment processes to streamline software delivery.
Automation & Scripting: Write scripts in Python or Bash to automate repetitive tasks, improving operational efficiency. Create tools to simplify the management and monitoring of applications and infrastructure.
Monitoring & Logging: Implement monitoring solutions such as Prometheus and Grafana to track system performance and health. Set up logging frameworks like the ELK stack for log analysis and issue diagnosis.
Documentation & Support: Document processes, configurations, and best practices for future reference and onboarding. Provide support and guidance to development teams on deployment and operational concerns.
Mandatory Qualifications:
Bachelor's Degree (BS) in Computer Science or a related field, or equivalent education and experience.
Knowledge of Linux administration (Red Hat/Ubuntu), including encryption (LUKS/X.509), scripting, monitoring, security, logging, networking, and SSH.
Over three years of proven experience as a Platform Engineer.
Familiarity with Kubernetes, including hands-on experience developing and deploying operators (Airflow, Flink, Spark) on it.
Experience working with database systems such as Couchbase, Redis, or Cassandra.
Familiarity with CI/CD tools such as Jenkins or Laminar.
Ability to work independently and as part of a team.
Strong communication and collaboration skills.
Desired Qualifications:
Familiarity with distributed filesystems such as CephFS, HDFS, Lustre, or BeeGFS is a plus.
Experience with Infrastructure as Code tools such as Chef, Ansible, Salt, Puppet, or Terraform is a plus.
Knowledge of cloud platforms like AWS, Azure, or Google Cloud Platform is a plus.
Certifications, such as Certified Kubernetes Administrator or Certified Kubernetes Application Developer, are a plus.
Knowledge of big data processing frameworks like Hadoop, Spark, Kafka, Flink, or Druid is a plus.
Languages:
English (Overall - 4 - Fluent)
Perks/benefits: Career development