Senior Platform Engineer - Pricing Platform

AME (Amsterdam - Maple)


Your day to day

-    Translate incident management and root cause analysis recommendations into improvement points for the platform.
-    Work with the latest (automation) tooling with a strong focus on performance, reliability, observability and security.
-    Define platform lifecycle management, resilience patterns, architecture and roadmaps together with solution, domain and enterprise architects.
-    Align expected platform changes with stakeholders and financial controllers, and report on platform volumes to the area lead.
-    Present platform and automation best practices to the team and at internal and external engineering events.
-    Report on the state of IT risk and security controls on the platform as per the ING Information Security Management Policy.
-    Apply CI/CD using Azure DevOps as well as remote operations on the platform.
-    Working in Agile/Scrum, collaborate with the other engineers to bring new sprint releases live to Acceptance and Production every 2-4 weeks.
-    Stay up to date with the latest developments in HPC and cloud technology, and participate in relevant workshops, conferences, and training programs.
-    Meet frequently with product managers, analysts and researchers to gather stakeholder feedback and use it to improve HPC services.

What you’ll bring to the team
Experience: 5+ years of software engineering / operations experience

Tech stack / knowledge:
Mandatory:

-    IT Operations/Support experience combined with analytical skills to identify root causes in incidents (data, technical, functional).
-    Strong understanding of high-performance computing environments, including HPC (GPU) clusters and parallel and distributed computing principles and techniques.
-    Strong understanding of using GPU technology as a computational accelerator, and proficiency in cluster management and job scheduling systems (e.g., DataSynapse, Slurm, PBS, LSF).
-    Knowledge of GPU architectures and technologies (e.g., NVIDIA CUDA, AMD ROCm).
-    Experience with deploying and maintaining parallel programming models and libraries (e.g., MPI, OpenMP, CUDA), middleware and supporting software.
-    Ability to identify and contribute to resolving performance bottlenecks in HPC applications via monitoring / observability practices.
-    Knowledge of CI/CD, experience with Git, Python, Ansible, Shell scripting and working experience with monitoring practices and alerting tools.
-    Strong Linux (RHEL 8 or 9) and Azure DevOps experience, pipeline and Ansible skills, and experience working with certificates and encryption technology.
-    Strong experience in translating computational requirements to IT concepts like system sizing.
-    Experience with Grafana and alerting tools such as Prometheus, as well as a strong understanding of complex subsystem monitoring and alerting.
-    Good understanding of the ELK Stack and how to interact with it.
-    Experience in mentoring junior engineers and providing technical guidance.
-    Thoroughness in testing and validating configurations, optimizations, system reliability, and performance.
-    Education at Master's level with a strong analytical background in Computer/Data Science, Cybernetics, Software Engineering, Financial Engineering or equivalent.
-    Due to the cross-border nature of IT teams at ING, we ask that English (advanced) is part of your skillset.

Nice to have:

-    Familiarity with Oracle 12c/19c with PL/SQL.
-    Experience with cloud-based HPC solutions (e.g., AWS, Azure) and understanding of hybrid HPC environments and cloud integration.
-    Affinity with (GPU) programming languages and frameworks (e.g., CUDA, OpenCL, PyTorch).
-    Familiarity with in-memory caching tools (e.g., Apache Ignite (GridGain), Redis).
-    Familiarity with shared storage configuration and design.
-    Programming skills in languages such as C, C++, and Java.
-    Familiarity with Docker and container orchestration (e.g., Kubernetes, OpenShift).
-    Good Linux networking skills.
 



Category: Engineering Jobs

Tags: Agile Ansible Architecture AWS Azure CI/CD CUDA Cybernetics DevOps Docker ELK Engineering Git GPU Grafana HPC Java Kubernetes Linux OpenMP Oracle Python PyTorch Scrum Security Shell scripting SQL Testing

Perks/benefits: Conferences Team events

Region: Europe
Country: Netherlands
