Senior Platform Engineer - Pricing Platform
AME (Amsterdam - Maple)
Applications have closed
Your day to day
- Translate incident management and root cause analysis recommendations into improvement points for the platform.
- Work with the latest (automation) tooling with a strong focus on performance, reliability, observability and security.
- Define platform lifecycle management, resilience patterns, architecture and roadmaps together with solution, domain and enterprise architects.
- Align expected platform changes with stakeholders and financial controllers, and report on platform volumes to the area lead.
- Present platform and automation best practices to the team and at internal and external engineering events.
- Report on the state of IT risk and security controls on the platform as per the ING Information Security Management Policy.
- Apply CI/CD using Azure DevOps as well as remote operations on the platform.
- Through Agile/Scrum, collaborate with the other engineers to bring new sprint releases live to Acceptance and Production every 2-4 weeks.
- You are committed to staying up to date with the latest developments in HPC and cloud technology, and participating in relevant workshops, conferences, and training programs comes naturally to you.
- You meet frequently with product managers, analysts and researchers to gather and incorporate stakeholder feedback to improve HPC services.
What you’ll bring to the team
Experience: 5+ years of software engineering / operations experience
Tech stack / knowledge:
Mandatory:
- IT Operations/Support experience combined with analytical skills to identify root causes in incidents (data, technical, functional).
- Strong understanding of high-performance computing environments, including HPC (GPU) clusters, and of parallel and distributed computing principles and techniques.
- Strong understanding of using GPU technology as a computational accelerator, and proficiency in cluster management and job scheduling systems (e.g., DataSynapse, Slurm, PBS, LSF).
- Knowledge of GPU architectures and technologies (e.g., NVIDIA CUDA, AMD ROCm).
- Experience with deploying and maintaining parallel programming models and libraries (e.g., MPI, OpenMP, CUDA), middleware and supporting software.
- Ability to identify and contribute to resolving performance bottlenecks in HPC applications via monitoring / observability practices.
- Knowledge of CI/CD, experience with Git, Python, Ansible, Shell scripting and working experience with monitoring practices and alerting tools.
- Strong Linux (RHEL 8 or 9), Azure DevOps pipeline and Ansible skills, and experience working with certificates / encryption technology.
- Strong experience in translating computational requirements into IT concepts like system sizing.
- Experience with Grafana and alerting tools like Prometheus, as well as a strong understanding of complex subsystem monitoring and alerting.
- Good understanding of the ELK Stack and how to interact with it.
- Experience in mentoring junior engineers and providing technical guidance.
- Thoroughness in testing and validating configurations, optimizations, system reliability and performance.
- Education at Master level with a strong analytical background in Computer/Data Science, Cybernetics, Software Engineering, Financial Engineering or equivalent.
- Due to the cross-border nature of IT teams at ING, we ask that English (advanced) is part of your skillset.
Nice to have:
- Familiarity with Oracle 12c/19c with PL/SQL.
- Experience with cloud-based HPC solutions (e.g., AWS, Azure) and understanding of hybrid HPC environments and cloud integration is a plus.
- Affinity with (GPU) programming languages and frameworks (e.g., CUDA, OpenCL, PyTorch) is a plus.
- Familiarity with in-memory caching tools (Apache Ignite (GridGain), Redis et al.).
- Familiarity with shared storage configuration and design.
- Programming skills in languages such as C, C++, and Java are a plus.
- Familiarity with Docker and container orchestration (Kubernetes, OpenShift et al.).
- Good Linux networking skills.
Tags: Agile Ansible Architecture AWS Azure CI/CD CUDA Cybernetics DevOps Docker ELK Engineering Git GPU Grafana HPC Java Kubernetes Linux OpenMP Oracle Python PyTorch Scrum Security Shell scripting SQL Testing
Perks/benefits: Conferences Team events