Senior Manager - Site Reliability Engineering (SRE - Big Data / Kafka)

Singapore

Visa

Visa's digital and mobile payment network is at the forefront of new payment technologies, including electronic and contactless payments, that shape the world of money

Company Description

Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose – to uplift everyone, everywhere by being the best way to pay and be paid.

Make an impact with a purpose-driven industry leader. Join us today and experience Life at Visa.

Job Description

Essential Functions:

  • Design, build, and manage Big Data and Kafka infrastructure on private cloud, AWS, GCP, and Azure.

  • Manage and optimize Apache Big Data and Kafka clusters for high performance, reliability, and scalability.

  • Develop tools and processes to monitor and analyze system performance and to identify potential issues.

  • Collaborate with other teams to design and implement solutions that improve the reliability and efficiency of the Big Data cloud platforms.

  • Ensure security and compliance of the platforms within organizational guidelines.

  • Other responsibilities include effective root cause analysis of major production incidents and the development of learning documentation. The person will identify and implement high-availability solutions for services with a single point of failure.

  • The role involves planning and performing capacity expansions and upgrades in a timely manner to avoid any scaling issues and bugs. This includes automating repetitive tasks to reduce manual effort and prevent human errors.

  • The successful candidate will tune alerting and set up observability to proactively identify issues and performance problems. They will also work closely with Level 3 teams in reviewing new use cases and cluster hardening techniques to build robust and reliable platforms.

  • The role involves creating standard operating procedure documents and guidelines for effectively managing and using the platforms. The person will apply DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.

  • The individual will ensure that the platforms meet performance and service level agreement requirements. They will also perform security remediation, automation, and self-healing as required.

  • The individual will concentrate on developing automations and reports to minimize manual effort. This can be done with automation tools such as shell scripting, Ansible, or Python, or with any other programming language.

Team Leadership:

  • Lead and mentor a team of SRE engineers providing strategic and technical guidance and support.

  • Foster a culture of continuous improvement, innovation, and operational excellence.

  • Develop and implement professional development programs and succession planning for the team.

Technical Leadership:

  • Provide technical leadership and oversight to engineers.

  • Establish SRE best practices.

  • Ensure engineering and operational excellence (quality, security, performance, scalability, availability, resilience).

Collaboration & Strategy:

  • Collaborate with Product Office, Operations & Infrastructure, Cybersecurity, Client Support, and other Product Development teams.

  • Drive the coordination, organization, and execution of qualitative and quantitative decisions.

The Skills You Bring:

  • Energy and Experience: A growth mindset that is curious and passionate about technologies and enjoys challenging projects on a global scale

  • Challenge the Status Quo: Comfort in pushing the boundaries, ‘hacking’ beyond traditional solutions

  • Language Expertise: Expertise in one or more general development languages (e.g., Python, Java)

  • Learner: Constant drive to learn new technologies

This is a hybrid position. Expectation of days in office will be confirmed by your Hiring Manager.

Qualifications

Basic Qualifications
o 8+ years of relevant work experience and a Bachelor’s degree, OR 11+ years of relevant work experience

Preferred Qualifications
o 9 or more years of relevant work experience with a Bachelor’s degree, or 7 or more years of relevant experience with an Advanced Degree (e.g. Masters, MBA, JD, MD), or 3 or more years of experience with a PhD
o Experience with managing and optimizing Big Data and Kafka clusters.
o Proficient in scripting languages (Python, Bash) and SQL.
o Familiarity with big data tools (Spark, Kafka, etc.) and frameworks (HDFS, MapReduce, etc.).
o Strong knowledge of system architecture and design patterns for high-performance computing.
o Good understanding of data security and privacy concerns.
o Excellent problem-solving and troubleshooting skills.
o Observability: knowledge of observability tools such as Grafana, opera, and Splunk.
o Linux: understanding of Linux, networking, CPU, memory, and storage.
o Programming Languages: Ability to code in Java, Python, or another widely used programming language.
o Communication: Excellent interpersonal skills, along with superior verbal and written communication abilities.
o Demonstrated experience with AWS and GCP cloud platforms.
o Superior verbal, written, and interpersonal communication skills with both technical and non-technical audiences.
o Excellent team player, with strong collaboration skills and the ability to influence cross-functional teams to deliver results.

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.

Tags: Ansible Architecture AWS Azure Big Data DevOps Engineering GCP Grafana HDFS Java Kafka Linux PhD Privacy Python Security Shell scripting Spark Splunk SQL

Perks/benefits: Career development Startup environment

Region: Asia/Pacific
Country: Singapore
