Staff Site Reliability Engineer - PRE

Warsaw, Poland

Visa

Visa's digital and mobile payment network is at the forefront of new payment, electronic, and contactless payment technologies that shape the world of money

Company Description

Visa is a world leader in payments and technology, with over 259 billion payment transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure payments network, enabling individuals, businesses, and economies to thrive while driven by a common purpose: to uplift everyone, everywhere by being the best way to pay and be paid.

Make an impact with a purpose-driven industry leader. Join us today and experience Life at Visa.

Job Description

Hadoop/Big-Data: 

  • Sound knowledge of managing large-scale Hadoop platforms, including monitoring the platform, debugging issues, and tuning cluster performance.

  • In-depth knowledge of the Hadoop ecosystem, including ZooKeeper, HDFS, YARN, Hive, Spark, Trino, and Kafka.

  • Proven experience in debugging issues on both the Hadoop platform and the applications running on it.

  • Familiarity with security tools such as Kerberos and Ranger, and with Active Directory integrations.

  • Experience with cloud technologies, preferably AWS EMR.

  • Knowledge of Kubernetes, AI, and MLOps is an advantage.

Collaboration and Teamwork:

  • Collaborate closely with L-3 teams to review new use cases and implement cluster hardening techniques, ensuring the development of robust and reliable platforms.

  • Foster cross-team collaboration, building and maintaining strong relationships with customer teams, user communities, architects, and engineering teams.

  • Work jointly on key deliverables to ensure production scalability and stability.

Automation: Hands-on experience with automation using Ansible, Shell, Python, or any other programming language. The ability to automate manual tasks is key in this role.

Observability: Knowledge of observability tools such as Grafana, opera, Prometheus, and Splunk.

Linux: Understanding of Linux, networking, CPU, memory, and storage.

Programming Languages: Ability to code in Python, Java, or another widely used programming language.

Communication: Excellent interpersonal skills, along with superior verbal and written communication abilities.

This position is not ideal for a Hadoop developer.

This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager.

 

Qualifications

Basic Qualifications
• As a Staff Site Reliability Engineer, you will be part of a team that maintains and supports Visa's Data Platform and provides support for key Big Data platforms.
• You will be responsible for driving innovation for our partners and clients, within Visa and globally. You will work on open-source Big Data clusters, ensuring their availability, performance, and reliability, and improving operational efficiency.
• Master's degree in Math, Science, Engineering, Computer Science, Information Systems, or a related field, OR
  Bachelor's degree in Math, Science, Engineering, Computer Science, Information Systems, or a related field AND a minimum of five (5) years of experience in a directly related field, OR
  a minimum of five (5) years working on Hadoop systems.

Preferred Qualifications
• The role involves performing Big Data SRE and engineering activities on multiple open-source platforms such as Hadoop, Kafka, HBase, and Spark. The candidate should possess strong troubleshooting and debugging skills.
• Other responsibilities include effective root cause analysis of major production incidents and the development of learning documentation. The person will identify and implement high-availability solutions for services with a single point of failure.
• The role involves planning and performing capacity expansions and upgrades in a timely manner to avoid scaling issues and bugs. This includes automating repetitive tasks to reduce manual effort and prevent human error.
• The successful candidate will tune alerting and set up observability to proactively identify issues and performance problems. They will also work closely with Level-3 teams in reviewing new use cases and cluster hardening techniques to build robust and reliable platforms.
• The role involves creating standard operating procedure documents and guidelines on effectively managing and utilizing the platforms. The person will leverage DevOps tools, disciplines (incident, problem, and change management), and standards in day-to-day operations.
• The individual will ensure that the Hadoop platform can effectively meet performance and service level agreement requirements. They will also perform security remediation, automation, and self-healing as required.
• The individual will concentrate on developing automations and reports to minimize manual effort, using tools such as Shell scripting, Ansible, Python scripting, or any other programming language.
 

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.

Tags: Ansible AWS Big Data Computer Science DevOps Engineering Grafana Hadoop HBase HDFS Java Kafka Kubernetes Linux Mathematics MLOps Open Source Python Security Shell scripting Spark Splunk

Perks/benefits: Career development Team events

Region: Europe
Country: Poland
