Manager, Site Reliability Engineer - HAI
IND - Telangana - Hyderabad (HITEC City), India
MSD
At MSD, we're following the science to tackle some of the world's greatest health threats. Get a glimpse of how we work to improve lives.
Job Description
The Opportunity
- Based in Hyderabad, join a global healthcare biopharma company and be part of a 130-year legacy of success backed by ethical integrity, forward momentum, and an inspiring mission to achieve new milestones in global healthcare.
- Be part of an organisation driven by digital technology and data-backed approaches that support a diversified portfolio of prescription medicines, vaccines, and animal health products.
- Drive innovation and execution excellence. Be part of a team that has a passion for using data, analytics, and insights to drive decision-making, and that creates custom software to help tackle some of the world's greatest health threats.
Our Technology Centers focus on creating a space where teams can come together to deliver business solutions that save and improve lives. An integral part of our company’s IT operating model, Tech Centers are globally distributed locations where each IT division has employees to enable our digital transformation journey and drive business outcomes. These locations, in addition to the other sites, are essential to supporting our business and strategy.
A focused group of leaders in each Tech Center helps to ensure we can manage and improve each location, from investing in the growth, success, and well-being of our people, to making sure colleagues from each IT division feel a sense of belonging, to managing critical emergencies. Together, we must leverage the strength of our team to collaborate globally, optimize connections, and share best practices across the Tech Centers.
Role Overview
- Our technology teams operate as business partners, proposing ideas and innovative solutions that enable new organizational capabilities. We collaborate internationally to deliver services and solutions that help everyone be more productive and enable innovation.
Responsibilities
- Design and implement monitoring solutions, built on observability tools and technologies, to track system reliability and performance metrics.
- Create and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs) for products and platforms.
- As a Site Reliability Engineer, you will work with the SRE team which includes Software and System Engineers who are custodians of the availability, scalability, and performance of the SaaS products. You will monitor platform architecture through automation and help the individual product teams to identify performance bottlenecks and recommend system enhancements.
- Ensure the underlying infrastructure runs smoothly.
- Develop automation scripts (using Python or PowerShell) and tools to streamline operations and reduce manual tasks.
- Monitor resource utilization and plan for future capacity requirements before utilization reaches thresholds.
- Implement infrastructure as code (IaC) practices using tools like Terraform, CloudFormation, or similar.
- Work closely with product development teams to ensure reliability is built into the software lifecycle.
- Advocate for operational excellence and best practices across engineering teams.
- Collaborate with the security team to implement security measures and ensure compliance with relevant standards.
- Identify and remediate vulnerabilities in the platform, and participate in audits, reviews, and remediations.
- Create and maintain thorough documentation for systems, processes, and operational procedures.
Technical Skills
- Moderate proficiency in programming/scripting languages (e.g., Python, Bash).
- Practical experience with cloud service providers (e.g., AWS / Azure).
- Experience with container orchestration tools (e.g., Kubernetes, Docker).
- Familiarity with monitoring and logging tools (e.g., Nobl9, Prometheus, Grafana).
- Experience with CI/CD practices and tooling (e.g., Jenkins, GitHub Actions).
- Understanding of microservices architecture and distributed systems.
- Familiarity with Agile, Kanban, or Scrum methodologies.
- Familiarity with RPA tools (e.g., UiPath) would be an added advantage.
Soft Skills:
- Excellent problem-solving skills and the ability to troubleshoot complex / production issues.
- Strong communication skills, with the ability to articulate technical concepts to non-technical stakeholders.
- A proactive mindset with a passion for continuous improvement.
Qualifications:
- Bachelor’s degree in Computer Science, Data Science, Information Technology, Engineering, or a related field.
- 4+ years of experience with container orchestration tools (e.g., Kubernetes, Docker).
- Experience with CI/CD practices and tooling (e.g., Jenkins, GitHub Actions).
- Power BI: essential for the role.
- ThoughtSpot: required for augmented analytics.
- Strong analytical and problem-solving skills, with an attention to detail.
- Communication: Effective communication at different levels, including supporting development work, hypercare, training, user onboarding, and requirements gathering with senior stakeholders.
Who we are:
We are known as Merck & Co., Inc., Rahway, New Jersey, USA in the United States and Canada and MSD everywhere else. For more than a century, we have been inventing for life, bringing forward medicines and vaccines for many of the world's most challenging diseases. Today, our company continues to be at the forefront of research to deliver innovative health solutions and advance the prevention and treatment of diseases that threaten people and animals around the world.
What we look for:
Imagine getting up in the morning for a job as important as helping to save and improve lives around the world. Here, you have that opportunity. You can put your empathy, creativity, digital mastery, or scientific genius to work in collaboration with a diverse group of colleagues who pursue and bring hope to countless people who are battling some of the most challenging diseases of our time. Our team is constantly evolving, so if you are among the intellectually curious, join us—and start making your impact today.
#HYDIT2025
Search Firm Representatives Please Read Carefully
Merck & Co., Inc., Rahway, NJ, USA, also known as Merck Sharp & Dohme LLC, Rahway, NJ, USA, does not accept unsolicited assistance from search firms for employment opportunities. All CVs / resumes submitted by search firms to any employee at our company without a valid written search agreement in place for this position will be deemed the sole property of our company. No fee will be paid in the event a candidate is hired by our company as a result of an agency referral where no pre-existing agreement is in place. Where agency agreements are in place, introductions are position specific. Please, no phone calls or emails.
Employee Status:
Regular
Relocation:
VISA Sponsorship:
Travel Requirements:
Flexible Work Arrangements:
Hybrid
Shift:
Valid Driving License:
Hazardous Material(s):
Required Skills:
Applied Engineering, Data Engineering, Data Visualization, Design Applications, Maintenance Management, Management Process, Reliability Management, Safety Management, Social Collaboration, Software Configurations, Software Development, Software Development Life Cycle (SDLC), Solution Architecture, System Designs, Systems Integration, Testing
Preferred Skills:
Job Posting End Date:
08/20/2025
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.
Tags: Agile Architecture AWS Azure CI/CD CloudFormation Computer Science Data visualization Distributed Systems Docker Engineering GitHub Grafana Jenkins Kanban Kubernetes Microservices Power BI Python Research Robotics RPA Scrum SDLC Security Terraform Testing
Perks/benefits: Relocation support