Data Engineer

Pune DIA, India

Roche

As a pioneer in healthcare, we have been committed to improving lives since the company was founded in 1896 in Basel, Switzerland. Today, Roche creates innovative medicines and diagnostic tests that help millions of patients globally.

Roche fosters diversity, equity and inclusion, representing the communities we serve. When dealing with healthcare on a global scale, diversity is an essential ingredient to success. We believe that inclusion is key to understanding people’s varied healthcare needs. Together, we embrace individuality and share a passion for exceptional care. Join Roche, where every voice matters.

The Position

DESCRIPTION
We are looking for an experienced and creative Principal Data Engineer to join our dynamic data and analytics team.

In this role, you will bring hands-on expertise in data engineering on the AWS tech stack, along with the ability to provide direction and guidance to developers: overseeing development, unit testing, and documentation of the delivered solution, and building strong customer relationships for ongoing business. To succeed in this role, you will be experienced with cloud-based data solution architectures, the software development life cycle (both Agile and waterfall), data engineering and ETL tools/platforms, and data modeling practices.

PREFERRED LOCATION 

Pune, India 

KEY RESPONSIBILITIES 

  • Lead the design and implementation of the data & analytics architecture, ensuring compliance, quality, and sustainable platform growth.

  • Build scalable end-to-end data pipelines to integrate and model datasets from different sources that meet functional and non-functional requirements.

  • Manage the technical scope and architecture of the project before, during, and after delivery.

  • As a Tech Lead, you will be responsible for leading data engineering teams to deliver cutting-edge data products on the cloud for our customers.

  • Work with business and functional stakeholders to understand data requirements and downstream analytics needs.

  • Partner with multiple areas of business to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements.

  • Ratify technology solutions, produce concise design documents, and contribute to work estimates.

  • Translate business requirements & E2E designs into technical implementations based on system capabilities.

  • Define and promote re-usable, extensible, scalable, and maintainable solutions, weighing cost against benefit.

  • Communicate clearly and credibly at all levels about the importance of solution design.

  • Foster a data-driven culture throughout the team and lead data engineering projects that will have an impact throughout the organization.

  • Understand product requirements and contribute to design processes.

  • Drive innovation through a good understanding of data, business drivers, and business needs.

  • Perform technical walk-throughs to ensure effective communication of system architecture.

  • Work with data and analytics experts to strive for greater functionality in our data systems and products, and help grow our data team with exceptional engineers.
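As a purely illustrative sketch (not part of the role description), the pipeline responsibilities above follow a familiar extract-transform-load shape. The example below uses plain Python with hypothetical record and function names so the pattern is visible without a Spark cluster; a production version would use the PySpark/Airflow stack named in the requirements.

```python
# Minimal extract-transform-load sketch (hypothetical names throughout;
# a plain-Python stand-in for a PySpark job).
from dataclasses import dataclass


@dataclass
class LabResult:  # hypothetical source record
    sample_id: str
    analyte: str
    value: float


def extract(rows):
    """Parse raw 'sample_id,analyte,value' lines into records."""
    for line in rows:
        sid, analyte, value = line.strip().split(",")
        yield LabResult(sid, analyte, float(value))


def transform(records):
    """Keep only valid measurements and normalise analyte names
    (a stand-in for the 'non-functional requirements' a real pipeline
    enforces, e.g. data-quality gates)."""
    return [
        LabResult(r.sample_id, r.analyte.upper(), r.value)
        for r in records
        if r.value >= 0  # drop physically impossible readings
    ]


def load(records, sink):
    """Append records to an in-memory sink (stand-in for Redshift/S3)."""
    sink.extend(records)


raw = ["s1,glucose,5.4", "s2,glucose,-1.0", "s3,ldl,3.2"]
warehouse = []
load(transform(extract(raw)), warehouse)
print([(r.sample_id, r.analyte, r.value) for r in warehouse])
```

Keeping extract, transform, and load as separate pure functions is what makes each stage unit-testable, which the posting calls out as a core software-engineering practice.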

REQUIRED EXPERIENCE, SKILLS & QUALIFICATIONS 

  • Around 10-12 years of relevant experience working with High-Performance Data Products or Data Systems as a Data Architect/Engineer. 

  • Advanced proficiency in designing and developing data products (e.g., PySpark, Spark SQL, Scala) and in orchestration tools/services (e.g., Airflow).

  • Proficient with Software Engineering best practices, such as unit testing and integration testing, and software development tools, such as IntelliJ, Maven, Git, and Docker among others.

  • Extensive experience with at least one cloud platform (AWS preferred) and its big data and AI/ML services (EMR, Bedrock, SageMaker, QuickSight, Lake Formation, etc.).

  • Advanced knowledge of Apache Spark, Kafka, or equivalent streaming/batch processing and event-based messaging.

  • Strong Data Analysis skills and ability to slice and dice the data as needed for stakeholders’ reporting.

  • Relevant experience with columnar, NoSQL, and MPP databases (Redshift, DynamoDB, Aurora, Postgres, and/or Snowflake).

  • Awareness of security compliance requirements and secure design practices.

  • Exceptional interpersonal, analytical, and communication skills including the ability to explain and discuss DevOps concepts with colleagues and teams.

  • Expertise in test management and defect tracking tools like HP Quality Center, and JIRA.

  • Fully adhere to, and advocate for, an end-to-end CI/CD pipeline.

  • Familiarity with API development and with JSON/XML as data formats.

  • Proficiency in designing and developing data products and leading a team of data engineers to drive end-to-end execution.
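To illustrate the JSON/XML data-format familiarity asked for above, here is a small hedged example (hypothetical payload, not Roche code) serialising the same record both ways using only the Python standard library:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical API payload for illustration only.
record = {"sample_id": "s1", "analyte": "glucose", "value": 5.4}

# JSON: the usual wire format for modern REST APIs.
payload_json = json.dumps(record, sort_keys=True)

# XML: still common in healthcare system integrations.
root = ET.Element("result")
for key, val in record.items():
    ET.SubElement(root, key).text = str(val)
payload_xml = ET.tostring(root, encoding="unicode")

print(payload_json)
print(payload_xml)
```

The same logical record round-trips through either format; which one an API uses is usually dictated by the downstream system being integrated.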

DESIRED EXPERIENCE, SKILLS & QUALIFICATIONS 

  • Experience in the Healthcare Laboratory domain is a plus.

  • Experience with security and privacy regulations (GDPR, HIPAA).

  • Demonstrated ability to collaborate effectively with cross-functional teams in a fast-paced and dynamic environment.

  • Proven track record of conducting root cause analyses on both internal and external data and processes to address specific business inquiries and identify areas for enhancement.

EDUCATION 

Master’s or Bachelor’s degree in Computer Science or a related field

Who we are

At Roche, more than 100,000 people across 100 countries are pushing back the frontiers of healthcare. Working together, we’ve become one of the world’s leading research-focused healthcare groups. Our success is built on innovation, curiosity and diversity.

Roche is an Equal Opportunity Employer.
