Staff Data Engineer

India - Bengaluru - Embassy One Complex

Illumina

Illumina sequencing and array technologies drive advances in life science research, translational and consumer genomics, and molecular diagnostics.

What if the work you did every day could impact the lives of people you know? Or all of humanity?

At Illumina, we are expanding access to genomic technology to realize health equity for billions of people around the world. Our efforts enable life-changing discoveries that are transforming human health through the early detection and diagnosis of diseases and new treatment options for patients.

Working at Illumina means being part of something bigger than yourself. Every person, in every role, has the opportunity to make a difference. Surrounded by extraordinary people, inspiring leaders, and world-changing projects, you will do more and become more than you ever thought possible.

JOB DESCRIPTION:

This position is an exciting opportunity to join the Data Integration & Analytics team within the GIS Application & Platform Services department. The team’s scope includes data services on enterprise data platforms such as the Snowflake cloud data platform, SAP HANA analytics, and Denodo data virtualization. The team manages the full software development lifecycle of its data, including data quality and operations. This role supports strategic solutions such as the enterprise data lake on AWS/Snowflake and the enterprise data warehouse on Snowflake, and is responsible for collaborating with cross-functional teams, planning and coordinating requirements, providing data engineering services, and helping build trust in the data being managed.

JOB DUTIES:

  • Translate business requirements into data requirements, data warehouse designs, and sustainable data management strategies on enterprise data platforms such as Snowflake.
  • Work with project leads, stakeholders, and business SMEs to define technical specifications, develop data modeling requirements, and maintain the data infrastructure that provides business users with the tools and data they need.
  • Architect, design, and develop large-scale, optimized analytics solutions.
  • Gather and analyze requirements, and plan and coordinate development in collaboration with stakeholder teams.
  • Understand data architecture and solution design; design and develop dimensional models in an enterprise data warehouse environment.
  • Develop and automate enterprise data transformation pipelines and ELT/ETL processes.
  • Work with cross-functional teams and process owners to develop test cases and scripts, and to test models and solutions, verifying that requirements are met and that data quality stays high.
  • Develop and apply quality assurance best practices.
  • Design and apply data engineering best practices for an enterprise data warehouse.
  • Analyze data and data behaviors to support business user queries.
  • Assess the impact of changes to data platforms, data models, and data behaviors.
  • Troubleshoot complex data engineering issues, applying strong problem-solving skills.
  • Periodically benchmark application and operational performance, track metrics, and fix issues.
  • Understand and comply with the data governance and compliance practices defined for risk management, including data encryption, RBAC, and security policies.
  • Promote and apply metadata management best practices supporting enterprise data catalogs.
  • Support change and release management processes.
  • Support incident and response management, including problem solving, root cause analysis, and documentation.
  • Support automation and on-call processes (Tier 1 / Tier 2).  

SPECIFIC SKILLS OR OTHER REQUIREMENTS:


Requires 8+ years of experience with:

Primary Experience: Data Warehouses and Data Transformations

Data Platforms:

  • Required: Snowflake cloud data platform
  • Preferred: Denodo data virtualization, SAP HANA analytics

Data Engineering:

  • Snowflake: 5+ years of required expertise with Snowflake SnowSQL, Snowpipe (integrated with AWS S3), Streams and Tasks, stored procedures, MERGE statements, functions, RBAC, security policies, and compute/storage usage and performance optimization techniques (see the Snowflake sketch after this list).
  • dbt Cloud: 3+ years of expertise with the dbt Cloud platform; a very good understanding of data models (views, materializations, incremental loads, snapshots), cross-functional references, DAGs and their impact, job scheduling, auditing and monitoring, and working with code repositories and deployments (see the dbt sketch after this list).
  • SnapLogic: Experience building data integrations on the SnapLogic integration platform is a plus.
  • Certifications: Snowflake and dbt Cloud data engineering certifications are a plus.
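
To make the Snowflake expectations concrete, here is a minimal sketch of the Snowpipe, Streams, and Tasks pattern named above, ending in a MERGE. Every object name (raw_orders, orders_pipe, orders, transform_wh, and the @raw_stage external stage) is a hypothetical placeholder, not something from this posting:

```sql
-- Landing table for files Snowpipe copies in from S3 (names are hypothetical).
create table if not exists raw_orders (
    order_id   number,
    amount     number(18,2),
    updated_at timestamp_ntz
);

-- Continuous ingestion from an assumed external stage backed by S3.
create pipe if not exists orders_pipe auto_ingest = true as
    copy into raw_orders
    from @raw_stage
    file_format = (type = 'CSV' skip_header = 1);

-- Stream tracks new rows landing in the table.
create or replace stream raw_orders_stream on table raw_orders;

-- Task consumes the stream and merges changes into the target table,
-- but only spins up the warehouse when the stream actually has data.
create or replace task merge_orders
    warehouse = transform_wh
    schedule  = '5 MINUTE'
when
    system$stream_has_data('RAW_ORDERS_STREAM')
as
    merge into orders tgt
    using raw_orders_stream src
        on tgt.order_id = src.order_id
    when matched then update set
        tgt.amount = src.amount, tgt.updated_at = src.updated_at
    when not matched then insert (order_id, amount, updated_at)
        values (src.order_id, src.amount, src.updated_at);

alter task merge_orders resume;
```

Gating the task on SYSTEM$STREAM_HAS_DATA keeps the warehouse idle between loads, which is one of the compute-usage optimization techniques the bullet refers to.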
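
In the same spirit, a minimal sketch of a dbt incremental model of the kind the dbt Cloud bullet describes. The model and column names (fct_orders, stg_orders, order_id, updated_at) are hypothetical:

```sql
-- models/marts/fct_orders.sql (hypothetical model)
{{ config(
    materialized = 'incremental',
    unique_key   = 'order_id'
) }}

select
    order_id,
    customer_id,
    amount,
    updated_at
from {{ ref('stg_orders') }}  -- assumes an upstream staging model

{% if is_incremental() %}
  -- On incremental runs, only pick up rows newer than what is already
  -- loaded; {{ this }} resolves to the existing target table.
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

The ref() call is also what builds dbt's DAG: it orders fct_orders after stg_orders and makes the cross-model impact of a change visible.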

Data Orchestration:

  • Required: Control-M, Apache Airflow

Cloud Storage Platforms:

  • Required: Amazon Web Services
  • Preferred: Microsoft Azure

Programming/ Scripting:

  • Snowflake: SQL scripting covering Snowpipe, Tasks, Streams, MERGE statements, stored procedures, functions, and security policies (dynamic data masking and row access); Python (see the policy sketch after this list).
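
As an illustration of the two security-policy types named above, here is a minimal Snowflake sketch of a dynamic data masking (DDM) policy and a row access policy. The tables, roles, and the security.role_region_map mapping table are hypothetical assumptions:

```sql
-- Dynamic data masking: non-admin roles see a redacted email address.
create or replace masking policy email_mask as (val string)
returns string ->
    case
        when current_role() in ('ANALYTICS_ADMIN') then val
        else regexp_replace(val, '.+@', '*****@')
    end;

alter table customers modify column email
    set masking policy email_mask;

-- Row access: each role only sees rows for regions it is mapped to.
create or replace row access policy region_filter as (region string)
returns boolean ->
    exists (
        select 1
        from security.role_region_map m   -- hypothetical mapping table
        where m.role_name = current_role()
          and m.region    = region
    );

alter table sales add row access policy region_filter on (region);
```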

Code Management:

  • Required: Excellent understanding of working with code repositories such as GitHub and GitLab, code version management, branching and merging patterns in a central repository that manages cross-functional code, and deployments.

Data Operations: Excellent understanding of DataOps practices for data management.

Solution Design: Good understanding of end-to-end solution architecture and design practices, with the ability to document solutions and maintain diagrams.

Stakeholder Engagement: Ability to take the lead on project activities, collaborate with analytics stakeholders, and drive requirements through to completion.

Data Warehousing: Excellent grasp of the fundamental concepts of dimensional modeling, with experience on data warehouse solutions: requirements gathering, design and build, data analysis, data quality and validation, and developing data transformations using ELT/ETL patterns (see the star-schema sketch below).
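
For reference, a minimal star-schema sketch showing the dimensional modeling fundamentals referred to above: a fact table joined to surrogate-keyed dimensions. All table and column names, and the SCD Type 2 effectivity columns, are illustrative assumptions:

```sql
create table dim_customer (
    customer_sk   number identity primary key,  -- surrogate key
    customer_id   varchar,                      -- natural/business key
    customer_name varchar,
    valid_from    timestamp_ntz,                -- SCD Type 2 effectivity window
    valid_to      timestamp_ntz
);

create table dim_date (
    date_sk   number primary key,               -- e.g. 20240131
    cal_date  date,
    cal_year  number,
    cal_month number
);

-- Fact rows carry measures plus foreign keys to the dimensions.
create table fct_sales (
    customer_sk number references dim_customer (customer_sk),
    date_sk     number references dim_date (date_sk),
    quantity    number,
    amount      number(18,2)
);
```

Note that Snowflake parses primary and foreign key constraints but does not enforce them; they still document the grain and join paths of the model.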

Source Systems:

  • Required: Knowledge of and experience maintaining dimensional data models built on data from source systems such as SAP ERP (on-premises and cloud), Salesforce CRM, Workday, Planisware, and Teamcenter PLM. Experience with other source systems is a plus.

Data as a Product: Knowledge of and experience with data treated as a product is preferred. Illumina follows a hybrid data mesh architecture that promotes data as a product for data lifecycle management.

Governance: Good understanding of working in companies with regulated systems and processes for data. Adherence to data protection practices using tagging, security policies, and data security at the object, column, and row level. Promote and apply best practices for data catalogs, and follow data classification and metadata management practices for the data products within your scope (see the tagging sketch below).
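
As a sketch of the tagging, classification, and object-level security practices described above; the governance schema, tag name, roles, and database names are hypothetical assumptions:

```sql
-- A classification tag with a constrained set of values.
create tag if not exists governance.data_class
    allowed_values 'PUBLIC', 'CONFIDENTIAL', 'RESTRICTED';

-- Classify a sensitive column by tagging it.
alter table customers modify column email
    set tag governance.data_class = 'RESTRICTED';

-- Object-level security via role-based grants.
grant usage on database analytics to role analyst;
grant usage on schema analytics.marts to role analyst;
grant select on all tables in schema analytics.marts to role analyst;
```

Tags like this can also drive automation, for example attaching a masking policy at the tag level so every column classified RESTRICTED is masked consistently.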

Operating Systems: Windows, Linux

EDUCATION & EXPERIENCE:


Bachelor’s degree in Computer Science/Engineering or an equivalent degree.

#LI-HYBRID

#illuminacareers


Illumina believes that everyone has the ability to make an impact, and we are proud to be an equal opportunity employer committed to providing employment opportunity regardless of sex, race, creed, color, gender, religion, marital status, domestic partner status, age, national origin or ancestry, physical or mental disability, medical condition, sexual orientation, pregnancy, military or veteran status, citizenship status, and genetic information.