Data Engineer

India - Hyderabad

Amgen

Amgen is committed to unlocking the potential of biology for patients suffering from serious illnesses by discovering, developing, manufacturing and delivering innovative human therapeutics.

Career Category

Information Systems

Job Description

Join Amgen’s Mission of Serving Patients

At Amgen, if you feel like you’re part of something bigger, it’s because you are. Our shared mission—to serve patients living with serious illnesses—drives all that we do.

Since 1980, we’ve helped pioneer the world of biotech in our fight against the world’s toughest diseases. With our focus on four therapeutic areas – Oncology, Inflammation, General Medicine, and Rare Disease – we reach millions of patients each year. As a member of the Amgen team, you’ll help make a lasting impact on the lives of patients as we research, manufacture, and deliver innovative medicines to help people live longer, fuller, happier lives.

Our award-winning culture is collaborative, innovative, and science-based. If you have a passion for challenges and the opportunities that lie within them, you’ll thrive as part of the Amgen team. Join us and transform the lives of patients while transforming your career.

What you will do

Let’s do this. Let’s change the world. In this vital role you will serve a critical function within Organizational, Planning, Analytics & Insights (OPA&I), whose goal is enterprise-wide, long-term workforce transformation: connecting people, financial, procurement, and capability data to enable business insights and decisions. The Data Engineer will collaborate with the Tech and Data lead for OPA&I and will be responsible for building and maintaining data engineering solutions based on OPA&I requirements.

Roles & Responsibilities:

  • Data Integration and Management: Develop and maintain robust data pipelines that integrate data from sources including HR, Finance, Procurement, and Activities, ensuring seamless data flow and synchronization between systems and databases. Implement strategies and tools to efficiently process unstructured data and develop methods to extract meaningful insights from it. Establish data validation procedures to ensure data accuracy and consistency, maintain data standards, and safeguard data integrity across all systems. Maintain a single source of truth for organizational data, meeting the data requirements of applications, data scientists, and multi-functional use cases while ensuring system reliability. Maintain, patch, and upgrade data pipeline solutions to uphold security and regulatory compliance, performance standards, and platform best practices.
  • Automation Development: Lead the design and implementation of automation solutions for data processes, including one-click base-lining and case generation. Continuously seek opportunities to improve efficiency through automation, and ensure automation initiatives support strategic business goals.
  • Process and Data Improvement: Identify and implement improvements in data processes and workflows, collaborating with other teams to enhance data quality and operational efficiency. Design, code, test, and review data pipelines against standards and best practices for data solutions.
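The data validation procedures described above can be illustrated with a minimal sketch. All column names and rules here are hypothetical, chosen only to show the shape of an accuracy/consistency check over workforce records:

```python
def validate_workforce_rows(rows):
    """Run basic accuracy/consistency checks over workforce records.

    Each row is a dict; the column names are hypothetical placeholders.
    Returns a list of human-readable error messages (empty if clean).
    """
    errors = []
    seen_ids = set()
    for i, row in enumerate(rows):
        emp_id = row.get("employee_id")
        # Keys must be present and unique
        if emp_id is None:
            errors.append(f"row {i}: null employee_id")
        elif emp_id in seen_ids:
            errors.append(f"row {i}: duplicate employee_id {emp_id}")
        else:
            seen_ids.add(emp_id)
        # Numeric sanity check
        if row.get("headcount", 0) < 0:
            errors.append(f"row {i}: negative headcount")
    return errors

rows = [
    {"employee_id": 1, "cost_center": "CC100", "headcount": 1},
    {"employee_id": 2, "cost_center": "CC200", "headcount": 1},
    {"employee_id": 2, "cost_center": "CC200", "headcount": -1},
]
issues = validate_workforce_rows(rows)
```

In practice such checks would run as part of the pipeline itself (e.g. as a quality gate before loading), with failures surfaced to monitoring rather than returned as a list.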

What we expect of you

Basic Qualifications:

  • Master’s degree in computer science or engineering field and 1 to 3 years of relevant experience OR
  • Bachelor’s degree in computer science or engineering field and 3 to 5 years of relevant experience OR
  • Diploma and a minimum of 8 years of relevant work experience

Must-Have Skills:

  • Experience with Databricks (or Snowflake), including cluster setup, execution, and tuning
  • Experience building ETL or ELT pipelines; hands-on experience with SQL and NoSQL databases
  • Experience with one or more programming languages such as Python, R, SAS, Scala, or Java
  • Experience with common data processing libraries such as Pandas, PySpark, and SQLAlchemy
  • Experience with software engineering best-practices, including but not limited to version control, infrastructure-as-code, CI/CD, and automated testing
  • Experience with data lake, data fabric and data mesh concepts
  • Experience with data modeling and performance tuning on relational and graph databases
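The ETL/ELT and SQL skills above can be sketched end-to-end in a few lines. This is illustrative only: the source records, table schema, and transformation rules are hypothetical, and an in-memory SQLite database stands in for a real warehouse target such as Databricks or Redshift:

```python
import sqlite3

# Extract: raw records as they might arrive from an upstream HR export
raw = [
    ("1", "ONCOLOGY", "  Hyderabad "),
    ("2", "inflammation", "Hyderabad"),
]

# Transform: cast keys to integers, normalize casing and whitespace
clean = [(int(emp_id), area.title(), site.strip()) for emp_id, area, site in raw]

# Load: write into a relational target (in-memory SQLite as a stand-in)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workforce (employee_id INTEGER PRIMARY KEY, area TEXT, site TEXT)"
)
conn.executemany("INSERT INTO workforce VALUES (?, ?, ?)", clean)

rows = conn.execute(
    "SELECT employee_id, area, site FROM workforce ORDER BY employee_id"
).fetchall()
```

A production version of this pattern would typically be orchestrated (e.g. by Airflow), run distributed transforms in PySpark, and load into a governed warehouse table rather than an in-memory database.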

Good-to-Have Skills:

  • Experience operating and supporting production data systems using tools such as Airflow, Linux, Monte Carlo, and Kafka
  • Experience with AWS services (EC2, S3, EMR, RDS, Redshift/Spectrum, Lambda, Glue, Athena, API Gateway) and design patterns (Containers, Serverless, Kubernetes, Docker, etc.)
  • Experience with DevOps tools (Ansible, GitLab CI/CD, GitHub, Docker, Jenkins)
  • Experience working in Agile-based teams

Soft Skills:

  • Excellent analytical and problem-solving skills
  • Strong verbal and written communication skills
  • Ability to work effectively with global, virtual teams
  • High degree of initiative and self-motivation
  • Ability to handle multiple priorities successfully
  • Team-oriented, with a focus on achieving team goals
  • Strong presentation and public speaking skills

EQUAL OPPORTUNITY STATEMENT 

Amgen is an Equal Opportunity employer and will consider you without regard to your race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability status. 

We will ensure that individuals with disabilities are provided with reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation. 
