GCP Data Engineer
United States
Saama
Saama automates key clinical development and commercialization processes with artificial intelligence (AI), machine learning (ML), and advanced analytics, accelerating your time to market.
- GCP Data Engineer certification preferred
- Bachelor's degree in computer engineering or an equivalent field, or an equivalent foreign degree, required
- Minimum of 10 years of work experience
- 5+ years of experience in an engineering role using Python, Java, Spark, and SQL
- 5+ years of experience working as a Data Engineer on GCP
- Demonstrated proficiency with Google’s Identity and Access Management (IAM) API
- Demonstrated proficiency with Airflow (see the DAG sketch after this list)
- Coding experience with Python, Java, Spark, and SQL
- Strong Linux/Unix background and hands-on knowledge
- Past experience with big data technologies including HDFS, Spark, Impala, and Hive
- Experience with GCP platform development tools desired: Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer (see the Pub/Sub and BigQuery sketch after this list)
- Knowledge of Hadoop, cloud platforms, and their surrounding ecosystems
- Experience with web services and APIs such as REST and SOAP
- Strong experience working with real-time streaming applications and batch-style, large-scale distributed computing applications using tools such as Spark, Kafka, Flume, Pub/Sub, and Airflow
- Ability to work with different file formats such as Avro, Parquet, and JSON (see the PySpark sketch after this list)
- Experience with shell scripting and Bash
- Experience with the version control platform GitHub
- Experience unit testing code.
- Experience with the development ecosystem, including Jenkins, Artifactory, CI/CD, and Terraform
- Works on problems of diverse scope and complexity ranging from moderate to substantial
- Assists senior professionals in determining methods and procedures for new tasks
- Leads basic or moderately complex projects/activities on a semi-regular basis
- Must possess excellent written and verbal communication skills
- Ability to understand and analyze complex data sets
- Exercises independent judgment on basic or moderately complex issues regarding job and related tasks
- Makes recommendations to management on new processes, tools and techniques, or development of new products and services
- Makes decisions regarding daily priorities for a work group; provides guidance to and/or assists staff on non-routine or escalated issues
- Decisions have a moderate impact on operations within a department
- Works under minimal supervision, uses independent judgment requiring analysis of variable factors
- Requires little instruction on day-to-day work and general direction on more complex tasks and projects
- Collaborates with senior professionals in the development of methods, techniques, and analytical approaches
- Ability to advise management on approaches to optimize the data platform for success
- Able to effectively communicate highly technical information to numerous audiences, including management, the user community, and less-experienced staff.
- Consistently communicates the status of project deliverables
- Consistently provides work-effort estimates to management to assist in setting priorities
- Delivers timely work in accordance with estimates
- Solves problems as they arise and communicates potential roadblocks to manage expectations
- Adheres strictly to all security policies
- Proficient in multiple programming languages, frameworks, domains, and tools.
- Coding skills in Scala
- Ability to document designs and concepts
- API orchestration and choreography for consumer apps
- Well-rounded technical expertise in Apache packages and hybrid cloud architectures
- Pipeline creation and automation for data acquisition
- Design and creation of metadata-extraction pipelines between raw and final transformed datasets
- Collection of quality-control metrics on data-acquisition pipelines
- Able to collaborate with the Scrum team, including the Scrum Master, Product Owner, data analysts, Quality Assurance, business owners, and data architects, to produce the best possible end products
- Experience contributing to and leveraging Jira and Confluence
- Managing and scheduling batch jobs.
- Hands-on experience in the Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC)
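As a reference point for the Airflow requirement above, here is a minimal sketch of the kind of daily batch DAG this role involves. The bucket, table names, and task logic are hypothetical placeholders, not Saama's actual pipelines.

```python
# Minimal Airflow DAG sketch: a daily batch ingestion pipeline.
# GCS bucket and BigQuery table names below are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_to_gcs(**context):
    """Pull the day's raw files into a staging bucket (placeholder logic)."""
    print(f"extracting partition {context['ds']} to gs://example-staging-bucket/")


def load_to_bigquery(**context):
    """Load the staged files into a BigQuery table (placeholder logic)."""
    print(f"loading partition {context['ds']} into example_dataset.example_table")


with DAG(
    dag_id="daily_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract = PythonOperator(task_id="extract_to_gcs", python_callable=extract_to_gcs)
    load = PythonOperator(task_id="load_to_bigquery", python_callable=load_to_bigquery)

    extract >> load  # load runs only after extraction succeeds
```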
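For the GCP tooling requirement, a minimal sketch of publishing to Pub/Sub and running a parameterized BigQuery query with the Google Cloud client libraries; the project, topic, and table names are hypothetical.

```python
# Minimal GCP sketch: publish a message to Pub/Sub, then query BigQuery.
# Project, topic, and table names are hypothetical; credentials come from
# Application Default Credentials (e.g. `gcloud auth application-default login`).
import json

from google.cloud import bigquery, pubsub_v1

PROJECT_ID = "example-project"

# --- Pub/Sub: publish one event to a topic ---
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, "example-topic")
event = {"user_id": 123, "action": "login"}
future = publisher.publish(topic_path, json.dumps(event).encode("utf-8"))
print(f"published message id: {future.result()}")  # blocks until the publish completes

# --- BigQuery: run a parameterized aggregation query ---
client = bigquery.Client(project=PROJECT_ID)
query = """
    SELECT action, COUNT(*) AS n
    FROM `example-project.example_dataset.events`
    WHERE event_date = @day
    GROUP BY action
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", "2024-01-01")]
)
for row in client.query(query, job_config=job_config).result():
    print(row.action, row.n)
```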
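And for the file-format requirement, a minimal PySpark sketch converting newline-delimited JSON into Parquet and Avro. The paths are hypothetical, Avro support assumes the external spark-avro package, and the gs:// paths assume the GCS connector is configured (swap in local paths otherwise).

```python
# Minimal PySpark sketch: reading and writing JSON, Parquet, and Avro.
# All paths are hypothetical; Avro requires the external spark-avro package.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("file-format-demo")
    # spark-avro is not bundled with Spark; pull it in explicitly.
    .config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.5.0")
    .getOrCreate()
)

# Read newline-delimited JSON; Spark infers the schema.
df = spark.read.json("gs://example-bucket/raw/events/*.json")

# Columnar Parquet for analytics workloads.
df.write.mode("overwrite").parquet("gs://example-bucket/curated/events_parquet")

# Row-oriented Avro, often used for pipeline interchange.
df.write.mode("overwrite").format("avro").save("gs://example-bucket/curated/events_avro")

spark.stop()
```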
Tags: Airflow APIs Architecture Avro Big Data BigQuery Bigtable CI/CD Confluence Data Warehousing Engineering GCP GitHub Google Cloud Hadoop HDFS Java Jenkins Jira JSON Kafka Linux Parquet Pipelines Python Scala Scrum SDLC Security Shell scripting Spark SQL Streaming Terraform Testing