GCP Data Engineer

United States

Saama

Saama automates key clinical development and commercialization processes with artificial intelligence (AI), machine learning (ML), and advanced analytics, accelerating your time to market.

The GCP Data Engineer is responsible for the construction and development of large-scale cloud data processing systems on the Google Cloud Platform (GCP). The role requires considerable expertise in data warehousing and proven coding expertise with Python, Java, SQL, and Spark. The GCP Data Engineer must be able to implement enterprise cloud data architecture designs and will work closely with the rest of the scrum team and internal business partners to identify, evaluate, design, and implement large-scale data solutions spanning structured and unstructured, public and proprietary data. The GCP Data Engineer will work iteratively on the cloud platform to design, develop, and implement scalable, high-performance solutions that offer measurable business value to customers.

Qualifications and Education:
  • GCP Data Engineer certification preferred
  • Bachelor's degree in computer engineering or an equivalent field, or an equivalent foreign degree, required
 Required Work Experience:  
  • Minimum of 10 years of work experience
  • 5+ years of experience in an engineering role using Python, Java, Spark, and SQL.
  • 5+ years of experience working as a Data Engineer in GCP
  • Demonstrated proficiency with Google’s Identity and Access Management (IAM) API
  • Demonstrated proficiency with Airflow (a minimal orchestration sketch follows this list)
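The Airflow requirement above reflects the day-to-day orchestration work in this role: scheduling GCP data loads as DAGs. Below is a minimal sketch, assuming the apache-airflow-providers-google package is installed; the DAG id, bucket, dataset, and table names are hypothetical placeholders, not part of this posting.

```python
# Minimal sketch: a daily Airflow DAG that loads Parquet files from Cloud Storage
# into BigQuery. All names (DAG id, bucket, dataset, table) are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

with DAG(
    dag_id="daily_events_load",            # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_events = GCSToBigQueryOperator(
        task_id="load_events_to_bq",
        bucket="example-raw-bucket",                    # hypothetical GCS bucket
        source_objects=["events/{{ ds }}/*.parquet"],   # files partitioned by run date
        source_format="PARQUET",
        destination_project_dataset_table="example_project.analytics.events",
        write_disposition="WRITE_APPEND",
    )
```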
 Desired Work Experience:  
  • Coding experience with Python, Java, Spark, and SQL
  • Strong Linux/Unix background and hands-on knowledge.
  • Past experience with big data technologies including HDFS, Spark, Impala, Hive
  • Experience with GCP platform development tools such as Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer (a minimal PySpark-to-BigQuery sketch follows this list).
  • Knowledge in Hadoop and cloud platforms and surrounding ecosystems.
  • Experience with web services and APIs, such as REST and SOAP.
  • Strong experience working with real-time streaming applications and batch-style, large-scale distributed computing applications using tools like Spark, Kafka, Flume, Pub/Sub, and Airflow.
  • Ability to work with different file formats like Avro, Parquet, and JSON.
  • Experience with shell scripting and Bash.
  • Experience with the version control platform GitHub
  • Experience unit testing code.
  • Experience with development ecosystem including Jenkins, Artifactory, CI/CD, and Terraform.
  • Works on problems of diverse scope and complexity ranging from moderate to substantial
  • Assists senior professionals in determining methods and procedures for new tasks
  • Leads basic or moderately complex projects/activities on a semi-regular basis
  • Must possess excellent written and verbal communication skills
  • Ability to understand and analyze complex data sets
  • Exercises independent judgment on basic or moderately complex issues regarding job and related tasks
  • Makes recommendations to management on new processes, tools and techniques, or development of new products and services
  • Makes decisions regarding daily priorities for a work group; provides guidance to and/or assists staff on non-routine or escalated issues
  • Decisions have a moderate impact on operations within a department
  • Works under minimal supervision, uses independent judgment requiring analysis of variable factors
  • Requires little instruction on day-to-day work and general direction on more complex tasks and projects
  • Collaborates with senior professionals in the development of methods, techniques, and analytical approaches
  • Ability to advise management on approaches to optimize for data platform success.
  • Able to effectively communicate highly technical information to numerous audiences, including management, the user community, and less-experienced staff.
  • Consistently communicate on status of project deliverables
  • Consistently provide work effort estimates to management to assist in setting priorities
  • Deliver timely work in accordance with estimates
  • Solve problems as they arise and communicate potential roadblocks to manage expectations
  • Adhere strictly to all security policies
  • Proficient in multiple programming languages, frameworks, domains, and tools.
  • Coding skills in Scala
  • Ability to document designs and concepts
  • API Orchestration and Choreography for consumer apps
  • Well-rounded technical expertise in Apache packages and hybrid cloud architectures
  • Pipeline creation and automation for Data Acquisition
  • Design and creation of metadata extraction pipelines between raw and final transformed datasets
  • Quality control metrics data collection on data acquisition pipelines
  • Able to collaborate with the scrum team, including the scrum master, product owner, data analysts, quality assurance, business owners, and data architects, to produce the best possible end products
  • Experience contributing to and leveraging Jira and Confluence.
  • Managing and scheduling batch jobs.
  • Hands-on experience in the Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC).
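To illustrate the GCP tooling and file-format items above, here is a minimal sketch of a batch PySpark job of the kind this posting describes: read raw Parquet from Cloud Storage, aggregate, and write to BigQuery. The bucket, table, and column names are hypothetical, and the sketch assumes the spark-bigquery connector is available on the cluster, as it is on Dataproc.

```python
# Minimal sketch: batch PySpark job reading Parquet from GCS and writing the
# aggregated result to BigQuery. All names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events_daily_batch").getOrCreate()

# Read raw Parquet files landed in a (hypothetical) Cloud Storage bucket.
raw = spark.read.parquet("gs://example-raw-bucket/events/2024-01-01/")

# Example transformation: count events per user per day.
daily_counts = (
    raw.withColumn("event_date", F.to_date("event_ts"))
       .groupBy("user_id", "event_date")
       .agg(F.count("*").alias("event_count"))
)

# Write to a (hypothetical) BigQuery table via the spark-bigquery connector,
# staging through a temporary GCS bucket.
(daily_counts.write
    .format("bigquery")
    .option("table", "example_project.analytics.daily_event_counts")
    .option("temporaryGcsBucket", "example-staging-bucket")
    .mode("append")
    .save())
```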

Category: Engineering Jobs

Tags: Airflow APIs Architecture Avro Big Data BigQuery Bigtable CI/CD Confluence Data Warehousing Engineering GCP GitHub Google Cloud Hadoop HDFS Java Jenkins Jira JSON Kafka Linux Parquet Pipelines Python Scala Scrum SDLC Security Shell scripting Spark SQL Streaming Terraform Testing

Region: North America
Country: United States
