Cloud Data Engineer Job
Bangalore, KA, IN
Yash Technologies
YASH specialists provide information, establish contacts and build bridges between the local decision-makers in German companies and the YASH teams. YASH Technologies is a leading technology integrator specializing in helping clients reimagine operating models, enhance competitiveness, optimize costs, foster exceptional stakeholder experiences, and drive business transformation.
At YASH, we’re a cluster of the brightest stars working with cutting-edge technologies. Our purpose is anchored in a single truth – bringing real positive changes in an increasingly virtual world and it drives us beyond generational gaps and disruptions of the future.
We are looking to hire Terraform professionals in the following areas:
Job Description:
Experience required: 6+ years.
Job Title: Cloud Data Engineer - Enterprise Big Data Platform
Right to Hire requirement
In this role, you will be part of a growing, global team of data engineers who collaborate in DevOps mode to enable the business to leverage data as an asset, using state-of-the-art technology to make better-informed decisions.
The Enabling Functions Data Office Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on Enabling Function’s data management and analytics platform (Palantir Foundry, AWS and other components).
Developing pipelines and applications on a cloud platform requires:
- Hands-on experience with Terraform or CloudFormation and other infrastructure automation tools
- Experience with Azure DevOps
- Proven track record in setting up CI/CD pipelines and automating cloud infrastructure
- Strong understanding of cloud infrastructure, with experience in AWS or other cloud providers
- Experience with a GitOps approach to automation in Azure DevOps
- Experience automating dbt orchestration using Azure DevOps pipelines
- Experience working with AWS services such as Glue, EC2, ELB, RDS, DynamoDB and S3 (a minimal boto3 sketch follows this list)
- Ability to work independently, troubleshoot issues, and optimize performance
- Practical experience is valued more than certifications
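For context on the AWS services named above, day-to-day work is often scripted in Python with boto3. The sketch below is purely illustrative and assumes a hypothetical bucket, prefix and Glue job name (raw-data-bucket, landing/2024-01-01/, nightly-ingest); none of these come from the posting.

```python
# Illustrative boto3 sketch: list newly landed S3 objects and trigger a Glue job.
# Bucket, prefix and job name are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")
glue = boto3.client("glue")

# List raw files under a date prefix (hypothetical bucket/prefix).
response = s3.list_objects_v2(Bucket="raw-data-bucket", Prefix="landing/2024-01-01/")
keys = [obj["Key"] for obj in response.get("Contents", [])]

if keys:
    # Start a Glue job run, passing the discovered keys as a job argument.
    run = glue.start_job_run(
        JobName="nightly-ingest",
        Arguments={"--input_keys": ",".join(keys)},
    )
    print("Started Glue job run:", run["JobRunId"])
```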
This position will be project based and may work across multiple smaller projects or a single large project utilizing an agile project methodology.
Requirements:
- B.Tech, B.Sc. or M.Sc. in Computer Science or a related field, and 6+ years of overall industry experience
- Strong experience in Big Data & Data Analytics
- Experience in building robust ETL pipelines for batch as well as streaming ingestion.
- A firm grounding in object-oriented programming and advanced, commercially applied knowledge of Python, PySpark and SQL (see the PySpark sketch after this list)
- Experience interacting with RESTful APIs, including authentication via SAML and OAuth2
- Experience with test driven development and CI/CD workflows
- Knowledge of Git for source control management
- Agile experience in Scrum environments, using tools such as Jira
- Knowledge of container technologies such as Docker and Kubernetes is an advantage
- Experience in Palantir Foundry, AWS or Snowflake is an advantage
- Problem solving abilities
- Proficient in English with strong written and verbal communication
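As a rough, non-authoritative illustration of the PySpark batch ETL work described above, the sketch below reads JSON, applies a simple cleansing transform and writes partitioned Parquet; all paths and column names are hypothetical.

```python
# Illustrative PySpark batch ETL sketch: read JSON, cleanse, write partitioned Parquet.
# Paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-batch-etl").getOrCreate()

# Extract: semi-structured JSON landed in a raw zone (hypothetical path).
orders = spark.read.json("s3://raw-data-bucket/landing/orders/")

# Transform: deduplicate and derive typed/partitioning columns.
cleaned = (
    orders.dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("net_amount", F.col("gross_amount") - F.col("tax_amount"))
)

# Load: write to a curated zone, partitioned for downstream consumption.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://curated-data-bucket/orders/"
)
```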
Primary Responsibilities
- Design, develop, test and support data pipelines and applications
- Industrialize data pipelines
- Establish a continuous quality improvement process to systematically optimize data quality (see the sketch after this list)
- Collaborate with various stakeholders, including business and IT
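One possible shape for such systematic data-quality checks is sketched below; the dataset, rules and fail-fast behaviour are assumptions for illustration, not the team's actual framework.

```python
# Illustrative data-quality check sketch: count rule violations and fail the run.
# Dataset path and rules are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-dq-checks").getOrCreate()
orders = spark.read.parquet("s3://curated-data-bucket/orders/")

total = orders.count()
violations = {
    "missing_order_id": orders.filter(F.col("order_id").isNull()).count(),
    "negative_net_amount": orders.filter(F.col("net_amount") < 0).count(),
    "duplicate_order_id": total - orders.dropDuplicates(["order_id"]).count(),
}

for rule, count in violations.items():
    print(f"{rule}: {count} of {total} rows")

# Failing the run surfaces quality issues directly in the CI/CD pipeline.
if any(count > 0 for count in violations.values()):
    raise ValueError(f"Data quality checks failed: {violations}")
```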
Education
- Bachelor (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences or related fields
Professional Experience
- 6+ years of experience in system engineering or software development
- 3+ years of engineering experience in ETL-type work with databases and cloud platforms
Skills
- Big Data (general): Deep knowledge of distributed file system concepts, MapReduce principles and distributed computing. Knowledge of Spark and the differences between Spark and MapReduce. Familiarity with encryption and security in a Hadoop cluster.
- Data management / data structures: Must be proficient in technical data management tasks, i.e. writing code to read, transform and store data; XML/JSON knowledge
- Experience working with REST APIs
- Spark: Experience launching Spark jobs in client mode and cluster mode. Familiarity with the property settings of Spark jobs and their implications for performance (a sketch follows at the end of the Skills section).
- SCC/Git: Must be experienced in the use of source code control systems such as Git
- ETL: Experience developing ELT/ETL processes, including loading data from enterprise-sized RDBMS systems such as Oracle, DB2 and MySQL
- Authorization: Basic understanding of user authorization (Apache Ranger preferred)
- Programming: Must be able to code in Python, or be an expert in at least one high-level language such as Python, Java or Scala
- Must have experience using REST APIs
- SQL: Must be an expert in manipulating database data using SQL. Familiarity with views, functions, stored procedures and exception handling.
- AWS: General knowledge of the AWS stack (EC2, S3, EBS, …)
- IT Process Compliance: SDLC experience and formalized change controls
- Working in DevOps teams, based on Agile principles (e.g. Scrum)
- ITIL knowledge (especially incident, problem and change management)
Languages: Fluent English skills
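To illustrate the Spark property settings referred to under the Spark skill, a job might configure executor resources as sketched below; all values are hypothetical, and the deploy mode (client vs. cluster) is normally chosen at submit time rather than in code.

```python
# Illustrative sketch of Spark property settings that influence performance.
# Values are hypothetical; in practice they are tuned per workload and often
# supplied via spark-submit (e.g. --deploy-mode client|cluster) instead of code.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("example-tuned-job")
    .config("spark.executor.instances", "4")        # executors launched on the cluster
    .config("spark.executor.cores", "4")            # cores per executor
    .config("spark.executor.memory", "8g")          # heap per executor
    .config("spark.sql.shuffle.partitions", "200")  # shuffle width for joins/aggregations
    .getOrCreate()
)

print(spark.sparkContext.getConf().get("spark.executor.memory"))
```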
Specific information related to the position:
- Physical presence in primary work location (Bangalore)
- Flexibility to work in CEST and US EST time zones (according to the team rotation plan)
- Willingness to travel to Germany, US and potentially other locations (as per project demand)
At YASH, you are empowered to create a career that will take you to where you want to go while working in an inclusive team environment. We leverage career-oriented skilling models and optimize our collective intelligence aided with technology for continuous learning, unlearning, and relearning at a rapid pace and scale.
Our Hyperlearning workplace is grounded upon four principles:
- Flexible work arrangements, free spirit, and emotional positivity
- Agile self-determination, trust, transparency, and open collaboration
- All support needed for the realization of business goals
- Stable employment with a great atmosphere and ethical corporate culture
Perks/benefits: Career development, flex hours, transparency