Databricks Engineer

Bengaluru, KA, India

Full Time | Senior-level / Expert | USD 28K - 67K * (est.)

Version 1

We don’t just deliver transformation, we empower you to define the future. Our global footprint and mindset accelerate your business at a pace that matches your ambition. This lets you focus on what matters most – discovering new opportunities...

Posted 1 month ago

Company Description

Version 1 has celebrated over 26 years in Technology Services and continues to be trusted by global brands to deliver solutions that drive customer success. Version 1 has several strategic technology partners including Microsoft, AWS, Oracle, Red Hat, OutSystems and Snowflake. We’re also an award-winning employer, reflecting how employees are at the heart of Version 1.

We’ve been awarded: Innovation Partner of the Year Winner 2023 Oracle EMEA Partner Awards, Global Microsoft Modernising Applications Partner of the Year Award 2023, AWS Collaboration Partner of the Year - EMEA 2023 and Best Workplaces for Women by Great Place To Work in the UK and Ireland 2023.

As a consultancy and service provider, Version 1 is a digital-first environment and we do things differently. We’re focused on our core values; using these we’ve seen significant growth across our practices and our Digital, Data and Cloud team is preparing for the next phase of expansion. This creates new opportunities for driven and skilled individuals to join one of the fastest-growing consultancies globally.

Job Description

This is an exciting opportunity for an experienced developer of large-scale data solutions. You will join a team delivering a transformative cloud-hosted data platform for a key Version 1 customer.

The ideal candidate will have a proven track record as a senior, self-starting data engineer implementing data ingestion and transformation pipelines for large-scale organisations. We are seeking someone with deep technical skills in a variety of technologies, specifically Spark performance tuning and optimisation and Databricks, to play an important role in developing and delivering early proofs of concept and production implementation.

You will ideally have experience in building solutions using a variety of open-source tools and Microsoft Azure services, and a proven track record in delivering high-quality work to tight deadlines.

Your main responsibilities will be:

Designing and implementing highly performant, metadata-driven data ingestion and transformation pipelines from multiple sources using Databricks and Spark
Building streaming and batch processes in Databricks
Spark performance tuning and optimisation
Providing technical guidance on complex geospatial problems and Spark DataFrames
Developing scalable and reusable frameworks for ingestion and transformation of large data sets
Designing and implementing data quality systems and processes
Integrating the end-to-end data pipeline to take data from source systems to target data repositories, ensuring the quality and consistency of data is maintained at all times
Working with other members of the project team to support delivery of additional project components (reporting tools, API interfaces, search)
Evaluating the performance and applicability of multiple tools against customer requirements
Working within an Agile delivery / DevOps methodology to deliver proof of concept and production implementation in iterative sprints

Qualifications

Direct experience of building data pipelines using Azure Data Factory and Databricks
6 to 8 years of experience required
Experience building data integrations with Python
Databricks Engineer certification
Microsoft Azure Data Engineer certification
Hands-on experience designing and delivering solutions using the Azure Data Analytics platform
Experience building data warehouse solutions using ETL / ELT tools such as Informatica and Talend
Comprehensive understanding of data management best practices including demonstrated experience with data profiling, sourcing, and cleansing routines utilizing typical data quality functions involving standardization, transformation, rationalization, linking and matching

Nice to have

Experience working in a DevOps environment with tools such as Microsoft Visual Studio Team Services, Chef, Puppet or Terraform
Experience working with structured and unstructured data, including imaging and geospatial data
Experience with open-source non-relational / NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j)
Experience with Azure Event Hubs, IoT Hub, Apache Kafka and NiFi for streaming / event-based data

Additional Information

At Version 1, we believe in providing our employees with a comprehensive benefits package that prioritises their well-being, professional growth, and financial stability.

One of our standout advantages is the ability to work with a hybrid schedule along with business travel, allowing our employees to strike a balance between work and life. We also offer a range of tech-related benefits, including an innovative Tech Scheme to help keep our team members up-to-date with the latest technology.

We prioritise the health and safety of our employees, providing private medical and life insurance coverage, as well as free eye tests and contributions towards glasses. Our team members can also stay ahead of the curve with incentivized certifications and accreditations, including AWS, Microsoft, Oracle, and Red Hat.

Our employee-designed Profit Share scheme divides a portion of our company's profits each quarter amongst employees. We are dedicated to helping our employees reach their full potential, offering Pathways Career Development Quarterly, a programme designed to support professional growth.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index

Category: Engineering Jobs

Tags: Agile APIs AWS Azure Cassandra Data Analytics Databricks Data management Data quality Data warehouse DevOps ELT ETL Informatica Kafka MongoDB Neo4j NiFi NoSQL Open Source Oracle Pipelines Puppet Python Snowflake Spark Streaming Talend Terraform Unstructured data