Data Engineer
Thame
We are a leading provider of technology in the property sector. Founded in 2003, our product focus has been our two-sided conveyancing marketplace, which connects consumers with a range of quality conveyancers at competitive prices via our easy-to-use tech platform. We are now building out our ecosystem so consumers can benefit from our services via their estate agent or their mortgage broker, through smarter conveyancing platforms that make the home buying or selling process easier, quicker, safer and more transparent.
Why join Smoove? Great question! We pride ourselves on attracting, developing and retaining a diverse range of people in an equally diverse range of roles and specialisms, who together achieve outstanding results. Our transparent approach and open-door policy make Smoove a great place to work, and as our business expands we are looking for ambitious, talented people to join us.
We are looking for a Data Engineer who can maintain and optimise our current data solution and advance it to the next level. We have created an initial gem of a Data Lake and Lakehouse (Azure Data Lake, ADF, Databricks, Airflow, dbt) to enable Business Intelligence and Data Analytics (Superset, RStudio Connect). Our Data Lake is fully metadata-driven, cost-efficient, documented and reproducible.
We need our single source of truth to be reliable and of high quality for both internal and external reporting and insight generation. But that is just the start: we will need best-practice solutions for sharing data and insight back to our partners in a scalable and maintainable way. We have big data plans and new products on the way, and we need you to think innovatively and to design and implement fast, scalable and resilient customer-facing solutions that let us give our customers the best experience through insight and interaction.
Our source data lives in different systems (including SQL Server, Google Analytics and Salesforce). If you have experience setting up data pipelines, have in-depth knowledge of Azure/Databricks technologies, are comfortable taking the lead and taking responsibility, and want to put your stamp on our evolving data solution, then this is the job for you. We work in a fast-paced environment, and it is a priority to give Analytics and Data Science access to quality data and the right tools and technologies. But it does not stop there: we aim to be a leader in the PropTech data space, and we need you to drive innovation in your solutions as we grow.
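To give a flavour of what "metadata-driven" centralisation can look like in practice, here is a minimal PySpark sketch; the source names, formats and paths are hypothetical illustrations, not Smoove's actual configuration:

```python
# Hypothetical sketch only: source names, formats and paths are invented for
# illustration. In a metadata-driven lake this list would come from a control
# table or config store rather than being hard-coded.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("metadata-driven-ingest").getOrCreate()

SOURCES = [
    {"name": "ga_sessions",    "format": "json",    "path": "/landing/ga/sessions/"},
    {"name": "sf_opportunity", "format": "parquet", "path": "/landing/salesforce/opportunity/"},
]

for src in SOURCES:
    df = spark.read.format(src["format"]).load(src["path"])
    # Land everything in a bronze layer as Delta, keyed by the metadata name,
    # so adding a new source means adding a row of metadata, not new code.
    df.write.format("delta").mode("overwrite").save(f"/lake/bronze/{src['name']}")
```

The point of the pattern is that the ingestion loop never changes; only the metadata grows as new systems are connected.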
Main responsibilities:
- Take ownership of centralising our different data sources into our Lakehouse (Databricks on Azure Data Lake) and of its architecture.
- Be responsible for the reliability and quality of data in the Data Lake, including anomaly detection, data quality checks, reconciliations, access, permission and retention management, PII treatment, and backup/restoration plans (see the data quality sketch after this list).
- Set up and manage platform technologies to support Analytics/Data Science enablement.
- Be innovative and lead the data solution to meet our requirements now and in the future.
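As an illustration of the kind of data quality checks this role would own, here is a minimal PySpark sketch of a row-count reconciliation and a completeness check; the table paths, column name and thresholds are hypothetical:

```python
# Hypothetical sketch only: table paths, the close_date column and the
# thresholds are invented to illustrate reconciliation and completeness checks.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

bronze = spark.read.format("delta").load("/lake/bronze/sf_opportunity")
silver = spark.read.format("delta").load("/lake/silver/opportunity")

# Reconciliation: the silver layer should not silently drop bronze rows.
bronze_count, silver_count = bronze.count(), silver.count()
assert silver_count >= 0.99 * bronze_count, (
    f"Reconciliation failed: {silver_count} silver vs {bronze_count} bronze rows")

# Completeness: a key business column should rarely be null.
null_ratio = silver.filter(F.col("close_date").isNull()).count() / max(silver_count, 1)
assert null_ratio < 0.01, f"close_date null ratio too high: {null_ratio:.2%}"
```

In production these checks would typically run as scheduled pipeline steps that alert on failure rather than bare assertions.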
Skills / Experience required:
- (Required) 3+ years of relevant Data Engineering experience (preferably Databricks/Azure, or Snowflake/Redshift/BigQuery)
- (Required) Experience with infrastructure as code (e.g. Terraform)
- (Required) Proficiency in Python, both for scheduling (e.g. Airflow) and for manipulating data (PySpark); see the orchestration sketch after this list
- (Required) Experience building deployment pipelines (e.g. Azure Pipelines)
- (Required) Deployment of web apps using Kubernetes (Preferably ArgoCD & Helm)
- (Preferred) Experience working on Analytics and Data Science enablement (dbt, DS deployments)
- (Preferred) Experience in MDM, Data Cataloguing and Lineage optimisation
- A strong preference for simplicity and transparency over complexity
- A focus on best practice and detail, to enable scalability in the future
- Strong written/verbal communication skills and data presentation skills
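As a sketch of the Python-for-scheduling requirement above, here is a minimal Airflow 2.x DAG wiring ingestion ahead of dbt transformations; the DAG id, schedule and commands are hypothetical:

```python
# Hypothetical sketch only (Airflow 2.x): DAG id, schedule and commands are
# invented to show ingestion being orchestrated ahead of dbt transformations.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="lakehouse_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 5 * * *",  # every day at 05:00
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="ingest_sources",
        bash_command="python /jobs/ingest.py",  # hypothetical ingestion entry point
    )
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --profiles-dir /dbt",  # hypothetical dbt project
    )
    ingest >> dbt_run  # transformations run only after ingestion succeeds
```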
When we process your applicant personal data for recruitment purposes, we do so as a controller. If, as part of the recruitment process, we share your personal data with another company within the PEXA Group, that company may process your personal data as either an independent controller or, in certain circumstances, a joint controller. By applying for this role, you consent to us processing your personal data in accordance with the UK General Data Protection Regulation ("UK GDPR") and the Data Protection Act 2018. Further information can be found in our privacy notice: https://pexa.co.uk/applicant-policy/.