Data Engineer (Junior)
Sandton - 1 Discovery Place, Gauteng, ZA
Discovery
Discovery offers award-winning products - Medical Aid Administration, Car and Life Insurance, Bank Accounts and Investments, all with Vitality rewards.
Achieve more than YOU BELIEVE
Discovery Corporate & Employee Benefits Data Engineer (Junior)
About Discovery
Discovery’s core purpose is to make people healthier and to enhance and protect their lives. We seek out and invest in exceptional individuals who understand and support our core purpose, and whose own values align with those of Discovery. Our fast-paced and dynamic environment enables smart, self-driven people to be their best. As global thought leaders, Discovery is passionate about innovating in order to not only achieve financial success, but to ignite positive and meaningful change within our society.
About Discovery Corporate & Employee Benefits
Discovery Corporate and Employee Benefits is the first and only employee benefits provider to shape employee behaviour, creating healthier and wealthier workforces. It is an exciting business to be in as we reimagine the way retirement savings and life insurance are brought to companies and employees.
Key Purpose
The role entails building a reusable, sustainable framework to ensure the collection, processing, and availability of high-quality health care data to enable us to achieve the core purpose. The Data Engineer will work collaboratively with the Program Managers, Data Scientists, and Systems Architects to define data sources and to build a custom data framework, based on the principles of ETL/ELT, that facilitates Machine Learning, AI, and the productionising of AI models. Together these teams will enable data-driven, actionable insights.
Areas of responsibility may include, but are not limited to:
- Work within a highly specialized and growing team to enable delivery of data and advanced analytics system capability.
- Develop and implement a reusable architecture of data pipelines to make data available for various purposes including Machine Learning (ML), Analytics and Reporting
- Work collaboratively as part of a team, engaging with system architects, data scientists and the business in a healthcare context
- Define hardware, tools, and software to enable the reusable framework for data sharing and ML model productionization
- Work comfortably with structured and unstructured data in a variety of programming languages such as SQL, R, Python, Java, etc.
- Understand distributed programming and advise data scientists on how to structure program code for maximum efficiency
- Build data solutions that leverage controls to ensure privacy, security, compliance, and data quality
- Understand meta-data management systems and orchestration architecture in the designing of ML/AI pipelines
- Deep understanding of cutting-edge cloud technology and frameworks to enable Data Science
- System integration skills between Business Intelligence and source transactional systems
- Improving overall production landscape as required
- Define strategies with Data Scientists to monitor models post-production
- Write unit tests and participate in code reviews
Personal Attributes and Skills
- Exceptional analytical, conceptual thinking and problem-solving skills
- Excellent oral and written communication skills.
- Ability to understand entity-relationship diagrams, normalized and de-normalized structures
- Excellent planning, organizational, and time management skills
- Scope and size BI initiatives.
- Work breakdown management
- Coach and co-ordinate team members
- Data manipulation, storytelling, and visualization
- Experience with Excel PivotTables
- Honours or Master’s degree in Computer Science specialising in Data Science or Data Engineering, an IT degree, or a related qualification.
- Experience in data pipelines, data modelling and machine learning (advantageous).
- 3 to 5 years’ working experience.
- A working knowledge of the Operations environment throughout Employee Benefits – Compass or Sonata (advantageous).
- Expert in query and programming languages and tools such as MS SQL, Oracle PL/SQL, DAX, MQL (Mongo Query Language), SSIS, and R, Python, Scala, .NET or Java.
- Essential skills: MS SQL, Oracle PL/SQL, DAX, MQL.
- Expert database knowledge in SQL and experience with MS Azure tools such as Data Factory, Synapse Analytics, Data Lake, Databricks, Azure Stream Analytics and Power BI.
- Modern Azure data warehouse skills.
- Expert Unix/Linux admin experience, including shell script development (not essential).
- Exposure to AI or Data modelling development.
- Experience working on OLTP systems.
- Experience working on large and complex datasets.
- Understanding and application of Big Data and distributed computing principles (Azure Data Lake, Azure Databricks, Azure Synapse Analytics and Azure HDInsight).
- Data and ML model optimization skills in a production environment.
- Production environment machine learning and AI (advantageous).
- DevOps / DataOps and CI/CD experience (essential).
- AWS experience (advantageous).
Employment Equity
The Company’s approved Employment Equity Plan and Targets will be considered as part of the recruitment process. As an Equal Opportunities employer, we actively encourage and welcome people with various disabilities to apply.
Tags: Architecture AWS Azure Big Data Business Intelligence CI/CD Computer Science Databricks Data management DataOps Data pipelines Data quality Data warehouse DevOps ELT Engineering ETL Excel Java Linux Machine Learning MS SQL Oracle Pipelines Power BI Privacy Python R Scala Security SQL SSIS Unstructured data
Perks/benefits: Career development Equity / stock options Health care