Machine Learning Engineer
Cheltenham OR London, England, United Kingdom
Ripjar
AML Name Screening, Data Fusion, Adverse Media Screening, Threat Intelligence & more. Powerful solutions enhanced with A.I. machine learning.
At Ripjar, we help governments and organisations automate the detection, investigation, and monitoring of threats from criminal activity.
Ripjar originally spun out from GCHQ and now has 140 staff based across Cheltenham, Bristol, London and Canberra, as well as a smaller presence in the USA and Singapore. We have two successful, inter-related products: Labyrinth Screening and Labyrinth Intelligence. Labyrinth Screening allows companies to monitor their customers or suppliers for entities that they aren’t allowed to, or do not want to, do business with (for ethical or environmental reasons). Labyrinth Intelligence empowers organisations to perform deep investigations into varied datasets to find interesting patterns and relationships.
Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an always-growing archive of 10 billion news articles in (nearly!) every language in the world going back over 30 years, sanctions and watchlist data provided by governments, plus 250 million organisations and ownership data from global corporate registries.
This is a great time to join a growing group of highly talented technologists and data scientists who are building products that solve real-world issues and are changing the way criminal activities are detected and prevented.
Team Mission
The core analytics team, which sits within the engineering team, enables the delivery of high-quality data science products and software to a variety of environments through technical skills, process implementation and software management, anchored in a continuous innovation culture.
What you'll be doing
We're looking for an experienced, highly motivated Machine Learning Engineer to support the design, development, and ongoing maintenance of Ripjar's analytics and data products. You will design and implement machine learning solutions, develop and optimise their models, and ensure their integration into Ripjar's software products and data processing pipelines, focusing on enhancing system performance and scalability. You will be working with a variety of language models (including LLMs), machine learning tools and large-scale distributed clusters.
You will have a strong technical and theoretical background, and be proficient in one or more programming languages, including Python. You will have a good understanding of machine learning and large-scale data analysis, and will be adept at implementing and optimising algorithms to handle complex data at scale.
Some recent developments Ripjar’s data professionals have been involved with:
- AI Risk Profiles - Whitepaper – Entity Resolution across tens of millions of news articles
- Profile Summaries (generated using LLMs) – Using LLMs to summarise news articles linking an entity to financial crime (or other similar risks)
Key Tasks:
- Develop architectures and frameworks for machine learning systems that can handle large-scale data and complex computations.
- Develop and evaluate machine learning models to enhance Ripjar’s software and data products.
- Integrate ML models into new and existing components and consider the lifecycle and practical use of each model.
- Implement feature requests for Ripjar’s analytics components.
- Work with Ripjar's Data Engineers and engineering teams to support the scaling up and integration of new analytics and models into Ripjar's products and data processing pipelines.
- Produce statistical tests and summarise test outputs.
- Document system designs, models and test methodologies.
- Provide support to stakeholders in understanding the implementation of analytics, models and test results.
- Use Ripjar’s large-scale data processing and analysis infrastructure to analyse data sets, identify patterns and produce statistical outputs that support the development of new analytics and models.
Requirements
Key Skills
We value diversity of experience and thought and recognise successful candidates may not tick all the following boxes. If you think you have something to offer, then we'd love to chat to you and hear how you would contribute to this role.
- A good understanding of machine learning and experience training and deploying machine learning models within products at scale, including ongoing maintenance.
- Proficiency in a range of machine learning techniques, including Natural Language Processing techniques, for solving problems, ideally making use of Large Language Models.
- Proficiency in Python, particularly with machine learning and data science libraries such as PyTorch/TensorFlow, scikit-learn, NumPy and pandas.
- Good communication and interpersonal skills.
- Experience working with large-scale data processing systems such as Spark and Hadoop.
- Experience of software development in agile environments, and a readiness to advocate for the software development lifecycle.
- Experience using and implementing ML Operations approaches.
- Working knowledge of statistics and statistical models is valuable.
Benefits
Why we think you’ll enjoy it here:
- Base Salary of up to £70,000 per year DOE
- 25 days annual leave, rising to 30 days after 5 years of service
- Hybrid working option for employees
- Company Share Scheme
- Private Family Healthcare
- Employee Assistance Programme
- Company contributions to your pension
- Enhanced maternity/paternity pay
- The latest tech including a top of the range MacBook Pro
- Offices equipped with well-stocked pantries with food, snacks and drinks when in the office
Ripjar's Commitment to Diversity
“Diversity is essential in the way we operate. Having people from different backgrounds, genders and experiences ensures that we make decisions with a truly global perspective. Diversity gives us strength in our technology, analysis and relationships.” - Maria Cox, Head of People Operations