Technical Staff, Data
London
Role Overview:
Data quality and diversity are the foundation for training the best agents in any domain. As a member of the Data Team at Reflection, you will play a pivotal role in shaping how we collect and analyze human, synthetic, and internet data. This is an interdisciplinary role that primarily requires engineering, research, and communication skills, along with sharp attention to detail and a willingness to “roll up your sleeves” and look at the data.
Key Responsibilities:
1. Experiment and Benchmark Design
Develop techniques for collecting, augmenting, filtering, or synthesizing training and evaluation data using creativity and analytical thinking
Design experiments, in collaboration with machine learning researchers, to assess the impact of different datasets on model performance
When required, manage human annotators working on data collection efforts – this could include tracking payments and hours, training annotators, and providing technical support, feedback, and quality control
2. Qualitative and Quantitative Data Analysis
Analyze collected data, e.g. coding tasks, both qualitatively and quantitatively
Evaluate model behavior to identify its strengths and weaknesses
Clearly communicate findings to machine learning research and product teams
3. Data Engineering
Design, implement, and optimize scalable data pipelines to support reinforcement learning and supervised finetuning
Leverage LLMs to perform data filtering, cleaning, and augmentation
Qualifications:
Software engineering background with experience building data processing pipelines at scale, particularly with LLM integration
Proficiency in Python or other programming languages (Go, TypeScript, etc.)
Detail-oriented and analytical, with the ability to conduct careful qualitative and quantitative data analysis
Excellent organizational and communication skills to collaborate closely with cross-functional teams and manage human data operations
Experience with machine learning, reinforcement learning, and LLMs is a plus, but not strictly required
What We Offer:
The opportunity to work at the forefront of AI research and data collection for training cutting-edge models.
Collaboration with a team of world-class researchers and engineers from top AI labs and companies.
Competitive compensation and benefits, with opportunities for professional growth.