Data Scientist - Annotation Contract (3+ years exp, London UK)

London

Cohere

Cohere provides industry-leading large language models (LLMs) and RAG capabilities tailored to meet the needs of enterprise use cases that solve real-world problems.

Who are we?
Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.
We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.
Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.
Join us on our mission and shape the future!
Why this role?
At Cohere, we’re obsessed with language and technology; we believe we need great writers and developers and always will. We also believe that remarkable talent, enthusiasm, and creative thinking add up to great work. We’re looking for someone with superb Python and data science skills to join our team and help shape the future of language technology. The most successful candidate will be a quick learner who is excited to train our model by working on a wide variety of writing and code-based prompts.
We are on a mission to build machines that understand the world and make them safely accessible to all. Data quality is foundational to this process. Machines (or Large Language Models to be exact) learn in similar ways to humans - by way of feedback. 
Our AI Data Trainers ensure that all samples fed to our AI model are well-written, technically sound and useful to the end user. By creating content that data scientists would find useful, writing working Python, or working on use cases, you will be an essential component of improving our Large Language Model’s performance for iterations to come, thus having a lasting impact on Cohere’s tech.
Please Note: This role may require occasional work on-site at our Soho office in London, UK. We are looking for candidates who are able to commit a minimum of 18-24 hours a week to this project.

As an AI Data Trainer, you will:

  • Spend the majority of your time writing or reading/proofreading code and natural language to create perfect samples to train our models.
  • Label, proofread, and improve machine-written and human-written code.
  • Raise the bar continually by writing new code that is of exceptional quality to solve a variety of tasks, with a particular focus on data analytics.
  • Adeptly vary the style and functionality of code examples.
  • Follow our style guide, and make recommendations on unique situations that fall outside of its scope.
  • Work with intense attention to detail while citing sources of information.

You might be a good fit if you are:

  • Someone with 3+ years of industry experience working on real-world data science problems and pipelines, who excels in data analysis and visualization.
  • A meticulous coder with an eye for readability, with experience in Python and industry-standard data science packages (NumPy, pandas, Matplotlib, SQLite, or others).
  • Able to write SQL and work with SQL-based workflows.
  • Familiar with file and data formats such as Markdown, JSON, XML, YAML, and HTML.
  • A thoughtful and thorough code reviewer. You've spent time re-writing, proofreading, and giving feedback on others' code in a previous role. You've worked with a code style guide before and enjoyed it.

Other things you’ll need:

  • Located in the UK.
  • A fast, thorough reader with great comprehension skills.
  • A curiosity about ML, AI, or LLMs (bonus points if you have any experience in these).
  • Expert web-based research skills that you've applied to your coding work before.
  • Ability to follow complex instructions, navigate ambiguity and work independently in a remote or hybrid environment.

Interview Process:
- 20-30 minute video interview
- Technical assessment

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.
Our Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work with cutting-edge AI technology
🪴 A vibrant & central location
🥨 A great selection of office snacks
🏆 Performance-based incentives

Category: Data Science Jobs

Tags: CoHere Data analysis Data Analytics Data quality Excel JSON LLMs Machine Learning Matplotlib NumPy Pandas Pipelines Python RAG Research SQL XML

Regions: Remote/Anywhere Europe
Country: United Kingdom
