NLP Developer (Gen AI Evaluation Tools)

United States

Sigma

Create smarter AI with better training data. Sigma.AI provides the highest-quality data annotation and data collection at scale, custom-fit to your machine learning needs.

🌟 Join Sigma.AI – Shaping the Future of Artificial Intelligence 🌍

🔹 What is Sigma?
Sigma is a leading global technology company specializing in data collection and annotation for Artificial Intelligence. With over 30 years of experience, offices in Spain, the US, and the UK, and operations in more than 200 languages, we support top multinational clients in developing cutting-edge AI solutions.

About the Job

We’re looking for a pragmatic, Python-focused engineer to join our R&D team supporting the evaluation of Generative AI systems. This role is responsible for the internal tools that power our annotation workflows, evaluation pipelines, and dashboards, and for streamlining key processes across the team. You’ll help develop scalable, well-documented applications used in internal R&D as well as client-facing projects, and contribute to papers and articles as needed.

You’ll work closely with linguists and project leads to design tools that are efficient, user-friendly, and robust enough to support both exploratory and production use cases. You should be comfortable rapidly prototyping internal demos, annotation pilots, or experimental evals, and just as capable of evolving those into maintainable, production-grade tools.

Required Qualifications

  • 3+ years of experience programming in Python
  • Experience building with LLM APIs and frameworks (e.g., OpenAI, Anthropic, Google, LangChain)
  • Ability to transition from quick prototypes to robust, maintainable production code
  • Experience with web frameworks (e.g., Flask or FastAPI) and basic frontend development (HTML, JS, Bootstrap)
  • Strong familiarity with Linux and Bash scripting
  • Experience managing and querying SQL databases (esp. SQLite)
  • Experience with containerized development and Linux-based toolchains
  • Comfortable handling structured data pipelines (e.g., JSONL, CSV, file systems)
  • Familiarity with version control, reproducibility, and lightweight CI workflows
  • Strong communication and collaboration skills across technical and non-technical teams
  • Fluency in English

Preferred Qualifications

  • Experience designing and implementing Agentic AI applications
  • Familiarity with annotation tools (e.g., Label Studio, Doccano) and evaluation workflows
  • Exposure to Hugging Face libraries, prompt templating, or model evaluation frameworks
  • Basic understanding of NLP task structures and GenAI evaluation goals
  • Experience building dashboards and visualizations (e.g., using Plotly, DataTables, or D3)

Salary: USD 80,000-90,000

Region: North America
Country: United States