Computational Linguist (Gen AI Evaluation)

United States

Sigma

Create smarter AI with better training data. Sigma.AI provides the highest-quality data annotation and data collection at scale, custom-fit to your machine learning needs.

🌟 Join Sigma.AI – Shaping the Future of Artificial Intelligence 🌍

🔹 What is Sigma?
Sigma is a leading global technology company specializing in data collection and annotation for Artificial Intelligence. With over 30 years of experience, offices in Spain, the US, and the UK, and operations in more than 200 languages, we support top multinational clients in developing cutting-edge AI solutions.

About the Job

We’re looking for a versatile Computational Linguist to join our R&D team focused on evaluating and supporting Generative AI systems. This role combines linguistic expertise, data analysis, and hands-on experimentation with large language models. You’ll help design annotation workflows, create and refine guidelines and internal documentation, prototype task-specific evaluation metrics, configure annotation tools, and analyze annotator and model performance using real-world data. You’ll also contribute to papers and articles as needed.

This is a hybrid linguistics + data science role, ideal for someone who can move between qualitative language analysis and quantitative evaluation. You'll work cross-functionally with researchers and annotators to design creative, rigorous, and scalable evaluation processes for LLM-driven workflows.

Required Qualifications

  • Master’s degree (or equivalent experience) in Computational Linguistics, NLP, Linguistics, or a related field
  • 2+ years of experience in NLP or AI projects (industry or research)
  • Experience using and fine-tuning transformer-based language models (e.g., BERT, GPT)
  • Proficiency in Python programming
  • Comfortable with Linux environments and Bash scripting
  • Experience working with public datasets (e.g., from Hugging Face or Kaggle)
  • Familiarity with LLM behavior, prompt-based evaluation, and generative model outputs
  • Comfortable with structured data formats (JSONL, CSV), Jupyter notebooks, and pandas-based analysis
  • Fluent in English

Preferred Qualifications

  • Strong understanding of current trends and techniques in generative AI
  • Experience with annotation tools (e.g., Label Studio, Prodigy) and quality metrics for human data
  • Experience creating and curating bespoke datasets
  • Familiarity with evaluation challenges in creative or subjective NLP tasks
  • Proficient with Python NLP and data science libraries: pandas, numpy, scikit-learn, NLTK
  • Experience with generative AI SDKs and frameworks (e.g., OpenAI, Google, Anthropic, LangChain)
  • Understanding of linguistic typology, multilingual NLP, or sociolinguistic variation
  • Experience working in WSL environments
  • Experience collaborating with annotation teams and QA processes

Salary: USD 80,000-90,000*

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Category: Deep Learning Jobs

Tags: Anthropic BERT CSV Data analysis Generative AI GPT HuggingFace Jupyter LangChain Linguistics Linux LLMs NLP NLTK NumPy OpenAI Pandas Python R R&D Research Scikit-learn

Region: North America
Country: United States
