Computational Linguist (Gen AI Evaluation)

United States

Full Time Mid-level / Intermediate USD 52K - 123K * ^est.

Sigma

Create smarter AI with better training data. Sigma.AI provides the highest-quality data annotation and data collection at scale, custom-fit to your machine learning needs.

Posted 9 hours ago

🌟 Join Sigma.AI – Shaping the Future of Artificial Intelligence 🌍

🔹 What is Sigma?
Sigma is a leading global technology company specializing in data collection and annotation for Artificial Intelligence. With over 30 years of experience, offices in Spain, the US, and the UK, and operations in more than 200 languages, we support top multinational clients in developing cutting-edge AI solutions.

About the Job

We’re looking for a versatile Computational Linguist to join our R&D team focused on evaluating and supporting Generative AI systems. This role combines linguistic expertise, data analysis, and hands-on experimentation with large language models. You’ll help design annotation workflows, create and refine guidelines and internal documentation, prototype task-specific evaluation metrics, configure annotation tools, and analyze annotator and model performance using real-world data, contributing to papers and articles as needed.

This is a hybrid linguistics + data science role, ideal for someone who can move between qualitative language analysis and quantitative evaluation. You'll work cross-functionally with researchers and annotators to design creative, rigorous, and scalable evaluation processes for LLM-driven workflows.

Required Qualifications

Master’s degree (or equivalent experience) in Computational Linguistics, NLP, Linguistics, or a related field
2+ years of experience in NLP or AI projects (industry or research)
Experience using and fine-tuning transformer-based language models (e.g., BERT, GPT)
Proficiency in Python programming
Comfortable with Linux environments and Bash scripting
Experience working with public datasets (e.g., from Hugging Face or Kaggle)
Familiarity with LLM behavior, prompt-based evaluation, and generative model outputs
Comfortable with structured data formats (JSONL, CSV), Jupyter notebooks, and pandas-based analysis
Fluent in English

Preferred Qualifications

Strong understanding of current trends and techniques in generative AI
Experience with annotation tools (e.g., Label Studio, Prodigy) and quality metrics for human data
Experience creating and curating bespoke datasets
Familiarity with evaluation challenges in creative or subjective NLP tasks
Proficient with Python NLP and data science libraries: pandas, numpy, scikit-learn, NLTK
Experience with generative AI SDKs and frameworks (e.g., OpenAI, Google, Anthropic, LangChain)
Understanding of linguistic typology, multilingual NLP, or sociolinguistic variation
Experience working in WSL environments
Experience collaborating with annotation teams and QA processes

Salary: USD 80K - 90K

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Category: Deep Learning Jobs

Tags: Anthropic BERT CSV Data analysis Generative AI GPT HuggingFace Jupyter LangChain Linguistics Linux LLMs NLP NLTK NumPy OpenAI Pandas Python R R&D Research Scikit-learn