ML Research Resident (Q1 2025)

Oakland, or remote within US timezones

Applications have closed

Elicit

Use AI to search, summarize, extract data from, and chat with over 125 million papers. Used by over 2 million researchers in academia and industry.

Posted 6 months ago

Elicit is building a research agent that can use an unlimited amount of test-time compute while keeping its reasoning transparent and verifiable.

The residency

Transformers do a fixed amount of computation per token, and the quality of their work degrades rapidly when they are applied iteratively. As a research resident, you'll work with us for 3 months on developing computational procedures (operators) that can reliably improve a knowledge state over thousands of iterations.

What is a knowledge state? A knowledge state consists of structured information. For example, a scientific paper might be represented as a set of claims supported by evidence and connected through logical reasoning; this might be combined with scratchpads, evergreen “notes to self”, search trees, and other information.
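To make the idea concrete, here is a minimal sketch of how such a knowledge state could be represented. It is an illustration only, not Elicit's actual representation: the class names (`Evidence`, `Claim`, `KnowledgeState`), the field names, and the example claims are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str  # hypothetical identifier for a paper or passage
    quote: str   # the raw text the claim rests on


@dataclass
class Claim:
    text: str
    evidence: list[Evidence] = field(default_factory=list)
    # Indices of other claims this one is inferred from (the logical links).
    depends_on: list[int] = field(default_factory=list)


@dataclass
class KnowledgeState:
    question: str
    claims: list[Claim] = field(default_factory=list)
    notes_to_self: list[str] = field(default_factory=list)
    scratchpad: str = ""

    def unsupported_claims(self) -> list[Claim]:
        """Claims backed by neither raw evidence nor other claims."""
        return [c for c in self.claims if not c.evidence and not c.depends_on]


# Made-up example content, for illustration only.
state = KnowledgeState(question="Does X improve Y?")
state.claims.append(
    Claim("X improved Y in trial T", evidence=[Evidence("paper-1", "Y rose under X")])
)
state.claims.append(Claim("X improves Y in general", depends_on=[0]))
state.claims.append(Claim("X is safe"))

gaps = [c.text for c in state.unsupported_claims()]
```

Keeping evidence and inferred links in separate fields is one way to keep inferences apart from raw evidence, and a check like `unsupported_claims` is a simple stand-in for identifying gaps in reasoning.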

What counts as improvement? Like scientists, we want LLMs to make genuine progress in understanding: separating inferences from raw evidence, finding connections between ideas, building clearer explanations, and identifying gaps in reasoning. But unlike typical ML systems, which are often trained to do “whatever works”, we need improvements that are epistemically sound: each step should make the knowledge state more useful while remaining human-readable. An improvement might reorganize information to better answer a question, find an implicit assumption in an argument, or connect evidence across multiple sources.

As a research resident, your work will focus on designing and testing improvement operators that maintain stability over 1000+ iterations while making genuine progress. You'll start with simple cases (e.g., shallow refactoring of scientific papers) and demonstrate reliable iteration before scaling to more complex reasoning tasks.
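As a rough illustration of what stable iteration could mean, the sketch below applies an operator 1000 times and keeps a step only if it strictly improves a score. Everything here is an assumption made for the example: the accept-only-if-better rule, the function names (`iterate`, `drop_one_duplicate`, `distinct_ratio`), and the toy task of removing duplicate claims, which stands in for an LLM-based operator.

```python
from typing import Callable, Tuple

State = Tuple[str, ...]  # toy knowledge state: a tuple of claim strings


def iterate(
    state: State,
    operator: Callable[[State], State],
    score: Callable[[State], float],
    steps: int = 1000,
) -> Tuple[State, int]:
    """Apply `operator` repeatedly, keeping only strictly improving steps."""
    best = score(state)
    accepted = 0
    for _ in range(steps):
        candidate = operator(state)
        candidate_score = score(candidate)
        if candidate_score > best:  # reject anything that does not improve
            state, best = candidate, candidate_score
            accepted += 1
    return state, accepted


def drop_one_duplicate(claims: State) -> State:
    """Toy operator: remove the first claim that repeats an earlier one."""
    seen = set()
    for i, claim in enumerate(claims):
        key = claim.strip().lower()
        if key in seen:
            return claims[:i] + claims[i + 1:]
        seen.add(key)
    return claims  # nothing to remove: a fixed point


def distinct_ratio(claims: State) -> float:
    """Toy score: fraction of claims that are distinct after normalization."""
    keys = [c.strip().lower() for c in claims]
    return len(set(keys)) / len(keys) if keys else 1.0


start = ("A causes B", "a causes b", "B causes C", "A causes B ")
final, accepted = iterate(start, drop_one_duplicate, distinct_ratio, steps=1000)
```

In this toy case the state reaches a fixed point after two accepted steps and the remaining iterations leave it unchanged, which is the kind of stability the paragraph above asks for. A real operator would need a much richer notion of improvement than a single scalar score.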

Developing systems that perform legible reasoning over long horizons addresses core challenges in AI transparency and scalable reasoning.

About you

Strong candidates will have experience with LLMs, good intuitions about what makes reasoning systematic and verifiable, and care about AI transparency.

The best applicants will additionally have a strong software engineering background and concrete examples of how they've applied this background to come up with novel abstractions that push the frontiers of automated reasoning.

Logistics