Head of AI

New York / San Francisco

Applications have closed

Patronus AI

Deliver AI products safely and confidently. Based on industry-leading AI research, evaluation models, and tools.

Posted 2 months ago

About Patronus AI, Inc.

Patronus AI’s mission is to develop intelligent systems that supervise the next generation of AI applications. We are solving the problem of scalable oversight: how can humans continue to supervise AI systems when AI far outperforms them in many real-world scenarios? Our vision is a world in which AI evaluates AI.

Our team comes from top applied ML and research backgrounds, including Facebook AI Research (FAIR), Airbnb, Meta Reality Labs, and quant finance. As a team, we have published research papers at top ML conferences (NeurIPS, EMNLP, ACL). We have trained and released models (Lynx, Glider) and novel benchmarks (FinanceBench, SimpleSafetyTests) that are used by the world’s leading AI companies. Our product is used by startups and Fortune 100 enterprises in education, finance, healthcare and more.

We are backed by Lightspeed Venture Partners, Notable Capital, and high-profile operators such as Amjad Masad and Gokul Rajaram, as well as Fortune 500 executives and board members. We are advised by Douwe Kiela, Adjunct Professor at Stanford University and former Head of Research at HuggingFace.

Responsibilities

As Head of AI at Patronus AI, you will lead a team of researchers and engineers to develop AI that provides supervision and feedback to AI systems. You will work with the CTO to solve the most important and challenging open research problems facing society’s adoption of AI today, focused on evaluation, explainability and robustness.

In this role, you will:

Lead a team of Research and ML engineers to conduct research experiments and translate findings to applied AI features in the product. Manage timelines, provide technical guidance, and ensure high-velocity output across the team.
Solve challenging, open-ended problems in AI evaluation research. Develop SOTA systems that leverage memory, long-context reasoning, multimodal capabilities, and tool use to achieve semi-autonomous, scalable supervision of AI systems.
Work closely with the CTO to set and drive the research vision for the company.
Scope and drive research projects, including experiment design, timelines for research deliverables, and results analysis.
Synthesize literature on AI evaluation and LLM development. Implement algorithms based on state-of-the-art NLP advancements in the areas of evaluation, alignment, RAG and agentic systems.
Set the culture for rigorous, high-quality research. This includes developing processes for dataset collection, model training, benchmarking, and inference.
Experiment with the latest technologies and proactively suggest experiments and improvements to research and ML systems. Adapt to changes in the generative AI landscape, and incorporate new models and frameworks when applicable.
Guide the construction of novel benchmarks to probe the capabilities of SOTA AI systems, such as reasoning in agentic systems and real-world domains.
Collaborate closely with product and engineering across our globally distributed team.
Maintain relationships with the AI community, including foundation model developers, academic and industry researchers, AI startups, and policymakers. Contribute to the research community through open source and publications. Represent the company in thought leadership by presenting and publishing findings, speaking at industry events, and engaging in industry-wide discussions.
Recruit research and ML engineering talent and manage external collaborators (PhD fellows, industry collaborators).

Qualifications

Above all, we look for an eagerness to learn, a passion for research, creativity in problem solving, and a proactive mindset. You are a great fit if you have the following:

PhD in Computer Science, Mathematics, Statistics, Linguistics, or another quantitative field.
Publications at leading AI conferences, journals or workshops, such as NeurIPS, ICML, EMNLP, ACL, AAAI.
Experience conducting empirical NLP research in an academic or industry research lab.
Knowledge and understanding of state-of-the-art machine learning concepts, with a focus on NLP and search architectures. Familiarity with transformer-based architectures, attention mechanisms, evaluation metrics and benchmarks. Knowledge of search and RAG systems is a bonus.
Experience training language models in applied or research settings.
Experience working and communicating cross-functionally in a team environment.
Creativity in problem solving and strong communication skills.
Good character, integrity, and respect for others.

Benefits