Lead Research Scientist (Finetuning)

San Francisco

Full Time Senior-level / Expert USD 184K - 240K

Twelve Labs

Recognized by leading researchers as the most performant AI for video understanding, surpassing benchmarks from cloud majors and open-source models.

Posted 2 days ago

Who We Are

At TwelveLabs, we are pioneering the development of frontier multimodal foundation models that can see, hear and understand the world as humans do. Our models have redefined the standards in video-language modeling, allowing developers to build programs with state-of-the-art semantic search, summarization and analysis capabilities.

TwelveLabs has raised $107 million in Seed + Series A funding from world-class VC & corporate partners: NVIDIA, NEA, Radical Ventures, Index Ventures, Snowflake and Databricks. Our advisory team features AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.

About the Science Team

The Science team is at the forefront of multimodal AI research, tackling the most critical technical challenges in video understanding. Our core research areas include video embedding and search, multimodal language models capable of reasoning over video content, and intelligent agents that can interact with and analyze video data.

We go beyond academic research: our goal is to ensure that research outcomes are directly integrated into products and platforms, delivering real value to users. We work closely with the Engineering and Product teams and foster a collaborative culture centered around open communication and dynamic idea exchange.

About the Role

As a Lead Research Scientist (Finetuning), you will play a key role in driving TwelveLabs' core technology research and helping define its direction. You’ll conduct pioneering work in video understanding, multimodal learning, and AI agents: identifying critical research problems, designing innovative solutions, and running effective experiments. This also involves developing data strategies and defining evaluation methodologies.

You will lead finetuning efforts for video embedding and video language models, working with the MLE and Solutions Engineering teams to bring them into production. You’ll also collaborate closely with team leads, fellow scientists and researchers, clearly communicating your findings and contributing to TwelveLabs’ broader research roadmap and culture.

What Makes This Role Unique at TwelveLabs

TwelveLabs takes a focus-and-collaborate approach to tackling complex video AI challenges. Rather than solving isolated problems, we work together as a unified team toward the broader goal of understanding video.

Our research philosophy strikes a balance between rigorous scientific experimentation and real-world application. We aim to build multimodal systems that are not only powerful, but also trustworthy and interpretable. Open communication and mutual learning are central to our culture, enabling us to quickly evolve ideas and pursue the most impactful research directions. These align closely with our core values: integrity, growth mindset, and tenacity.

You Might Be a Great Fit If You Have

We’re looking for candidates with strong research experience in areas like video (multimodal) understanding, large language models, domain adaptation, representation learning, or action recognition, especially where those align with our mission. Your experience should be supported by past projects, your contributions, and related publications.
You should be capable of independently leading research projects from ideation to execution. Strong proficiency in Python and PyTorch is essential.
Excellent written and verbal communication in Korean as well as English is important for working effectively with colleagues from diverse backgrounds.
You hold a PhD or Master’s degree and have significant research experience in computer science, AI, or related fields. Additional experience developing and deploying large-scale ML models in production, or optimizing large model training, is a major plus.

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply! If you are a 0-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at TwelveLabs.

We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.