Lead Researcher - Multimodal AI

Bangalore, IN

Full Time Senior-level / Expert USD 28K - 67K *

Dolby Laboratories

Dolby develops audio, imaging, and voice technologies for film, TV, music, and games. Experience everything with impressive sound and breathtaking picture quality.



Posted 1 month ago

Join the leader in entertainment innovation and help us design the future.

At Dolby, science meets art, and high tech means more than computer code. As a member of the Dolby team, you’ll see and hear the results of your work everywhere, from movie theaters to smartphones. We continue to revolutionize how people create, deliver, and enjoy entertainment worldwide. To do that, we need the absolute best talent.

This is an opportunity to play a key role in Dolby's new R&D Center in Bangalore as a Research Lead in our Advanced Technology Group (ATG), the research and technology arm of Dolby Labs. ATG has multiple competencies that innovate on technologies in audio, video, AR/VR, gaming, music, broadcast, and user-generated content. Areas of expertise related to computer science and electrical engineering are highly relevant to our research, including AI/ML, computer vision, data science and analytics, distributed systems, cloud, edge, and mobile computing, natural language processing, social network analysis, and computer graphics.

What you’ll do:

As a Research Lead within Dolby ATG research, you will focus on three key areas of expertise:

• State-of-the-art, cutting-edge, hands-on research:

As a researcher, you will invent and develop the next generation of AI-based multimodal image, audio, and speech technologies.
With an in-depth understanding of the latest AI technologies and a good understanding of audio/video technologies, you will explore the applications of AI to the delivery, analysis, and creation of multimedia technologies including video and audio enhancement, analysis, classification, and separation. You will create technologies for the next generation of Dolby’s offerings.

• Managing, nurturing, and grooming top research talent:

You will lead and mentor a group of AI researchers working in the application of AI to multimodal analysis, processing, playback, and enhancement technologies. You will work with your team as a coach and mentor. You are passionate about developing junior, highly talented staff into researchers who work fully independently in a corporate environment.
You will contribute to developing a dynamic, flexible, transparent, results-oriented, and innovative working atmosphere.

• Technology strategy and direction setting:

You will define and lead research initiatives that leverage cross-media intelligence and analytics, invent multimodal machine learning algorithms, and utilize a deep understanding of multimodal perception to develop the next generation of multimedia technologies.
Jointly with Dolby’s world-class global research teams, you will set directions, identify projects, and build the next wave of AI-based technologies driving Dolby’s cloud and licensing business.
You will work with ATG technology leaders to co-define projects and assign your staff to global R&D initiatives led by other technology initiative leads. Jointly with upper management, you will lead resource and work allocation.
You will also work with Dolby’s Business Groups (BG) to bring the research to life in many products, working closely with product managers, program managers, and BG engineering teams worldwide.

Education and desired experience

Ph.D. in Physics, Electrical Engineering, Mathematics, or Computer Science with a strong focus on AI, plus 5-10 years of corporate research experience.
You are an absolute top expert in AI with a deep and thorough theoretical understanding of the latest state-of-the-art AI technologies. You have a detailed understanding of all main network architectures, deployment modes, data augmentation and preparation, and theoretical performance analysis of model architectures. Knowledge of NLP and/or multimodal architectures is highly desired. You have a good understanding of:
Diffusion, autoregressive, or other generative models.
Self-supervised, contrastive learning, auto-encoders.
Audio, image, or text applications – Source separation, text-to-speech, music synthesis, image segmentation, image captioning, question answering, language models, etc.
Multimodal architectures and algorithms.
You have a track record of successfully applying AI technologies to multimodal problems including the combination of audio and video technologies.
You have deep knowledge of GPU/CPU implementations, algorithm validation and testing, implementation of ML/AI algorithms.
Strong track record of inventing, developing, and productizing novel AI-based technologies in an industrial research environment.
Strong publication record, with publications in major machine learning conferences (e.g., NeurIPS, ICLR, ICML).
Strong background in statistical signal processing, decision theory, greedy algorithms, Bayesian modelling, randomized algorithms, time series, hypothesis testing, classification, clustering, and multilinear regression analysis.
Experience with audio and video processing is a plus.
Highly skilled in C/C++, Python, TensorFlow or PyTorch.
Experience in managing, guiding, and mentoring junior researchers.
Team-oriented work ethic and interest in working in cross-continental teams.
Excellent communication, collaboration, and presentation skills in English.