Senior Researcher: Artificial General Intelligence (Audio, Speech and Multimodal Processing)
US-Bellevue
Full Time Senior-level / Expert USD 129K - 219K
Tencent
Founded in November 1998, Tencent is an Internet-based platform company that uses technology to enrich the lives of Internet users and assist the digital upgrade of enterprises. Our mission is "Value for Users, Tech for Good".
Tencent is seeking researchers in artificial general intelligence (AGI) with a focus on audio, speech and multimodal processing at the senior and principal levels to join our AI Lab in Seattle, Beijing, and Shenzhen. We are looking for recognized experts and thought leaders specializing in speech, audio and multimodal processing to tackle a variety of tasks, including (but not limited to) speech enhancement, speech recognition, audio/speech synthesis, speech codec, music processing, and spatial audio in unified multi-modal foundation models. The ideal candidates are self-motivated and passionate about advancing the state of the art of AGI by developing novel model architectures and algorithms and solving real-world problems. The job level will be determined based on the experience and accomplishments of the candidate.
Responsibilities:
- Work with other researchers to identify new and upcoming research areas, long-term ambitious research goals, and intermediate milestones by interacting with potential external and internal collaborators. Own long-term research strategy and plans to expand the impact of Tencent AI Lab.
- Identify undefined problems in existing technology and develop theoretically sound novel models and algorithms to address them.
- Design experiments, write reusable code, run evaluations, and analyze results.
- Collaborate with other researchers and engineers across functional groups to push forward the state of the art of AGI.
- Prioritize research that can be applied to Tencent's products. Deploy promising ideas quickly and broadly.
- Author research papers to share research results and extend their impact across organizations and in the research community.
- Share research trends and best practices in the community by reviewing academic papers, serving on program committees and grant panels, speaking at Tencent events or research conferences, or organizing research conferences and visioning activities.
Qualifications (Required):
- Currently has or is in the process of obtaining a PhD degree in AI, computer science, electrical engineering, math, physics, or related technical fields.
- Proven record of influential publications in AI or speech, music and audio-specific conferences/journals (e.g., NeurIPS, ICML, IEEE Trans. ASLP, ICASSP, Interspeech, ISMIR, AES).
- Expertise in speech, music and audio processing from both a signal processing and a machine learning standpoint, and the ability to integrate traditional signal processing techniques with deep learning models to advance current speech, music and audio systems.
- Proficient in building and optimizing models for speech recognition, synthesis, enhancement, or other audio-related tasks.
- Hands-on experience with deep learning frameworks such as PyTorch. Proven ability to design, train, and deploy deep learning models for speech, music and audio processing tasks, and to write efficient, reusable code for processing large volumes of high-dimensional audio data.
- Strong communication skills for articulating research ideas, results, and the impact of innovations both within the organization and in the broader research community.
- Holds work authorization in the country of employment at the time of hire and maintains it throughout employment.
Qualifications (Preferred):
- Familiarity with state-of-the-art (SOTA) approaches in speech, music and audio processing, such as transformer-based models, self-supervised learning (SSL) for speech, or end-to-end speech recognition and text-to-speech systems.
- Understanding of related fields such as acoustics, auditory perception, computer vision, natural language processing, or neuroscience as they apply to speech, music and audio processing. Ability to incorporate insights from these fields into the development of novel speech, music and audio technologies.
- Experience working with large-scale speech, music, audio and video datasets and developing large models that scale across multiple GPUs or cloud-based systems.
- Experience in multi-modal foundation models.
- Experience in model optimization for deployment in production environments.
- Experience in setting up and managing recordings using different types of microphone-equipped devices, and in understanding their characteristics and how they affect the captured audio quality.
We are interested in both new graduates and those with post-PhD academic or industry experience. Priority will be given to candidates who have demonstrated the ability to develop original research agendas and perform hands-on research, and who work well in a collaborative and dynamic environment.
About Tencent
Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world.
Founded in 1998 with its headquarters in Shenzhen, China, Tencent's guiding principle is to use technology for good. Our communication and social services connect more than one billion people around the world, helping them to keep in touch with friends and family, access transportation, pay for daily necessities, and even be entertained.
Tencent publishes some of the world's most popular video games and other high-quality digital content, enriching interactive entertainment experiences for people around the globe.
Tencent also offers a range of services such as cloud computing, advertising, FinTech, and other enterprise services to support our clients' digital transformation and business growth.
Location State(s)
Washington
The base pay range for this position in the state(s) above is $129,600 to $219,600 per year. Actual pay is based on market location and may vary depending on job-related knowledge, skills, and experience. A sign-on payment, relocation package, and restricted stock units may be provided as part of the compensation package, as well as medical, financial, and/or other benefits, dependent on the specific position offered.
Employees (and their families) are covered by medical, dental, vision, and basic life insurance. Employees are also eligible to participate in the Company’s 401(k) plan, to accrue 15 to 25 days of vacation leave per year, to receive up to 10 paid holidays per year and 2 floating holidays, and to accrue up to 10 days of paid sick leave per year. Your benefits eligibility will be adjusted to reflect your location, employment status, duration of employment with the company, and position level. Benefits may be pro-rated for those who start working during the calendar year.